Fix PaddleOCR 2.9+ args, Box Sorting, and pin Transformers (Florence-2 fix) #349
Conversation
@microsoft-github-policy-service agree
Hi @ataymano, this is the fix for setting up the environment with the latest version of paddleocr (v2.9.1).

- Fixed PaddleOCR initialization: newer versions of paddleocr have deprecated or removed the `max_batch_size`, `use_gpu`, and `use_dilation` arguments. Passing them was causing a `ValueError` on startup. I've updated the initialization to rely on the default argument parsing, which correctly handles GPU detection automatically.
- Fixed box sorting: I noticed that `filtered_boxes` occasionally contained mixed types (dictionaries and raw lists) depending on the detection results, which caused the `sorted()` function to crash with an `AttributeError`. I added a safety check to standardize all elements to dictionaries before sorting.

I verified these changes on Windows with Python 3.11 and PaddleOCR 2.9.1. The model now loads correctly and performs inference without crashing.
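A minimal sketch of the argument cleanup described above. The helper name `clean_ocr_kwargs` is illustrative (not from the PR); the deprecated argument names themselves are the ones listed in the description:

```python
# Arguments removed from the PaddleOCR constructor in paddleocr >= 2.9;
# passing any of them raises a ValueError on startup.
DEPRECATED_ARGS = {"max_batch_size", "use_gpu", "use_dilation"}

def clean_ocr_kwargs(kwargs: dict) -> dict:
    """Drop constructor arguments that paddleocr 2.9+ no longer accepts."""
    return {k: v for k, v in kwargs.items() if k not in DEPRECATED_ARGS}

legacy = {"lang": "en", "use_gpu": True, "max_batch_size": 1024, "use_dilation": True}
print(clean_ocr_kwargs(legacy))  # {'lang': 'en'}
# PaddleOCR(**clean_ocr_kwargs(legacy))  # GPU detection now happens automatically
```

With the deprecated keys stripped, the constructor falls back to its own defaults, which is exactly what the fix relies on.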
See #354

The crash in your logs is actually coming from the Florence-2 model, not Paddle. The `AttributeError: 'NoneType' object...` error occurs because recent versions of the Transformers library break the custom model code. I have updated requirements.txt in the PR above to pin the correct version. Quick fix:
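The pin in question, per this PR's summary, is:

```
transformers==4.40.0
```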
I also had to add `attn_implementation="eager"` in utils.py:
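A sketch of the load call with eager attention, assuming a Florence-2 checkpoint path (the path and variable names here are illustrative, not taken from the repo). `attn_implementation="eager"` forces the vanilla attention path instead of SDPA, which the custom modeling_florence2.py code does not handle:

```python
# Keyword arguments for loading Florence-2 with transformers.
florence_kwargs = dict(
    trust_remote_code=True,        # required for the custom Florence-2 model code
    attn_implementation="eager",   # avoid the SDPA attention path that crashes
)

# Actual load (commented out: requires downloaded model weights):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "microsoft/Florence-2-base", **florence_kwargs
# )
print(florence_kwargs["attn_implementation"])  # eager
```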
dashscope
groq
Why groq?
Groq is used here because it serves the reasoning model, which produces a chain of thought. For example, its output contains `<think>......</think>` tokens. You can find its use in this file:
OmniParser/omnitool/gradio/agent/llm_utils/groqclient.py
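Handling such output typically means stripping the `<think>...</think>` block before using the reply. A minimal sketch (the function name is illustrative; the actual handling lives in groqclient.py):

```python
import re

def strip_think(text: str) -> str:
    """Remove <think>...</think> chain-of-thought blocks emitted by a
    reasoning model, keeping only the final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

reply = "<think>The user wants a click target...</think>Click the Start button."
print(strip_think(reply))  # Click the Start button.
```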


Summary of Changes
This PR addresses three compatibility issues encountered when setting up the environment with recent library versions (specifically PaddleOCR v2.9.1).
Fix PaddleOCR Initialization
Newer versions of paddleocr have deprecated/removed arguments like `max_batch_size`, `use_gpu`, and `use_dilation` from the constructor. Keeping them causes a `ValueError` crash on startup.
Change: Updated the `PaddleOCR()` initialization to rely on default argument parsing, which correctly auto-detects GPU support.
Fix Box Sorting
Occasionally, `filtered_boxes` contains mixed types (dictionaries and raw lists) depending on the detection results. This causes the `sorted()` function (line ~435) to crash with an `AttributeError` because raw lists do not have keys.
Change: Added a safety-check loop (`safe_boxes`) to ensure all elements in `filtered_boxes` are standardized dictionaries before sorting.
Fix Florence-2 Inference Crash
The current dependency resolution installs a version of transformers (v4.41+) that is incompatible with the custom modeling_florence2.py, causing an `AttributeError: 'NoneType' object has no attribute 'shape'`.
Change: Pinned transformers==4.40.0 in requirements.txt to ensure stability.
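The box-sorting safety check can be sketched as follows. This is a minimal illustration, not the PR's exact code: the `bbox` key and the wrapped-list shape are assumptions, and only the normalize-then-sort pattern comes from the description above:

```python
def standardize_boxes(filtered_boxes):
    """Normalize a mix of dicts and raw coordinate lists into dicts,
    so that sorting by a dict key cannot raise AttributeError."""
    safe_boxes = []
    for box in filtered_boxes:
        if isinstance(box, dict):
            safe_boxes.append(box)
        else:  # raw [x1, y1, x2, y2] list from the detector
            safe_boxes.append({"bbox": list(box), "type": "box"})
    return safe_boxes

# Mixed input: one dict, one raw list.
mixed = [{"bbox": [0, 10, 5, 20]}, [0, 2, 5, 8]]
safe = standardize_boxes(mixed)
ordered = sorted(safe, key=lambda b: b["bbox"][1])  # sort by top edge
print([b["bbox"][1] for b in ordered])  # [2, 10]
```

Because every element is a dict after normalization, the `sorted()` key function can safely index into each box.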
Testing