Replace PaddleOCR SDK with HTTP async API due to pyyaml conflict #1
Closed
Rander7 wants to merge 8 commits into
## Why This Refactoring Is Necessary

PaddleOCR 3.6.0+ has migrated to a new async Job API architecture where requests are submitted, then polled for completion. The legacy sync API will be deprecated, making this refactoring critical for long-term maintenance.

Benefits of using the official SDK:

- **Future-proof**: Aligns with PaddleOCR's official API evolution
- **Better reliability**: Built-in retry logic, timeout handling, error classification
- **Reduced maintenance**: No need to manually implement poll loops and error handling
- **Consistent behavior**: Same implementation as PaddleOCR's own tools (CLI, MCP)

## Breaking Changes

**None for end users** - the tool interface and output format remain identical. The plugin continues to accept the same credentials and file inputs.

## Internal Changes

### Dependencies

- Replaced `requests` with `paddleocr>=3.6.0`

### SDK Integration

- Added `get_sdk_client()` with `client_platform="dify"` header
- Added Base64 → temp file conversion (SDK requires file_path/file_url)
- Added result format converters to maintain legacy output structure

### Code Simplification

- Removed manual HTTP request handling (`make_paddleocr_api_request`)
- Removed manual poll loops (SDK handles submit → poll → fetch)
- Updated credential validation to use SDK

## Testing

All three tools maintain their original behavior:

- Text Recognition (PP-OCRv5)
- Document Parsing (PP-StructureV3)
- VL Document Parsing (PaddleOCR-VL-1.6)
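The Base64 → temp file step listed under SDK Integration could look roughly like the sketch below. The function name `base64_to_temp_file`, the default suffix, and the data-URI handling are assumptions for illustration and are not taken from the PR; only the idea of writing decoded Base64 to a file path comes from the description.

```python
import base64
import os
import tempfile


def base64_to_temp_file(data: str, suffix: str = ".png") -> str:
    """Decode a Base64 string into a temporary file and return its path.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    # Strip an optional data-URI prefix such as "data:image/png;base64,".
    if data.startswith("data:") and "," in data:
        data = data.split(",", 1)[1]
    raw = base64.b64decode(data, validate=True)
    # delete=False keeps the path valid after the handle is closed;
    # the caller is responsible for removing the file afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as handle:
        handle.write(raw)
        return handle.name
```

A caller would typically wrap the call in `try`/`finally` and `os.remove()` the returned path once the OCR request has finished, so temporary files do not accumulate.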
Use lazy imports for paddleocr SDK to avoid requiring it for tests.

Tests now mock the SDK calls to avoid importing the large paddleocr package.

Key changes:
- Remove top-level imports from paddleocr in utils.py and provider.py
- Use lazy imports inside functions that need paddleocr
- Add comprehensive mocking in tests for SDK functions
- Rename project to "paddleocr-dify" to avoid name conflict

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
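As a rough illustration of the lazy-import pattern this commit describes, the sketch below defers the import until a function actually needs the package. The helper name `load_sdk` is hypothetical and does not appear in the PR; no paddleocr API is called here.

```python
import importlib
from typing import Any


def load_sdk(module_name: str = "paddleocr") -> Any:
    """Import the SDK on first use instead of at module import time.

    Hypothetical helper: keeps the heavy package out of the import path
    of code (and tests) that never touch it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is required for this operation but is not installed"
        ) from exc
```

Because `importlib.import_module` consults `sys.modules` first, a test can substitute a stand-in with `unittest.mock.patch.dict(sys.modules, {"paddleocr": fake})` and never load the real package.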
This commit migrates from direct HTTP API calls to the official PaddleOCR SDK (>=3.6.0), which simplifies integration and improves maintainability.

Key changes:
- Use public API imports from paddleocr package instead of internal modules
- Implement unified camelCase to snake_case parameter conversion
- Remove unnecessary result format conversion functions
- Simplify credential configuration: base_url is now optional (uses SDK default if not provided)
- Update provider validation to use SDK for testing
- Add manual test script for validation
- Update tests to mock public API instead of internal modules

User-facing changes:
- Configuration simplified: only token is required for official service
- base_url is optional (only needed for self-hosted deployments)
- All core OCR and document parsing features continue to work as before

Testing:
- All 12 unit tests pass
- Manual tests confirm OCR (URL and Base64) and document parsing work correctly

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
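The camelCase to snake_case conversion mentioned above might be implemented along these lines. This is a minimal sketch, not the PR's code: `camel_to_snake` is named in the PR, but the regex approach, the `convert_keys` helper, and the sample parameter names are assumptions.

```python
import re
from typing import Any, Dict

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    return _BOUNDARY.sub("_", name).lower()


def convert_keys(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of params with every key converted to snake_case.

    Hypothetical helper; values are passed through unchanged.
    """
    return {camel_to_snake(key): value for key, value in params.items()}
```

Keys that are already snake_case pass through unchanged, so the conversion can be applied uniformly to every tool's parameters.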
The official PaddleOCR SDK has a pyyaml dependency conflict with dify_plugin:
- dify_plugin requires pyyaml >= 6.0.3
- paddleocr (via paddlex) requires pyyaml == 6.0.2

This change replaces SDK calls with direct HTTP requests to the async Job API, following the exact same implementation logic as the SDK (submit → poll → fetch).

Key changes:
- tools/utils.py: Implement HTTP async Job API (_submit_job, _poll_job, _parse_*)
- Replace client.ocr()/parse_document() with call_paddleocr_api()
- Use same API endpoint /api/v2/ocr/jobs, Bearer token auth, poll strategy
- Keep all utility functions (file handling, camel_to_snake) unchanged
- No changes to provider.yaml config or pyproject.toml dependencies

Changes: +461/-480 lines, 6 files modified

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
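The poll step of the submit → poll → fetch flow could be sketched as below. The header names come from this PR (`Authorization: Bearer {token}` and `Client-Platform: dify`); everything else is an assumption for illustration, including the function names, the `"state"` field, its `"done"`/`"failed"` values, and the delay and timeout defaults. The real Job API response schema is not shown in this PR.

```python
import time
from typing import Any, Callable, Dict


def build_headers(token: str) -> Dict[str, str]:
    """Request headers as described in the PR."""
    return {"Authorization": f"Bearer {token}", "Client-Platform": "dify"}


def poll_job(
    fetch_status: Callable[[], Dict[str, Any]],
    *,
    initial_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 10.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the job finishes, backing off exponentially.

    fetch_status is any callable returning the job's current status as a
    dict, for example a GET against the job URL. The "state" key and its
    values are assumed for this sketch.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        job = fetch_status()
        state = job.get("state")
        if state == "done":
            return job
        if state == "failed":
            raise RuntimeError(f"OCR job failed: {job}")
        if time.monotonic() + delay > deadline:
            raise TimeoutError("OCR job did not finish before the timeout")
        sleep(delay)
        # Double the wait after each attempt, capped at max_delay.
        delay = min(delay * factor, max_delay)
```

Passing `fetch_status` and `sleep` in as callables keeps the loop independent of any HTTP library, so it can be unit-tested with a fake status sequence and no real waiting.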
Created by mistake; the correct PR is langgenius#3247 in langgenius/dify-official-plugins.
## Summary

### Changes

- `tools/utils.py`: Implement HTTP async Job API (`_submit_job`, `_poll_job`, `_parse_*`)
- Replace `client.ocr()`/`parse_document()` with `call_paddleocr_api()`
- Use same API endpoint `/api/v2/ocr/jobs`, Bearer token auth, exponential backoff polling
- Keep all utility functions (file handling, `camel_to_snake`) unchanged
- No changes to `provider.yaml` config or `pyproject.toml` dependencies

### Technical Details

- `/api/v2/ocr/jobs`
- `Authorization: Bearer {token}` + `Client-Platform: dify`

### Test plan

- `get_sdk_client()` returns correct config dict
- `build_ocr_options()` converts camelCase to snake_case
- `normalize_file_input()` handles URL and file inputs

🤖 Generated with Claude Code