Replace PaddleOCR SDK with HTTP async API due to pyyaml conflict by Rander7 · Pull Request #1 · Rander7/dify-official-plugins

Rander7 · 2026-06-05T09:14:44Z

Summary

Replace PaddleOCR SDK with direct HTTP async Job API calls due to pyyaml dependency conflict
The official SDK (via paddlex) requires pyyaml==6.0.2, but dify_plugin needs pyyaml>=6.0.3
The implementation follows the same submit → poll → fetch flow as the SDK
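To make the conflict concrete, here is a small sketch that checks a few candidate pyyaml versions against both constraints quoted above. The helper names and the candidate list are hypothetical, and the version parsing is deliberately simplified (plain dotted integers only, not full PEP 440).

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '6.0.2' into (6, 0, 2). Simplified: dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def satisfies_sdk_pin(version: str) -> bool:
    """paddleocr (via paddlex) pins pyyaml==6.0.2."""
    return parse_version(version) == parse_version("6.0.2")


def satisfies_dify_plugin(version: str) -> bool:
    """dify_plugin needs pyyaml>=6.0.3."""
    return parse_version(version) >= parse_version("6.0.3")


# Illustrative candidates only; no version can pass both checks.
candidates = ["6.0.1", "6.0.2", "6.0.3", "6.1.0"]
compatible = [
    v for v in candidates
    if satisfies_sdk_pin(v) and satisfies_dify_plugin(v)
]
print(compatible)
```

Because an exact pin on 6.0.2 and a lower bound of 6.0.3 never overlap, a resolver cannot install both packages in one environment, which is why the PR drops the SDK dependency.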

Changes

tools/utils.py: Implement HTTP async Job API (_submit_job, _poll_job, _parse_*)
Replace client.ocr()/parse_document() with call_paddleocr_api()
Use same API endpoint /api/v2/ocr/jobs, Bearer token auth, exponential backoff polling
Keep all utility functions (file handling, camel_to_snake) unchanged
No changes to provider.yaml config or pyproject.toml dependencies

Technical Details

API endpoint: /api/v2/ocr/jobs
Authentication: Authorization: Bearer {token} + Client-Platform: dify
Poll strategy: initial 3s, exponential backoff (1.5x), max 15s, timeout 600s
Result format: a dict compatible with the SDK's result structure
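The header and polling parameters above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in tools/utils.py: the function names (`build_headers`, `poll_delays`, `poll_until_done`) and the status strings `"done"` and `"failed"` are hypothetical, and the status lookup is injected as a callable so no real HTTP request is made.

```python
import time
from typing import Callable, Dict, Iterator


def build_headers(token: str) -> Dict[str, str]:
    """Headers named in the PR description: Bearer auth plus the platform tag."""
    return {
        "Authorization": f"Bearer {token}",
        "Client-Platform": "dify",
    }


def poll_delays(initial: float = 3.0, factor: float = 1.5,
                max_delay: float = 15.0) -> Iterator[float]:
    """Yield sleep intervals that grow by `factor` and are capped at `max_delay`."""
    delay = initial
    while True:
        yield min(delay, max_delay)
        delay = min(delay * factor, max_delay)


def poll_until_done(fetch_status: Callable[[], str],
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Call fetch_status() until it reports a terminal state or time runs out.

    The terminal states "done" and "failed" are assumptions for this sketch.
    """
    deadline = clock() + timeout
    delays = poll_delays()
    while True:
        status = fetch_status()
        if status in ("done", "failed"):
            return status
        delay = next(delays)
        # Give up instead of sleeping past the overall deadline.
        if clock() + delay > deadline:
            raise TimeoutError(f"job did not finish within {timeout:.0f}s")
        sleep(delay)
```

With the defaults, the waits are 3 s, 4.5 s, 6.75 s, 10.125 s, and then 15 s for every later attempt. Injecting `sleep` and `clock` keeps the loop testable without waiting in real time.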

Test plan

Module imports work correctly
get_sdk_client() returns correct config dict
build_ocr_options() converts camelCase to snake_case
normalize_file_input() handles URL and file inputs
Manual OCR test with real token (requires environment setup)
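The camelCase to snake_case check in the test plan can be illustrated with a short sketch. The function names match those mentioned in this PR, but the bodies are an assumption about how such a conversion might work, not the plugin's actual code. The option keys are examples only, and dropping `None` values is a choice made for this sketch.

```python
import re
from typing import Any, Dict

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert 'fileType' to 'file_type'; snake_case input passes through unchanged."""
    return _BOUNDARY.sub("_", name).lower()


def build_ocr_options(params: Dict[str, Any]) -> Dict[str, Any]:
    """Rename keys to snake_case and skip unset (None) values (sketch assumption)."""
    return {
        camel_to_snake(key): value
        for key, value in params.items()
        if value is not None
    }


options = build_ocr_options({
    "fileType": 1,
    "useDocOrientationClassify": False,
    "unusedOption": None,
})
print(options)
```

A unit test for this behavior only needs to compare the returned dict against the expected snake_case keys, so it does not require a token or network access.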

🤖 Generated with Claude Code

## Why This Refactoring Is Necessary

PaddleOCR 3.6.0+ has migrated to a new async Job API architecture where requests are submitted, then polled for completion. The legacy sync API will be deprecated, making this refactoring critical for long-term maintenance.

Benefits of using the official SDK:

- **Future-proof**: Aligns with PaddleOCR's official API evolution
- **Better reliability**: Built-in retry logic, timeout handling, error classification
- **Reduced maintenance**: No need to manually implement poll loops and error handling
- **Consistent behavior**: Same implementation as PaddleOCR's own tools (CLI, MCP)

## Breaking Changes

**None for end users** - the tool interface and output format remain identical. The plugin continues to accept the same credentials and file inputs.

## Internal Changes

### Dependencies

- Replaced `requests` with `paddleocr>=3.6.0`

### SDK Integration

- Added `get_sdk_client()` with `client_platform="dify"` header
- Added Base64 → temp file conversion (SDK requires file_path/file_url)
- Added result format converters to maintain legacy output structure

### Code Simplification

- Removed manual HTTP request handling (`make_paddleocr_api_request`)
- Removed manual poll loops (SDK handles submit → poll → fetch)
- Updated credential validation to use SDK

## Testing

All three tools maintain their original behavior:

- Text Recognition (PP-OCRv5)
- Document Parsing (PP-StructureV3)
- VL Document Parsing (PaddleOCR-VL-1.6)
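The Base64 → temp file conversion mentioned under SDK Integration could look roughly like the sketch below. The function name `base64_to_temp_file` and the default suffix are hypothetical; the commit does not show its actual helper.

```python
import base64
import os
import tempfile


def base64_to_temp_file(data: str, suffix: str = ".png") -> str:
    """Decode a Base64 string into a temporary file and return its path.

    The caller is responsible for deleting the file when done.
    """
    raw = base64.b64decode(data, validate=True)  # raises on malformed input
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(raw)
    return path


# Usage: write the payload, hand the path to a file-based API, then clean up.
encoded = base64.b64encode(b"example bytes").decode("ascii")
temp_path = base64_to_temp_file(encoded, suffix=".bin")
try:
    with open(temp_path, "rb") as handle:
        round_trip = handle.read()
finally:
    os.remove(temp_path)
```

Using `mkstemp` rather than a guessed filename avoids collisions between concurrent requests, and the `try`/`finally` makes sure the file is removed even if the downstream call fails.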

Use lazy imports for paddleocr SDK to avoid requiring it for tests.

Tests now mock the SDK calls to avoid importing the large paddleocr package.

Key changes:
- Remove top-level imports from paddleocr in utils.py and provider.py
- Use lazy imports inside functions that need paddleocr
- Add comprehensive mocking in tests for SDK functions
- Rename project to "paddleocr-dify" to avoid name conflict

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
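The lazy-import pattern described in this commit can be sketched as below. `load_sdk` is a hypothetical name, and the sketch only returns the module rather than calling any SDK function, since the SDK's API is not shown here. The test-side mocking uses `unittest.mock.patch.dict` on `sys.modules`, which is one common way to avoid importing a heavy package.

```python
import sys
import types
from unittest import mock


def load_sdk():
    """Import paddleocr only when a caller actually needs it.

    Because the import sits inside the function, importing this module
    does not require paddleocr to be installed.
    """
    import paddleocr  # deferred import, resolved on first call
    return paddleocr


# In a test, a stand-in module can be registered so the real package is never loaded.
fake_sdk = types.ModuleType("paddleocr")
with mock.patch.dict(sys.modules, {"paddleocr": fake_sdk}):
    loaded = load_sdk()
```

After the `with` block exits, `patch.dict` restores `sys.modules` to its previous state, so the stand-in does not leak into other tests.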

This commit migrates from direct HTTP API calls to the official PaddleOCR SDK (>=3.6.0), which simplifies integration and improves maintainability.

Key changes:
- Use public API imports from paddleocr package instead of internal modules
- Implement unified camelCase to snake_case parameter conversion
- Remove unnecessary result format conversion functions
- Simplify credential configuration: base_url is now optional (uses SDK default if not provided)
- Update provider validation to use SDK for testing
- Add manual test script for validation
- Update tests to mock public API instead of internal modules

User-facing changes:
- Configuration simplified: only token is required for official service
- base_url is optional (only needed for self-hosted deployments)
- All core OCR and document parsing features continue to work as before

Testing:
- All 12 unit tests pass
- Manual tests confirm OCR (URL and Base64) and document parsing work correctly

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The official PaddleOCR SDK has a pyyaml dependency conflict with dify_plugin:
- dify_plugin requires pyyaml >= 6.0.3
- paddleocr (via paddlex) requires pyyaml == 6.0.2

This change replaces SDK calls with direct HTTP requests to the async Job API, following the exact same implementation logic as the SDK (submit → poll → fetch).

Key changes:
- tools/utils.py: Implement HTTP async Job API (_submit_job, _poll_job, _parse_*)
- Replace client.ocr()/parse_document() with call_paddleocr_api()
- Use same API endpoint /api/v2/ocr/jobs, Bearer token auth, poll strategy
- Keep all utility functions (file handling, camel_to_snake) unchanged
- No changes to provider.yaml config or pyproject.toml dependencies

Changes: +461/-480 lines, 6 files modified

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Rander7 · 2026-06-05T09:18:52Z

Created by mistake, the correct PR is langgenius#3247 in langgenius/dify-official-plugins

Rander7 and others added 8 commits June 4, 2026 10:09

fix: update tests to work with SDK-based implementation

2afbcc2

bump: version 0.2.6 -> 0.2.7

064d7db

Merge branch 'main' into refactor/use-paddleocr-sdk

89ee252

Fix: Update credentials to use base_url (optional)

1a3d077

Rander7 closed this Jun 5, 2026

Rander7 wants to merge 8 commits into main from refactor/use-paddleocr-sdk
