
fix: handle non-JSON error bodies in WaterCrawl exceptions #37536

Open
ifer47 wants to merge 2 commits into langgenius:main from ifer47:fix/watercrawl-exception-json-decode-error

ifer47 commented Jun 16, 2026

Summary

  • WaterCrawlBadRequestError.__init__ called response.json() unconditionally, so a JSONDecodeError propagated to the caller whenever the server returned a non-JSON error body (e.g. HTML from a proxy/gateway)
  • Wrap JSON parsing in a try/except for (ValueError, json.JSONDecodeError) and fall back to response.text when the body is not valid JSON
  • If response.text is also empty, use a generic "Unknown error occurred" message
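A minimal sketch of the fallback described above, not the actual diff. The attribute names (status_code, message, errors), the "message" and "errors" keys, and the StubResponse helper are assumptions made for illustration; only the try/except around response.json(), the response.text fallback, and the generic message come from this PR's description.

```python
import json


class WaterCrawlBadRequestError(Exception):
    """Illustrative stand-in; attribute and key names are assumed."""

    def __init__(self, response):
        self.status_code = response.status_code
        self.response = response
        try:
            data = response.json()
        # json.JSONDecodeError subclasses ValueError, so ValueError alone
        # would suffice; the tuple mirrors the PR description.
        except (ValueError, json.JSONDecodeError):
            data = None

        if isinstance(data, dict):
            # JSON path: read the structured fields (key names assumed).
            self.message = data.get("message", "Unknown error occurred")
            self.errors = data.get("errors", {})
        else:
            # Non-JSON path: use the raw body, or a generic message if empty.
            self.message = response.text or "Unknown error occurred"
            self.errors = {}
        super().__init__(self.message)


class StubResponse:
    """Hypothetical response object used only to exercise the sketch."""

    def __init__(self, status_code, text):
        self.status_code = status_code
        self.text = text

    def json(self):
        return json.loads(self.text)


html_error = WaterCrawlBadRequestError(StubResponse(502, "<html>Bad Gateway</html>"))
empty_error = WaterCrawlBadRequestError(StubResponse(500, ""))
json_error = WaterCrawlBadRequestError(StubResponse(400, '{"message": "bad url"}'))
```

Treating a JSON body that is not an object (for example a bare list) the same as a non-JSON body is a choice made in this sketch; the PR description does not say how that case is handled.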

Closes #37513

Test plan

  • Added unit tests for non-JSON error bodies in TestWaterCrawlExceptions
  • Verified existing JSON-path tests still pass
  • Manually tested with MagicMock responses that raise ValueError on .json()
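The test plan could look roughly like the following pytest-style sketch. The helper message_from_response is a hypothetical stand-in for the exception's message logic, make_response is an illustrative factory, and the test method names are invented; only the TestWaterCrawlExceptions class name and the use of MagicMock responses that raise ValueError on .json() come from the PR.

```python
from unittest.mock import MagicMock


def message_from_response(response):
    """Hypothetical stand-in for the message logic under test."""
    try:
        data = response.json()
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return response.text or "Unknown error occurred"
    return data.get("message", "Unknown error occurred")


def make_response(status_code, text, json_error=None, json_body=None):
    """Build a MagicMock response; .json() raises if json_error is given."""
    response = MagicMock()
    response.status_code = status_code
    response.text = text
    if json_error is not None:
        response.json.side_effect = json_error
    else:
        response.json.return_value = json_body
    return response


class TestWaterCrawlExceptions:
    def test_html_body_falls_back_to_text(self):
        response = make_response(
            502, "<html>Bad Gateway</html>", json_error=ValueError("not JSON")
        )
        assert message_from_response(response) == "<html>Bad Gateway</html>"

    def test_empty_body_uses_generic_message(self):
        response = make_response(500, "", json_error=ValueError("empty body"))
        assert message_from_response(response) == "Unknown error occurred"

    def test_json_body_still_uses_message_field(self):
        response = make_response(400, "", json_body={"message": "bad url"})
        assert message_from_response(response) == "bad url"
```

Setting side_effect on response.json makes the mock raise when called, which reproduces the proxy/gateway case without a real HTTP round trip.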

🤖 Generated with Claude Code Best

ifer47 and others added 2 commits June 17, 2026 00:16
Jina credential validation delegates to the pooled HTTP client without
an explicit timeout. A slow or hanging Jina endpoint can stall
credential validation indefinitely. Add a bounded httpx.Timeout
(10s read, 3s connect) consistent with other auth provider patterns,
and update test assertions to verify the timeout is passed.

Fixes langgenius#37524

Co-Authored-By: zhipu/glm-5 <zai-org@claude-code-best.win>
…s#37513)

WaterCrawlBadRequestError.__init__ called response.json()
unconditionally. When the server returns a non-JSON error body (e.g.
HTML from a proxy/gateway), this leaked a JSONDecodeError to the caller
instead of providing a meaningful error message.

Wrap the JSON parsing in a try/except for (ValueError,
json.JSONDecodeError) and fall back to response.text when the body is
not valid JSON. If text is also empty, use a generic "Unknown error
occurred" message.

Co-Authored-By: zhipu/glm-5 <zai-org@claude-code-best.win>
dosubot (bot) added the size:S label (This PR changes 10-29 lines, ignoring generated files.) Jun 16, 2026
Development

Successfully merging this pull request may close these issues.

WaterCrawl crawler error responses can leak JSONDecodeError
