Egor/dev 1016 change motleycrew error message to user from system for Anthropic compatibility#117
Conversation
📝 WalkthroughWalkthroughAdds CodeRabbit CI config, new image workflow example, image conversion utilities (PNG conversion via LibreOffice/Wand), Anthropic-compatible error-message handling in agents, broader input handling for structured output utility, langchain callback adjustments, tests for image utils and Anthropic output recovery, and a version bump. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/test_agents/test_langchain_output_handler.py (1)
159-175: Consider strengthening the test coverage.While this test verifies that the agent completes successfully with Anthropic models, it doesn't explicitly verify that the error recovery path (converting system messages to human messages) is actually exercised. The test assumes the agent "may initially try to return directly" but this behavior isn't guaranteed.
Consider one of these approaches for more robust coverage:
- Use mocking or instrumentation to verify that the error recovery code path in
agent_plan_decoratoris actually executed- Add assertions that check intermediate steps to confirm error actions were created and handled
- Force a scenario that definitely triggers the error recovery (e.g., by temporarily patching the agent's behavior)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.coderabbit.yamlexamples/image_from_gslides_example.pymotleycrew/agents/mixins.pymotleycrew/utils/structured_output_with_retries.pypyproject.tomltests/test_agents/test_langchain_output_handler.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_agents/test_langchain_output_handler.py (3)
motleycrew/common/enums.py (2)
LLMFramework(16-20)LLMProvider(4-13)motleycrew/common/llms.py (1)
init_llm(242-262)motleycrew/agents/langchain/tool_calling_react.py (1)
ReActToolCallingMotleyAgent(98-235)
🪛 Flake8 (7.3.0)
tests/test_agents/test_langchain_output_handler.py
[error] 126-126: 'langchain_anthropic.ChatAnthropic' imported but unused
(F401)
🔇 Additional comments (10)
pyproject.toml (1)
3-3: LGTM!Version bump from 0.3.4 to 0.3.6 is clean and appropriate.
motleycrew/agents/mixins.py (2)
71-74: LGTM!The change from
"system"to"human"for error messages correctly addresses Anthropic's constraint against consecutive system messages. The inline comment clearly explains the rationale.
139-142: LGTM!Consistent application of the fix in the async wrapper. Both synchronous and asynchronous paths now handle error messages identically.
.coderabbit.yaml (1)
1-7: LGTM!CodeRabbit configuration is valid and appropriately enables automated reviews on draft PRs and all branches. The schema reference ensures proper validation by language servers.
motleycrew/utils/structured_output_with_retries.py (3)
16-16: LGTM!Type annotation correctly expanded to accept string input while maintaining backward compatibility.
26-28: LGTM!Docstring clearly documents the new string input handling and explains the behavior for all three supported input types.
36-37: LGTM!String normalization logic correctly converts string input to the expected
List[HumanMessage]format before agent invocation.tests/test_agents/test_langchain_output_handler.py (1)
151-157: No action required. The model name"claude-sonnet-4-20250514"is a valid and current Anthropic Claude model identifier documented in Anthropic's API as of January 2026.examples/image_from_gslides_example.py (2)
131-147: Demo script structure looks good.The main execution block effectively orchestrates all demonstrations and provides a helpful summary of the new capabilities. The overall structure of this example script is clear and well-documented.
29-29: Verify the mock patch path is effective.The mock path
"gslides_api.element.element.requests.get"targets a different module than the public import path. Ensure this path correctly intercepts therequests.getcall made byImageElement.get_image_data(). If the mock is ineffective, the code will attempt real HTTP requests instead of using the mocked response.
| try: | ||
| from langchain_anthropic import ChatAnthropic | ||
| except ImportError: | ||
| pytest.skip("langchain-anthropic not installed") |
There was a problem hiding this comment.
Remove unused import.
The ChatAnthropic import on line 126 is never used in the test. The test uses init_llm() to create the Anthropic LLM instance, so this import is unnecessary.
🔎 Proposed fix
- try:
- from langchain_anthropic import ChatAnthropic
- except ImportError:
+ try:
+ import langchain_anthropic # noqa: F401
+ except ImportError:
pytest.skip("langchain-anthropic not installed")🧰 Tools
🪛 Flake8 (7.3.0)
[error] 126-126: 'langchain_anthropic.ChatAnthropic' imported but unused
(F401)
🤖 Prompt for AI Agents
In tests/test_agents/test_langchain_output_handler.py around lines 125 to 128,
the try/except imports ChatAnthropic which is never used; remove the unused
ChatAnthropic import and its try/except block, leaving only the pytest.skip
guard if langchain-anthropic isn't required or refactor to check availability
another way (e.g., attempt importing only when needed or simply remove both
lines so the test uses init_llm() without the unused import).
There was a problem hiding this comment.
Actionable comments posted: 2
Fix all issues with AI Agents 🤖
In @motleycrew/agents/langchain/legacy_react.py:
- Around line 106-114: Update the comment above the callbacks logic to reference
the correct langchain-core version constraint or generalize it: replace "As of
langchain-core 0.3.81" with either "As of langchain-core 0.3.72" (the project's
minimum) or a broader "As of langchain-core 0.3+" so it matches the project's
dependency; keep the implementation that sets callbacks =
[StdOutCallbackHandler()] if verbose else None and its use in AgentExecutor
(agent, tools=langchain_tools, handle_parsing_errors, verbose, callbacks).
In @pyproject.toml:
- Line 19: The pyproject dependency pin for wand is outdated; update the
dependency line "wand" to "^0.7.0" in pyproject.toml and implement ImageMagick
security practices: ensure all image processing that uses the wand (MagickWand)
binding is moved out of the main request thread into background workers/queues
(e.g., the functions that call wand.Image() or any MagickWand usage), reject or
sandbox processing of untrusted inputs, and deploy a hardened ImageMagick
policy.xml with strict resource limits and disabled dangerous formats to
mitigate CVE-style risks.
🧹 Nitpick comments (1)
motleycrew/utils/image_utils.py (1)
120-142: Consider case-insensitive matching for consistency.Line 138 uses
source_mime_type.lower()for case-insensitive matching againstLIBREOFFICE_FORMATS, but line 134 does not use.lower()when checkingSUPPORTED_MIME_TYPES. While MIME types are conventionally lowercase and the calling code appears to always provide lowercase values, applying.lower()consistently would make the function more robust.🔎 Proposed fix for consistency
- if source_mime_type in SUPPORTED_MIME_TYPES: + if source_mime_type.lower() in SUPPORTED_MIME_TYPES: return image_bytes, source_mime_type
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
motleycrew/agents/langchain/legacy_react.pymotleycrew/agents/langchain/tool_calling_react.pymotleycrew/utils/image_utils.pypyproject.tomltests/test_tools/test_image_utils.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_tools/test_image_utils.py (1)
motleycrew/utils/image_utils.py (4)
convert_image_to_png(120-142)human_message_from_image_bytes(145-150)image_file_to_bytes_and_mime_type(153-172)is_this_a_chart(175-204)
🪛 Ruff (0.14.10)
motleycrew/utils/image_utils.py
52-52: subprocess call: check for execution of untrusted input
(S603)
86-86: Do not catch blind exception: Exception
(BLE001)
109-109: Consider moving this statement to an else block
(TRY300)
115-115: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: test (3.12)
- GitHub Check: test (3.10)
🔇 Additional comments (11)
motleycrew/agents/langchain/legacy_react.py (1)
6-6: LGTM! Imports correctly support the new callback handling.The
StdOutCallbackHandlerimport enables verbose output handling, and theRunnableConfigimport properly supports the existing type annotation at line 60.Also applies to: 9-9
motleycrew/agents/langchain/tool_calling_react.py (2)
11-11: LGTM! Import correctly supports the new callback handling.The
StdOutCallbackHandlerimport is properly added to enable verbose output with explicit callbacks.
215-225: Consistent implementation; verify version requirement.The callback handling implementation correctly mirrors the pattern in
legacy_react.py, ensuring consistent verbose behavior across both agent types. However, the same version discrepancy noted in the other file applies here: the comment references langchain-core 0.3.81 while the library context indicates version 0.3.0.Please verify the langchain-core version requirement as noted in the review comment for
legacy_react.py.tests/test_tools/test_image_utils.py (4)
109-119: LGTM!The test correctly validates that
convert_image_to_pngreturns the original bytes and MIME type unchanged for already-supported formats.
122-131: LGTM!The test provides good coverage by verifying passthrough behavior for all supported MIME types.
134-139: LGTM!The test correctly validates the expected EMF/WMF MIME types in the
LIBREOFFICE_FORMATSconstant.
142-147: LGTM!The test correctly validates the expected commonly-supported image MIME types.
motleycrew/utils/image_utils.py (4)
4-22: LGTM!The imports are appropriate for the new image conversion functionality, and the constants are well-defined. Using sets for
SUPPORTED_MIME_TYPESandLIBREOFFICE_FORMATSensures efficient membership checks.
25-88: LGTM!The LibreOffice conversion implementation is well-structured with proper error handling and graceful degradation. The subprocess call is safe since
soffice_pathcomes from the system PATH (viashutil.which) and all arguments are either hardcoded or constructed from a controlled temporary directory. The broad exception catching on line 86 is appropriate here as it ensures the function never crashes and always returns a valid result.The static analysis warnings (S603, BLE001) are false positives in this context.
91-117: LGTM!The Wand/ImageMagick conversion implementation follows best practices with lazy importing for the optional dependency and appropriate error handling. The function gracefully falls back to returning the original bytes on any failure.
The static analysis warnings (TRY300, BLE001) can be safely ignored—the current error handling structure is idiomatic and appropriate for this use case.
175-204: LGTM!The addition of
convert_image_to_pngpreprocessing ensures that unsupported image formats are converted to PNG before being sent to the LLM. This is a solid improvement that aligns with the new image conversion workflow. The switch to keyword arguments on line 200 also improves code clarity.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.