Skip to content

Egor/dev 1016 change motleycrew error message to user from system for Anthropic compatibility#117

Merged
ZmeiGorynych merged 4 commits intomainfrom
egor/dev-1016-change-motleycrew-error-message-to-user-from-system-for
Mar 26, 2026
Merged

Egor/dev 1016 change motleycrew error message to user from system for Anthropic compatibility#117
ZmeiGorynych merged 4 commits intomainfrom
egor/dev-1016-change-motleycrew-error-message-to-user-from-system-for

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented Jan 3, 2026

Summary by CodeRabbit

  • New Features

    • Enhanced image handling (format conversion and PNG conversion) and a new end-to-end Google Slides image example
    • Structured output helper now accepts plain string inputs
    • Verbose callback support for agent executors to improve verbose output
  • Bug Fixes

    • Improved error output handling for Anthropic models to avoid message ordering issues
  • Tests

    • Added tests covering image conversion and Anthropic error recovery
  • Chores

    • Version bumped to 0.3.7

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

Adds CodeRabbit CI config, new image workflow example, image conversion utilities (PNG conversion via LibreOffice/Wand), Anthropic-compatible error-message handling in agents, broader input handling for structured output utility, langchain callback adjustments, tests for image utils and Anthropic output recovery, and a version bump.

Changes

Cohort / File(s) Summary
Configuration
\.coderabbit.yaml
Adds CodeRabbit YAML config enabling automated reviews for drafts and sets a YAML language-server schema.
Project metadata
pyproject\.toml
Bumps package version (0.3.4 → 0.3.7) and adds wand dependency.
Examples
examples/image_from_gslides_example.py
New end-to-end example demonstrating ImageElement → ImageData fetching, file saving, and conversion to LLM-friendly messages with error handling and backward-compat checks.
Image utilities & tests
motleycrew/utils/image_utils.py, tests/test_tools/test_image_utils.py
Adds PNG conversion helpers (convert_image_to_png), constants (SUPPORTED_MIME_TYPES, LIBREOFFICE_FORMATS), LibreOffice/Wand conversion paths, logging, and tests validating conversions and constants.
Agent message handling & tests
motleycrew/agents/mixins.py, tests/test_agents/test_langchain_output_handler.py
On error actions, record error output as a human message instead of system to avoid non-consecutive system messages with Anthropic; adds a test verifying Anthropic output-handler error recovery.
Structured output utility
motleycrew/utils/structured_output_with_retries.py
Expands input_messages accepted types to include str and normalizes string input to a single HumanMessage at runtime; docstrings updated.
LangChain agent executors
motleycrew/agents/langchain/legacy_react.py, motleycrew/agents/langchain/tool_calling_react.py
Add StdOutCallbackHandler import and pass callbacks (StdOut when verbose) to AgentExecutor; introduce RunnableConfig usage in legacy_react.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 New configs hop in, tidy and spry,
Images transform as pixels fly by,
Errors now spoken in human tone,
Strings embraced where lists once had grown,
Tests clap their paws — the code garden's shown! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 65.38% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: modifying error message handling in motleycrew to use 'user' role instead of 'system' for Anthropic compatibility, which is reflected across multiple files (mixins.py, test files, and agent files).
✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@ZmeiGorynych ZmeiGorynych changed the title Egor/dev 1016 change motleycrew error message to user from system for Egor/dev 1016 change motleycrew error message to user from system for Anthropic compatibility Jan 3, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_agents/test_langchain_output_handler.py (1)

159-175: Consider strengthening the test coverage.

While this test verifies that the agent completes successfully with Anthropic models, it doesn't explicitly verify that the error recovery path (converting system messages to human messages) is actually exercised. The test assumes the agent "may initially try to return directly" but this behavior isn't guaranteed.

Consider one of these approaches for more robust coverage:

  1. Use mocking or instrumentation to verify that the error recovery code path in agent_plan_decorator is actually executed
  2. Add assertions that check intermediate steps to confirm error actions were created and handled
  3. Force a scenario that definitely triggers the error recovery (e.g., by temporarily patching the agent's behavior)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 42a7043 and a7570ea.

📒 Files selected for processing (6)
  • .coderabbit.yaml
  • examples/image_from_gslides_example.py
  • motleycrew/agents/mixins.py
  • motleycrew/utils/structured_output_with_retries.py
  • pyproject.toml
  • tests/test_agents/test_langchain_output_handler.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_agents/test_langchain_output_handler.py (3)
motleycrew/common/enums.py (2)
  • LLMFramework (16-20)
  • LLMProvider (4-13)
motleycrew/common/llms.py (1)
  • init_llm (242-262)
motleycrew/agents/langchain/tool_calling_react.py (1)
  • ReActToolCallingMotleyAgent (98-235)
🪛 Flake8 (7.3.0)
tests/test_agents/test_langchain_output_handler.py

[error] 126-126: 'langchain_anthropic.ChatAnthropic' imported but unused

(F401)

🔇 Additional comments (10)
pyproject.toml (1)

3-3: LGTM!

Version bump from 0.3.4 to 0.3.6 is clean and appropriate.

motleycrew/agents/mixins.py (2)

71-74: LGTM!

The change from "system" to "human" for error messages correctly addresses Anthropic's constraint against consecutive system messages. The inline comment clearly explains the rationale.


139-142: LGTM!

Consistent application of the fix in the async wrapper. Both synchronous and asynchronous paths now handle error messages identically.

.coderabbit.yaml (1)

1-7: LGTM!

CodeRabbit configuration is valid and appropriately enables automated reviews on draft PRs and all branches. The schema reference ensures proper validation by language servers.

motleycrew/utils/structured_output_with_retries.py (3)

16-16: LGTM!

Type annotation correctly expanded to accept string input while maintaining backward compatibility.


26-28: LGTM!

Docstring clearly documents the new string input handling and explains the behavior for all three supported input types.


36-37: LGTM!

String normalization logic correctly converts string input to the expected List[HumanMessage] format before agent invocation.

tests/test_agents/test_langchain_output_handler.py (1)

151-157: No action required. The model name "claude-sonnet-4-20250514" is a valid and current Anthropic Claude model identifier documented in Anthropic's API as of January 2026.

examples/image_from_gslides_example.py (2)

131-147: Demo script structure looks good.

The main execution block effectively orchestrates all demonstrations and provides a helpful summary of the new capabilities. The overall structure of this example script is clear and well-documented.


29-29: Verify the mock patch path is effective.

The mock path "gslides_api.element.element.requests.get" targets a different module than the public import path. Ensure this path correctly intercepts the requests.get call made by ImageElement.get_image_data(). If the mock is ineffective, the code will attempt real HTTP requests instead of using the mocked response.

Comment thread examples/image_from_gslides_example.py
Comment on lines +125 to +128
try:
from langchain_anthropic import ChatAnthropic
except ImportError:
pytest.skip("langchain-anthropic not installed")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unused import.

The ChatAnthropic import on line 126 is never used in the test. The test uses init_llm() to create the Anthropic LLM instance, so this import is unnecessary.

🔎 Proposed fix
-    try:
-        from langchain_anthropic import ChatAnthropic
-    except ImportError:
+    try:
+        import langchain_anthropic  # noqa: F401
+    except ImportError:
         pytest.skip("langchain-anthropic not installed")
🧰 Tools
🪛 Flake8 (7.3.0)

[error] 126-126: 'langchain_anthropic.ChatAnthropic' imported but unused

(F401)

🤖 Prompt for AI Agents
In tests/test_agents/test_langchain_output_handler.py around lines 125 to 128,
the try/except imports ChatAnthropic which is never used; remove the unused
ChatAnthropic import and its try/except block, leaving only the pytest.skip
guard if langchain-anthropic isn't required or refactor to check availability
another way (e.g., attempt importing only when needed or simply remove both
lines so the test uses init_llm() without the unused import).

Comment thread motleycrew/agents/mixins.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Fix all issues with AI Agents 🤖
In @motleycrew/agents/langchain/legacy_react.py:
- Around line 106-114: Update the comment above the callbacks logic to reference
the correct langchain-core version constraint or generalize it: replace "As of
langchain-core 0.3.81" with either "As of langchain-core 0.3.72" (the project's
minimum) or a broader "As of langchain-core 0.3+" so it matches the project's
dependency; keep the implementation that sets callbacks =
[StdOutCallbackHandler()] if verbose else None and its use in AgentExecutor
(agent, tools=langchain_tools, handle_parsing_errors, verbose, callbacks).

In @pyproject.toml:
- Line 19: The pyproject dependency pin for wand is outdated; update the
dependency line "wand" to "^0.7.0" in pyproject.toml and implement ImageMagick
security practices: ensure all image processing that uses the wand (MagickWand)
binding is moved out of the main request thread into background workers/queues
(e.g., the functions that call wand.Image() or any MagickWand usage), reject or
sandbox processing of untrusted inputs, and deploy a hardened ImageMagick
policy.xml with strict resource limits and disabled dangerous formats to
mitigate CVE-style risks.
🧹 Nitpick comments (1)
motleycrew/utils/image_utils.py (1)

120-142: Consider case-insensitive matching for consistency.

Line 138 uses source_mime_type.lower() for case-insensitive matching against LIBREOFFICE_FORMATS, but line 134 does not use .lower() when checking SUPPORTED_MIME_TYPES. While MIME types are conventionally lowercase and the calling code appears to always provide lowercase values, applying .lower() consistently would make the function more robust.

🔎 Proposed fix for consistency
-    if source_mime_type in SUPPORTED_MIME_TYPES:
+    if source_mime_type.lower() in SUPPORTED_MIME_TYPES:
         return image_bytes, source_mime_type
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7570ea and 54a2412.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • motleycrew/agents/langchain/legacy_react.py
  • motleycrew/agents/langchain/tool_calling_react.py
  • motleycrew/utils/image_utils.py
  • pyproject.toml
  • tests/test_tools/test_image_utils.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/test_tools/test_image_utils.py (1)
motleycrew/utils/image_utils.py (4)
  • convert_image_to_png (120-142)
  • human_message_from_image_bytes (145-150)
  • image_file_to_bytes_and_mime_type (153-172)
  • is_this_a_chart (175-204)
🪛 Ruff (0.14.10)
motleycrew/utils/image_utils.py

52-52: subprocess call: check for execution of untrusted input

(S603)


86-86: Do not catch blind exception: Exception

(BLE001)


109-109: Consider moving this statement to an else block

(TRY300)


115-115: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: test (3.12)
  • GitHub Check: test (3.10)
🔇 Additional comments (11)
motleycrew/agents/langchain/legacy_react.py (1)

6-6: LGTM! Imports correctly support the new callback handling.

The StdOutCallbackHandler import enables verbose output handling, and the RunnableConfig import properly supports the existing type annotation at line 60.

Also applies to: 9-9

motleycrew/agents/langchain/tool_calling_react.py (2)

11-11: LGTM! Import correctly supports the new callback handling.

The StdOutCallbackHandler import is properly added to enable verbose output with explicit callbacks.


215-225: Consistent implementation; verify version requirement.

The callback handling implementation correctly mirrors the pattern in legacy_react.py, ensuring consistent verbose behavior across both agent types. However, the same version discrepancy noted in the other file applies here: the comment references langchain-core 0.3.81 while the library context indicates version 0.3.0.

Please verify the langchain-core version requirement as noted in the review comment for legacy_react.py.

tests/test_tools/test_image_utils.py (4)

109-119: LGTM!

The test correctly validates that convert_image_to_png returns the original bytes and MIME type unchanged for already-supported formats.


122-131: LGTM!

The test provides good coverage by verifying passthrough behavior for all supported MIME types.


134-139: LGTM!

The test correctly validates the expected EMF/WMF MIME types in the LIBREOFFICE_FORMATS constant.


142-147: LGTM!

The test correctly validates the expected commonly-supported image MIME types.

motleycrew/utils/image_utils.py (4)

4-22: LGTM!

The imports are appropriate for the new image conversion functionality, and the constants are well-defined. Using sets for SUPPORTED_MIME_TYPES and LIBREOFFICE_FORMATS ensures efficient membership checks.


25-88: LGTM!

The LibreOffice conversion implementation is well-structured with proper error handling and graceful degradation. The subprocess call is safe since soffice_path comes from the system PATH (via shutil.which) and all arguments are either hardcoded or constructed from a controlled temporary directory. The broad exception catching on line 86 is appropriate here as it ensures the function never crashes and always returns a valid result.

The static analysis warnings (S603, BLE001) are false positives in this context.


91-117: LGTM!

The Wand/ImageMagick conversion implementation follows best practices with lazy importing for the optional dependency and appropriate error handling. The function gracefully falls back to returning the original bytes on any failure.

The static analysis warnings (TRY300, BLE001) can be safely ignored—the current error handling structure is idiomatic and appropriate for this use case.


175-204: LGTM!

The addition of convert_image_to_png preprocessing ensures that unsupported image formats are converted to PNG before being sent to the LLM. This is a solid improvement that aligns with the new image conversion workflow. The switch to keyword arguments on line 200 also improves code clarity.

Comment thread motleycrew/agents/langchain/legacy_react.py
Comment thread pyproject.toml
@ZmeiGorynych ZmeiGorynych merged commit 527bbb8 into main Mar 26, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants