ref(langchain): Remove set_data_normalized for primitive attributes #5510
alexander-alderman-webb wants to merge 1 commit into master
Semver Impact of This PR: 🟢 Patch (bug fixes)
Codecov Results 📊
✅ 13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 7.09s

📊 Comparison with Base Branch
✨ No test changes detected. All tests are passing successfully.
❌ Patch coverage is 0.00%. Project has 13670 uncovered lines.
Files with missing lines (180)

Coverage diff
@@            Coverage Diff             @@
##            main      #PR      +/-   ##
==========================================
+ Coverage   25.70%   30.84%   +5.14%
==========================================
  Files         189      189        —
  Lines       19767    19767        —
  Branches     6408     6408        —
==========================================
+ Hits         5080     6097    +1017
- Misses      14687    13670    -1017
- Partials      420      467      +47

Generated by Codecov Action
Cursor Bugbot has reviewed your changes and found 1 potential issue.
  for key, attribute in DATA_FIELDS.items():
      if key in all_params and all_params[key] is not None:
-         set_data_normalized(span, attribute, all_params[key], unpack=False)
+         span.set_data(attribute, all_params[key])
Unnormalized tool call data on spans
Medium Severity
Replacing set_data_normalized with span.set_data for all DATA_FIELDS can store non-primitive values (notably function_call/tool_calls) without JSON normalization. This can yield inconsistent encoding versus other paths that still use set_data_normalized, and may break span serialization or downstream consumers expecting a JSON string for SPANDATA.GEN_AI_RESPONSE_TOOL_CALLS.
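To make the concern concrete, here is a minimal sketch that does not touch the SDK. `FakeSpan`, `normalize_to_json`, and `consumer_reads` are hypothetical stand-ins: the first only stores values in a dict, the second approximates the JSON encoding this comment attributes to `set_data_normalized`, and the third models a consumer that assumes a JSON string. The real helper may behave differently.

```python
import json


class FakeSpan:
    """Hypothetical stand-in for a span; it only stores values in a dict."""

    def __init__(self):
        self.data = {}

    def set_data(self, key, value):
        self.data[key] = value


def normalize_to_json(value):
    """Hypothetical normalization: leave strings alone, JSON-encode the rest."""
    return value if isinstance(value, str) else json.dumps(value)


def consumer_reads(span):
    """A consumer that assumes the attribute holds a JSON string."""
    return json.loads(span.data["tool_calls"])


tool_calls = [{"name": "get_weather", "arguments": {"city": "Vienna"}}]

raw_span = FakeSpan()
raw_span.set_data("tool_calls", tool_calls)  # stored as a Python list

normalized_span = FakeSpan()
normalized_span.set_data("tool_calls", normalize_to_json(tool_calls))  # stored as a str
```

In this sketch `consumer_reads(normalized_span)` returns the original list, while `consumer_reads(raw_span)` raises `TypeError` because `json.loads` does not accept a list.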
  for key, attribute in DATA_FIELDS.items():
      if key in all_params and all_params[key] is not None:
-         set_data_normalized(span, attribute, all_params[key], unpack=False)
+         span.set_data(attribute, all_params[key])
Bug: The change in on_llm_start removes necessary serialization for complex objects like function_call and tool_calls by replacing set_data_normalized() with span.set_data().
Severity: MEDIUM
Suggested Fix
The logic should be updated to selectively apply span.set_data() only to primitive types within DATA_FIELDS. For complex fields like function_call and tool_calls, continue to use set_data_normalized() to ensure proper serialization.
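One way the suggested fix could look, as a standalone sketch rather than the actual integration code: primitives are stored directly and everything else is JSON-encoded first. `FakeSpan`, `EXAMPLE_FIELDS`, the `example.*` attribute names, and `set_attribute` are all hypothetical, and `json.dumps` is only a stand-in for `set_data_normalized()`.

```python
import json

# Types treated as primitive in this sketch (an assumption, not the SDK's definition).
PRIMITIVE_TYPES = (str, int, float, bool)

# Hypothetical mapping from parameter name to span attribute name.
EXAMPLE_FIELDS = {
    "temperature": "example.temperature",
    "top_p": "example.top_p",
    "tool_calls": "example.tool_calls",
}


class FakeSpan:
    """Hypothetical stand-in for a span; it only stores values in a dict."""

    def __init__(self):
        self.data = {}

    def set_data(self, key, value):
        self.data[key] = value


def set_attribute(span, attribute, value):
    """Store primitives as-is; JSON-encode any other value before storing."""
    if isinstance(value, PRIMITIVE_TYPES):
        span.set_data(attribute, value)
    else:
        span.set_data(attribute, json.dumps(value))  # stand-in for normalization


all_params = {"temperature": 0.2, "top_p": None, "tool_calls": [{"name": "lookup"}]}

span = FakeSpan()
for key, attribute in EXAMPLE_FIELDS.items():
    # Same guard as the hunk under review: skip missing and None values.
    if key in all_params and all_params[key] is not None:
        set_attribute(span, attribute, all_params[key])
```

Dispatching on the value's type keeps the loop unchanged; an alternative would be to list the complex keys explicitly, which is stricter but has to be updated whenever a new field is added.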
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: sentry_sdk/integrations/langchain.py#L392
Potential issue: The change in `on_llm_start` replaces `set_data_normalized()` with
`span.set_data()` for all `DATA_FIELDS`. While the intent was to do this for primitive
types, this change also affects complex objects like `function_call` and `tool_calls`.
These fields require serialization to a JSON string, which `set_data_normalized()`
performs but `span.set_data()` does not. Passing these complex objects directly can lead
to serialization failures when the span is transmitted, resulting in lost telemetry
data. This creates an inconsistency with `on_chat_model_start`, which still correctly
uses normalization for the same fields.


Description
Remove set_data_normalized() for attributes that do not require normalization.