ref(google-genai): Remove set_data_normalized for the gen_ai.response.finish_reasons attribute#5506

Draft
alexander-alderman-webb wants to merge 1 commit into master from webb/google-genai/remove-set-data-normalized

Conversation

@alexander-alderman-webb alexander-alderman-webb commented Feb 23, 2026

Description

Remove set_data_normalized() for attributes that do not require normalization.

Since extract_finish_reasons(response) returns a list of strings, we can directly serialize to JSON.
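
As a minimal sketch of the point above: a plain list of strings needs no normalization step before JSON encoding. The sample values here are assumptions for illustration and are not taken from the integration.

```python
import json

# Hypothetical finish reasons, standing in for what
# extract_finish_reasons(response) is described as returning.
finish_reasons = ["STOP", "MAX_TOKENS"]

# A list of strings is already JSON-serializable as-is.
serialized = json.dumps(finish_reasons)

# The encoded value round-trips back to the same list.
round_tripped = json.loads(serialized)
```

The stored attribute value is then a single string containing a JSON array, regardless of how many reasons the list holds.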

This attribute does not adhere to our conventions, as the conventions indicate that its type should be a string: https://github.com/getsentry/sentry-conventions/blob/0e79c16961a747152afc2c8d86311351d2c05554/model/attributes/gen_ai/gen_ai__response__finish_reasons.json#L4

@alexander-alderman-webb alexander-alderman-webb changed the title ref(google-genai): Remove set_data_normalized for gen_ai.response.finish_reasons attribute ref(google-genai): Remove set_data_normalized for gen_ai.response.finish_reasons attribute Feb 23, 2026
@alexander-alderman-webb alexander-alderman-webb changed the title ref(google-genai): Remove set_data_normalized for gen_ai.response.finish_reasons attribute ref(google-genai): Remove set_data_normalized for the gen_ai.response.finish_reasons attribute Feb 23, 2026
@alexander-alderman-webb alexander-alderman-webb marked this pull request as ready for review February 23, 2026 13:53
@alexander-alderman-webb alexander-alderman-webb requested a review from a team as a code owner February 23, 2026 13:53
github-actions bot commented Feb 23, 2026

Semver Impact of This PR

🟢 Patch (bug fixes)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


Bug Fixes 🐛

Openai

  • Avoid consuming iterables passed to the Completions API by alexander-alderman-webb in #5489
  • Avoid consuming iterables passed to the Embeddings API by alexander-alderman-webb in #5491

Other

  • (anthropic) Fix token accounting by shellmayr in #5490
  • (google-genai) Remove agent spans for simple requests by alexander-alderman-webb in #5443

Documentation 📚

  • New integration guide by alexander-alderman-webb in #5476

Internal Changes 🔧

  • (agents) Add sentry skills to be used by warden in CI reviews by ericapisani in #5485
  • (ai) Add configuration for dotagents by ericapisani in #5480
  • (github) Add warden configuration by ericapisani in #5484
  • (google-genai) Remove set_data_normalized for the gen_ai.response.finish_reasons attribute by alexander-alderman-webb in #5506
  • (openai-agents) Expect new tool fields by alexander-alderman-webb in #5471
  • (repo) Add .serena to .gitignore by ericapisani in #5464
  • 🤖 Update test matrix with new releases (02/19) by github-actions in #5483
  • 🤖 Update test matrix with new releases (02/18) by github-actions in #5475

🤖 This preview updates automatically when you update the PR.

Comment on lines +884 to 886
span.set_data(
    SPANDATA.GEN_AI_RESPONSE_FINISH_REASONS, json.dumps(finish_reasons)
)

Bug: Using json.dumps instead of set_data_normalized incorrectly serializes single-element finish_reasons lists, creating an inconsistency with the streaming path.
Severity: MEDIUM

Suggested Fix

Revert to using set_data_normalized(span, SPANDATA.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons). This function correctly handles both single-element lists by unpacking them and multi-element lists by JSON-serializing them, maintaining consistency with the streaming code path.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: sentry_sdk/integrations/google_genai/utils.py#L884-L886

Potential issue: The switch from `set_data_normalized` to `json.dumps` for setting the
`finish_reasons` data on a span introduces a data format inconsistency. The
`set_data_normalized` function unpacks single-element lists, meaning a list like
`["STOP"]` becomes the plain string `"STOP"`. The new implementation using `json.dumps`
will serialize it as the string `'["STOP"]'`. This creates a discrepancy with the
streaming implementation, which still uses `set_data_normalized` and produces a plain
string for a single finish reason. This change will break consumers of the span data
that expect a plain string for a single reason in non-streaming scenarios.
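
The difference described above can be sketched as follows. `unpacking_serialize` is a hypothetical stand-in that only mimics the unpacking behaviour attributed to `set_data_normalized` in this comment; it is not the SDK function, and the finish reason values are assumed.

```python
import json
from typing import List


def unpacking_serialize(values: List[str]) -> str:
    """Hypothetical stand-in: unpack a one-element list to its element,
    JSON-encode any other list."""
    if len(values) == 1:
        return values[0]
    return json.dumps(values)


def always_list_serialize(values: List[str]) -> str:
    """The approach in this PR: always JSON-encode the list."""
    return json.dumps(values)


single = ["STOP"]
multiple = ["STOP", "MAX_TOKENS"]

# One reason: the two approaches produce different strings.
unpacked_single = unpacking_serialize(single)    # plain string
listed_single = always_list_serialize(single)    # JSON array string

# Several reasons: both approaches agree.
unpacked_multi = unpacking_serialize(multiple)
listed_multi = always_list_serialize(multiple)
```

Under these assumptions, only the single-element case differs between the two approaches, which is the case the comment is concerned with.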

Contributor Author

The aim is to have a consistent type for an attribute, so if lists are possible they should always be lists.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

- set_data_normalized(
-     span, SPANDATA.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons
+ span.set_data(
+     SPANDATA.GEN_AI_RESPONSE_FINISH_REASONS, json.dumps(finish_reasons)

Inconsistent finish_reasons format between streaming and non-streaming

High Severity

The change creates inconsistent serialization for gen_ai.response.finish_reasons between streaming and non-streaming paths. Non-streaming now always stores as JSON array (e.g., '["STOP"]'), while streaming still uses set_data_normalized which unpacks single-element lists to plain strings (e.g., "STOP"). OpenTelemetry specifies this attribute as string[] type, requiring consistent array format. Consumers parsing this attribute will encounter different formats depending on whether streaming was used, breaking data consistency.
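
For illustration only, a consumer that has to cope with both formats while the two paths differ might look roughly like this. `parse_finish_reasons` is a hypothetical helper and not part of the SDK or of this PR.

```python
import json
from typing import List


def parse_finish_reasons(raw: str) -> List[str]:
    """Hypothetical consumer-side helper: accept either a JSON array
    string or a plain string, and always return a list of strings."""
    try:
        decoded = json.loads(raw)
    except json.JSONDecodeError:
        # Not valid JSON, so treat it as a single plain reason.
        return [raw]
    if isinstance(decoded, list):
        return [str(item) for item in decoded]
    # Valid JSON but not an array: keep the original text as one reason.
    return [raw]


from_non_streaming = parse_finish_reasons('["STOP"]')
from_streaming = parse_finish_reasons("STOP")
```

Once every path stores a JSON array, a tolerant reader like this would no longer be needed and a plain `json.loads` would suffice.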

Contributor Author

The aim is to have a consistent type for an attribute, so if lists are possible they should always be lists.

@alexander-alderman-webb alexander-alderman-webb marked this pull request as draft February 23, 2026 14:00
@alexander-alderman-webb alexander-alderman-webb marked this pull request as ready for review February 23, 2026 15:21
@alexander-alderman-webb alexander-alderman-webb marked this pull request as draft February 23, 2026 15:53