Skip to content

fix: Preserve KvlistValue as JSON object when nested_kv_max_depth > 0#317

Open
sheldor1510 wants to merge 1 commit intostreamfold:mainfrom
introspection-org:fix/nested-kvlist-flatten-guard
Open

fix: Preserve KvlistValue as JSON object when nested_kv_max_depth > 0#317
sheldor1510 wants to merge 1 commit intostreamfold:mainfrom
introspection-org:fix/nested-kvlist-flatten-guard

Conversation

@sheldor1510
Copy link

@sheldor1510 sheldor1510 commented Mar 4, 2026

Summary

When nested_kv_max_depth > 0, the flatten_keyvalues_* functions were unconditionally recursing into KvlistValue entries, expanding them into dotted paths. This caused KvlistValue attributes to be silently dropped when sitting alongside scalar siblings at the same level.

The Bug

When a KvlistValue sits alongside scalar siblings at the same level — for example:

is_active          = BoolValue(true)          // scalar
user_message       = KvlistValue{role, content}  // nested object

The flattening code would unconditionally recurse into the KvlistValue, producing dotted keys like user_message.role and user_message.content. These dotted keys then conflicted with the depth-tracking logic from nested_kv_max_depth > 0, causing the nested attribute to be silently lost in the ClickHouse output.

Tweaking max_depth does not help — the KvlistValue was being destroyed by flatten_keyvalues before depth tracking ever ran.

The Fix

Add a guard (if self.nested_kv_max_depth == 0) to the KvlistValue match arm in all three flatten variants:

  • flatten_keyvalues_map
  • flatten_keyvalues_borrowed
  • flatten_keyvalues_owned

When nested_kv_max_depth == 0 (flat mode, the default): behavior is unchanged — KvlistValue is recursively expanded into dotted paths.

When nested_kv_max_depth > 0 (nested mode): KvlistValue entries fall through to anyvalue_to_jsontype / anyvalue_to_jsontype_owned, which correctly converts them to JSON Objects with depth tracking.

Tests

Three new tests demonstrate the fix and verify backwards compatibility:

  1. test_nested_mode_preserves_kvlist_alongside_scalar_siblings — The core bug scenario: a KvlistValue (user_message) next to a scalar sibling (is_active) with nested_kv_max_depth = 3. Verifies the KvlistValue is stored as a JSON Object (not flattened into dotted paths), and checks the inner values (role, content) are correct.

  2. test_nested_mode_owned_preserves_kvlist_alongside_scalar_siblings — Same scenario for the owned variant (transform_attrs_kv_owned / flatten_keyvalues_owned), which is used when attributes need to outlive the source protobuf data.

  3. test_flat_mode_still_flattens_kvlist_into_dotted_paths — Backwards compatibility: with nested_kv_max_depth == 0 (the default), KvlistValue IS still flattened into dotted paths (user_message.role, user_message.content), ensuring no regression for users not using nested mode.

All 15 transformer tests pass.

When nested_kv_max_depth > 0 (default 3), KvlistValue attributes should
be stored as JSON objects rather than flattened into dotted paths. The
flatten_keyvalues_* functions were always recursing into KvlistValues,
which caused them to be silently dropped when sitting alongside scalar
siblings under the same parent.

Now KvlistValues are only flattened in flat mode (max_depth == 0). In
nested mode, they pass through to anyvalue_to_jsontype which correctly
handles Object conversion with depth tracking.
@sheldor1510 sheldor1510 force-pushed the fix/nested-kvlist-flatten-guard branch from b7d732e to 2779b7f Compare March 4, 2026 20:36
@rjenkins rjenkins requested a review from mheffner March 4, 2026 21:00
@mheffner mheffner self-assigned this Mar 5, 2026
@mheffner
Copy link
Contributor

mheffner commented Mar 5, 2026

Hey @sheldor1510!

So I'm trying to understand this a bit more since I'm not seeing the end behavior that you're noticing here. There's a test in the mod.rs file of the clickhouse exporter named test_export_nested_traces_to_clickhouse that I've been using to verify the end-to-end experience. It's not run by default since it requires a localhost Clickhouse, but you can run it with:

cargo test -p rotel test_export_nested_traces_to_clickhouse -- --ignored --nocapture

I've applied the following patch to the main branch which mimics your test scenario:
nested_kvlist_flatten.patch

When I then query ClickHouse I see this as the SpanAttributes:

SpanAttributes:     {
 "is_active": true,
 "user_message": {
  "content": "Hello",
  "role": "user"
 }
}

I believe that is the correct representation of these fields based on our interpretation of OTLP -> JSON encoding of attributes. I'm curious if you are expecting a different JSON representation in Clickhouse? When you query this data does it appear differently?

I see your test is comparing against the intermediary values generated from the transformer before they are sent to Clickhouse. I do think there is a difference there because Clickhouse treats dotted notations as flattened, nested keys by default. However, the end result should be the same.

I DO think there may be an issue here in that the original key flattening doesn't respect the depth argument and may actually produce a deeper depth in keys, eg. a.b.c, than if there were a top-level ArrayValue. However, I think that's different than this issue.

Anyways, would like to see what you're experiencing when these are encoded into Clickhouse and if the CH representation is coming out different.

@mheffner
Copy link
Contributor

@sheldor1510 Curious if you can test against #327

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants