Skip to content

Fix #237 - Support multi-dimensional arrays in DataItemJsonConverter (supersedes #238)#243

Merged
markjbrown merged 2 commits into
mainfrom
fix/237-multidim-array-support
May 16, 2026
Merged

Fix #237 - Support multi-dimensional arrays in DataItemJsonConverter (supersedes #238)#243
markjbrown merged 2 commits into
mainfrom
fix/237-multidim-array-support

Conversation

@philnach
Copy link
Copy Markdown
Collaborator

Summary

Fixes #237 by adding multi-dimensional array support to DataItemJsonConverter. Supersedes #238 — original fix authored by @mlankamp, refactored to address review feedback and harden against pathological input.

What changed

  • WriteFieldValue no longer recurses into itself with a sentinel empty-string field name. Instead, two dedicated helpers handle in-array writes:
    • WriteArrayItems(writer, items, includeNullFields, depth) — iterates items inside an already-opened JSON array.
    • WriteArrayItemValue(writer, item, includeNullFields, depth) — single per-item dispatcher; the new nested-IEnumerable branch opens an unnamed nested array via WriteStartArray() / WriteEndArray().
  • Removes the implicit IsNullOrWhiteSpace(propertyName) sentinel from Fixes #237 - support multi-dimensional arrays #238 — legitimate empty-string and whitespace property names are now preserved verbatim.
  • Adds MaxJsonDepth = 64 and an EnsureDepth guard threaded through WriteDataItem, WriteFieldValue, WriteArrayItems, and WriteArrayItemValue. Pathological / recursively nested input now throws InvalidOperationException with a clear message instead of stack-overflowing the host process.

Why

The original PR #238 worked, but during review I had two design concerns:

  1. Sentinel-string overload. Passing "" as fieldName and branching on IsNullOrWhiteSpace overloads the method contract and makes legitimate empty-string keys ambiguous with the sentinel.
  2. Recursion with no depth guard. Newtonsoft.Json was historically bitten by stack-overflow on adversarially nested input. Utf8JsonWriter's built-in MaxDepth is bypassed when callers use SkipValidation = true (the existing tests do), so we cannot rely on it. A deterministic exception is the right failure mode for a migration tool.

Tests

17 new tests in DataItemJsonConverterTests.cs:

GeoJSON / issue #237 scenarios

Nested-array edge cases

  • Empty nested array [[]]
  • Nested array with nulls [[null,1],[null]]
  • Mixed scalars [[1,"a",true],[2.5,null]]
  • Nested array of IDataItem
  • Nested array of Dictionary<string,object?>
  • Wide nested array smoke (200×10)

Regression / contract preservation

  • Empty-string property name still emits "":...
  • Whitespace property name preserved verbatim
  • DeserializeAsJsonString round-trip on issue-237 doc

Depth guard / stack-safety

  • Array nesting at MaxJsonDepth succeeds
  • Array nesting at MaxJsonDepth + 1 throws
  • IDataItem nesting at MaxJsonDepth succeeds
  • IDataItem nesting at MaxJsonDepth + 1 throws
  • Mixed object/array nesting overflows correctly

All 125 Common.UnitTests pass (was 108; +17 new). Full solution test suite is green.

Compatibility

  • WriteFieldValue and WriteDataItem public/internal signatures are unchanged — depth threading is via private overloads with default 0.
  • No changes to Extensions/*.
  • The arrayItem is not string && arrayItem is IEnumerable predicate is intentionally kept broad to preserve backward compatibility with the outer field-write path. Tightening it (e.g., to exclude byte[]) is left as a follow-up.

Closes #237
Closes #238

Refactor WriteFieldValue to handle nested (multi-dimensional) arrays via a
dedicated WriteArrayItems / WriteArrayItemValue pair instead of recursing back
into WriteFieldValue with a sentinel empty-string field name. This:

- Restores GeoJSON support: Polygon/MultiPolygon `coordinates` arrays (depth
  3+) now serialize correctly instead of falling back to
  `System.Collections.Generic.List`1[System.Object]`.
- Removes the implicit `IsNullOrWhiteSpace(propertyName)` sentinel so legitimate
  empty-string and whitespace property names are preserved.
- Adds a MaxJsonDepth (64) guard that throws InvalidOperationException for
  pathological / recursively nested input instead of stack-overflowing the host
  process.

Tests:
- 17 new tests in DataItemJsonConverterTests covering GeoJSON Point/LineString/
  Polygon, the full issue-237 document, mixed scalars/nulls/IDataItems inside
  nested arrays, dictionary-of-dictionary nested arrays, wide-array smoke,
  empty/whitespace property-name regression guards, deserialize/re-serialize
  round-trip, and depth-guard at and over the limit (arrays, objects, mixed).
- All 125 Common.UnitTests pass; full solution test suite is green.

Supersedes #238. Original fix authored by Mart Lankamp - thanks!

Closes #237

Co-authored-by: Mart Lankamp <mart.lankamp@users.noreply.github.com>
@mlankamp
Copy link
Copy Markdown

Thank you!

Adds a Notes section to the JSON extension README explaining that multi-dimensional (nested) arrays are now supported on write (GeoJSON LineString/Polygon/MultiPolygon) and that JSON serialization enforces a maximum combined object + array nesting depth of 64 to guard against stack overflow on pathological input. Adds a cross-reference from the SQL Server extension README since JsonFields nested writes go through the same serializer.
@philnach
Copy link
Copy Markdown
Collaborator Author

@markjbrown, can you take a look at this. thucnguyen77 is working with GeoJSON and has a scenario that can leverage dmt with.

@markjbrown
Copy link
Copy Markdown
Collaborator

Will approve.

I understand why someone would want/need this. My only concern is performance impact if someone goes nuts and starts writing data that has extraordinary number of nested levels.

Cosmos is optimized for indexing/search of nesting levels up to 3 layers deep. As the layers of nesting go up, the RU cost go up for both writes (due to indexing) and queries. (point reads are unaffected).

Non-blocking. Just wanted to call out potential edge case.

@markjbrown markjbrown merged commit b841801 into main May 16, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support multi-dimensional arrays when writing json (example geojson polygon)

3 participants