Skip to content

Implement repeat directive#2

Open
piotrszul wants to merge 5 commits intomasterfrom
issue/1_repeat_directive
Open

Implement repeat directive#2
piotrszul wants to merge 5 commits intomasterfrom
issue/1_repeat_directive

Conversation

@piotrszul
Copy link
Copy Markdown
Collaborator

Summary

  • Adds support for the SQL-on-FHIR repeat directive, which performs recursive traversal of nested structures (e.g. Questionnaire.item)
  • Implements traversal using fixed-depth list_cat unrolling with lateral column aliases for top-level repeats (evaluated once per row) and inline unrolling inside lambda scopes
  • Fixes unionAll inside repeat and mixed repeat/non-repeat unionAll branches
  • Adds --repeat-depth CLI flag (default: 5) to control maximum traversal depth
  • Comprehensive spec test coverage including nested repeats, combined forEach/forEachOrNull, sibling repeats, triple nesting, and unionAll combinations

Test plan

  • bun test — all 226 tests pass
  • Manual: test a ViewDefinition with repeat against real Questionnaire/QuestionnaireResponse NDJSON data
  • Manual: verify --repeat-depth 10 increases traversal depth for deeply nested structures

🤖 Generated with Claude Code

piotrszul and others added 5 commits May 1, 2026 20:16
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The vectorised list_cat
approach processes all rows in a single pipeline pass. Depth
defaults to 5 and is configurable via _nestedRepeatDepth.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
For top-level repeats (not inside a lambda), emit each depth level as
a lateral SELECT column alias instead of inlining the full expression
into every list_cat argument. This avoids re-evaluating ancestor levels
and makes DuckDB's deduplication explicit rather than relying on the
optimizer. Falls back to inline unrolling inside lambda scope (forEach,
nested repeat) where lateral aliases cannot reference lambda variables.

Also removes the double null-guard: _col_collection no longer appends
.ifnull2([]) when the value's outputType.nullHandled is already true,
eliminating the redundant coalesce(...,[]).ifnull2([]) chain emitted by
_repeat.

Benchmarks show no measurable runtime difference on the QuestionnaireResponse
repeat_view (ndjson ~9.6s, parquet ~97s), indicating DuckDB already CSEs
the inline expressions. The structural improvement still stands for
correctness and future query plan legibility.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Three bugs in ddb-sql-builder.js:
1. buildRepeatTransformSchema didn't walk _unionAll branches, producing an
   empty JSON schema ({}) when a repeat's column arg contained a unionAll.
2. A standalone _forEach inside a _repeat lambda (e.g. a unionAll branch
   inside repeat) incorrectly tried to iterate via _ri.as_list().list_transform
   instead of projecting the current item as a plain struct.
3. nav with inputType.json=true (a JSON[] repeat entry field accessed outside
   a lambda) generated el.field instead of el->>'field', causing a DuckDB
   conversion error when a non-repeat unionAll branch navigated into that field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes the previously internal _nestedRepeatDepth option as a proper
user-facing flag. Threads repeatDepth through templateToQuery and
buildQuery into astToSql options, and documents it in the README.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant