Skip to content

Unified: Get rid of all $children fields#21841

Open
tausbn wants to merge 31 commits into
mainfrom
tausbn/unified-swift-named-body-fields
Open

Unified: Get rid of all $children fields#21841
tausbn wants to merge 31 commits into
mainfrom
tausbn/unified-swift-named-body-fields

Conversation

@tausbn
Copy link
Copy Markdown
Contributor

@tausbn tausbn commented May 12, 2026

A grab bag of changes, mostly making nodes named, or adding fields, and in a few cases doing some more advanced rewrites. Can be reviewed commit-by-commit.

tausbn added 30 commits May 12, 2026 12:57
Part 1 of N of "getting rid of $children" in node-types.yml

Note: in one of the cases the affected node still has the $children
field present. This is because there's some weirdness about recording
multiline comments as class member separators that I did not want to
figure out how to address right now.
I ended up also aliasing `_async_keyword` to a named node to make it
more consistent with the other node kinds that can be in this field (as
it would be awkward to have two named types and a token here).

Elsewhere in the node types, we'll still have `async?: "async"`, and I
think that's okay.
A lot of changes, but for the most part these are just adding named
fields in places where they make sense.

After this, there are still ~20 instances of unnamed children appearing.
This required a change in a different place, due to aliasing.
Some nodes with a single child (arguably redundant to do, but I think
it's nice to have the types be consistent), and also an instance of
ensuring that all branches of a `choice` expose consistent field names.
Not entirely happy about the mixed nature of the `kind` filed (having
both tokens and the named node `throw_keyword` in there), but that's a
problem for a different time.
Doing this involved materialising a lot of previously anonymous nodes,
and I'm not entirely sure it's the best solution, but the node types
look decent enough.
Of note: this involved un-inlining where_clause.
Not entirely sure about the `binding?` field on `pattern`, but it looks
like that might actually be useful.
Mostly by materialising a bunch of (useful) intermediate nodes.
I'm not entirely happy about this solution, but it seemed to be the most
straightforward way of avoiding various kinds of token bleeding.
This named node (which is in fact emitted by the scanner as an
`external`) was appearing as a child of `class_body` because of inlining
via `_class_member_separator`. This, in itself, appears to be somewhat
of a hack, to handle cases where a multiline comment signals the end of
a class member.

To fix this, we make the external node _unnamed_, but keep the `extras`
node _named_ (so we can still extract it from the parse tree), and we
add a new rule `multiline_comment` that mediates between the two. That
way, the use inside `_class_member_separator` can use the unnamed
variant, and no node is pushed into $children.
Introduces (by making it named) a `block` node, and conversely makes
`statements` anonymous. This enables us to sensibly distinguish between
the "then" and "else" branch of an `if_statement`, which we were not
able to previously.
The field representation would have made it difficult to figure out
which parameters correspond to which default values and attributes, so
instead we now encapsulate these in a new `function_parameter` node.
Same as in the preceding commit, these items do not make sense as
separate fields on the parent node, so we materialise (or create new)
intermediate nodes to group them together.
Same as in the preceding commit.
This groups together a bunch of related values that would otherwise be
impossible to match up correctly.
All of these fields have contents that are uniquely determined by the
node they appear on, so they convey no information.
@tausbn tausbn marked this pull request as ready for review May 26, 2026 14:04
Copilot AI review requested due to automatic review settings May 26, 2026 14:04
@tausbn tausbn requested a review from a team as a code owner May 26, 2026 14:04
@tausbn tausbn added the no-change-note-required This PR does not need a change note label May 26, 2026
@tausbn tausbn requested a review from asgerf May 26, 2026 14:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the unified Swift AST shape to eliminate generic $children lists in favor of named, typed fields (and a few new wrapper nodes), aligning the tree-sitter grammar, node-types metadata, and the CodeQL dbscheme.

Changes:

  • Replaces many Swift node $children/child relations with explicit named fields/relations in unified.dbscheme.
  • Updates node-types.yml to describe the new fielded AST structure (e.g., block.statement*, call_expression.function/suffix, tuple_expression.element+, etc.).
  • Adjusts the Swift tree-sitter grammar to emit the new fields and new node types (e.g., block, simple_user_type, parenthesized_type, tuple_expression_item, dictionary_literal_item).
Show a summary per file
File Description
unified/ql/lib/unified.dbscheme Reshapes the Swift schema to remove $children-style tables and introduce explicit field relations and new node types.
unified/extractor/tree-sitter-swift/node-types.yml Updates node field definitions to match the new named-field AST (removing $children usage).
unified/extractor/tree-sitter-swift/grammar.js Updates parsing rules/fields and introduces/renames nodes to produce the new AST shape.

Copilot's findings

  • Files reviewed: 2/4 changed files
  • Comments generated: 1

Comment thread unified/extractor/tree-sitter-swift/grammar.js Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants