WIP: Rewrite #67

Open
jaboatman wants to merge 50 commits into hydro-project:main from otonoma:main
@jaboatman

Hi,

At Otonoma we needed rust_sitter for internal use. In the process of writing a grammar, however, we decided to rewrite portions of it for ease of use. Many of these changes are stylistic (and significantly breaking), so I don't expect you to merge this pull request; we intend to maintain our fork separately.
I mostly wanted to make this community aware of the changes we are making in our fork. Some of the features would fit back in nicely, others maybe not.

Here is a brief summary of the changes:

  • Removed #[rust_sitter::grammar(...)] in favor of explicit derive(Rule) on types.
  • Implemented Extract on various additional base types. In particular, numeric types defer to FromStr, String extracts directly, and tuples can be extracted.
  • Implemented a tree-sitter-like DSL in leaf extraction, so you can write a leaf like this:
#[leaf(seq(Ident, ":", Variable, "?"))]
bindings: Vec<(String, (), String, Option<()>)>,
  • With the above, a leaf can name another rule, which determines the grammar rule to use, while the field is still extracted as a different type. For example:
// Defines identifiers in the language
#[derive(Rule)]
#[leaf(re(r"[a-zA-Z_]+"))]
struct Ident;

// enum Expression defined somewhere...
// Roughly corresponds to this tree-sitter rule:
// assign: $ => seq(
//    field("name", $.ident),
//    "=",
//    field("value", $.expression),
//  )
#[derive(Rule)]
pub struct Assign {
    #[leaf(Ident)] // tree-sitter uses `Ident` for the rule, but `Extract` still generates a `String`
    name: String,
    #[text("=")]
    _eq: (),
    value: Expression,
}
  • Improved error reporting: as much as possible, errors are produced in the macro expansion phase. Tree-sitter generation errors are still determined during build.rs execution, but they are printed more nicely, so you get the classic "Unresolved conflict for symbol sequence: ..." error message when applicable.
  • Others, and more planned.
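
To illustrate the `Extract` bullet above, here is a minimal, self-contained sketch of how such a trait could behave. It does not use rust_sitter itself: the `Node` type, the `Extract` signature, the `extract_via_from_str!` macro, and the `example` function are all hypothetical stand-ins, and the fork's actual trait may look quite different.

```rust
/// Hypothetical stand-in for a parsed syntax node: its source text plus children.
#[derive(Debug, Clone)]
struct Node {
    text: String,
    children: Vec<Node>,
}

impl Node {
    /// Builds a node with no children.
    fn leaf(text: &str) -> Self {
        Node { text: text.to_string(), children: Vec::new() }
    }
}

/// Hypothetical trait (an assumption, not the fork's real signature).
trait Extract: Sized {
    fn extract(node: &Node) -> Result<Self, String>;
}

// `String` extracts the node's text directly.
impl Extract for String {
    fn extract(node: &Node) -> Result<Self, String> {
        Ok(node.text.clone())
    }
}

// `()` accepts any node and discards it, e.g. for a literal like "=".
impl Extract for () {
    fn extract(_node: &Node) -> Result<Self, String> {
        Ok(())
    }
}

// Numeric types defer to their `FromStr` implementation via `str::parse`.
macro_rules! extract_via_from_str {
    ($($t:ty),*) => {$(
        impl Extract for $t {
            fn extract(node: &Node) -> Result<Self, String> {
                node.text
                    .parse::<$t>()
                    .map_err(|e| format!("cannot parse {:?}: {}", node.text, e))
            }
        }
    )*};
}
extract_via_from_str!(i32, i64, u32, u64, f64);

// A pair extracts each element from the child at the same position.
impl<A: Extract, B: Extract> Extract for (A, B) {
    fn extract(node: &Node) -> Result<Self, String> {
        match node.children.as_slice() {
            [a, b] => Ok((A::extract(a)?, B::extract(b)?)),
            other => Err(format!("expected 2 children, found {}", other.len())),
        }
    }
}

/// Extracts a `(String, i32)` pair from a node with two children.
fn example() -> Result<(String, i32), String> {
    let node = Node {
        text: "x42".to_string(),
        children: vec![Node::leaf("x"), Node::leaf("42")],
    };
    <(String, i32)>::extract(&node)
}
```

In this sketch the Rust field type alone selects the extraction strategy, which is the same idea that lets a field declared as `String` be backed by a different grammar rule such as `Ident`.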

Let me know what you think, and thank you for the project.

Joeoc2001 and others added 30 commits March 13, 2025 14:24
commit incoming to do the same with `leaf`
like rules in them:
* Can now provide text directly
* Can now provide a `re` or `pattern` function to specify a function
* Can now specify `choice` directly
* Can now specify `seq` directly
* Can now specify `optional` directly
the original implementation, we will need to allow TsInput to handle
references to other rules.
* Can transform on a node directly for more complex parsing
* Pass along `Point` for better span handling
* Begin experimenting with mapping functions...
@MingweiSamuel MingweiSamuel requested a review from shadaj September 3, 2025 01:57
@MingweiSamuel
Member

Appreciate all of this work and interest in rust-sitter. Pinging @shadaj to see how much of this we could upstream (eventually).

@jaboatman
Author

> Appreciate all of this work and interest on rust-sitter, pinging @shadaj to see how much of this we could upstream (eventually)

I forgot I had opened this pull request. Hopefully you weren't getting spammed with every commit.

I have since hacked on this project a lot, and I imagine it is mostly unrecognizable at this point. If you would like to review or walk through it, let me know. Otherwise, my intention is still to maintain a separate fork.
