Lexer, recovering parser, lossless CST, typed AST, HIR with name resolution, schema-aware semantic analysis, diagnostics engine, formatter, incremental analysis database, language server, agent-facing JSON API, CLI. openCypher v9 front-end (93.2 % TCK acceptance) with in-progress GQL ISO/IEC 39075:2024 support (parser bootstrap landed — 11.1 % on a 7-feature seed corpus; see coverage).
Rust-compiler-grade. No execution. No domain coupling.
A compiler front-end for a query language — not a database. Downstream consumers execute the typed Plan IR against their own storage.
Cypher / GQL text
      │
      ▼
cyrs-syntax    lexer, recovering parser, lossless CST
      │
      ▼
cyrs-ast       typed AST wrappers over the CST
      │
      ▼
cyrs-hir       lowered HIR, name resolution, scope graph
      │
      ▼
cyrs-sema      type system + semantic analysis (schema-aware)
      │
      ▼
cyrs-plan      logical read / write Plan IR
      │
      ▼
consumer executes against its own storage
Schema, custom functions, and write-clause semantics are plugged in through trait implementations — not baked into this workspace. Graph stores, analytic engines, IDE integrations, and agent tooling are equal-weight downstream consumers.
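As a rough illustration of the trait-based plug-in point described above, the sketch below defines a stand-in schema trait and a consumer-side implementation. Everything in it is hypothetical: the has_label method, InMemorySchema, and unknown_labels are invented for this example, and the real cyrs-schema::SchemaProvider signature may look quite different.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a schema trait; the real
/// `cyrs-schema::SchemaProvider` signature may differ.
trait SchemaProvider {
    fn has_label(&self, label: &str) -> bool;
}

/// A consumer-side schema backed by an in-memory set of labels (assumed type).
struct InMemorySchema {
    labels: HashSet<String>,
}

impl SchemaProvider for InMemorySchema {
    fn has_label(&self, label: &str) -> bool {
        self.labels.contains(label)
    }
}

/// Illustrative analysis step: return the labels the schema does not know.
/// It sees only the trait object, never the concrete schema type.
fn unknown_labels<'a>(schema: &dyn SchemaProvider, used: &[&'a str]) -> Vec<&'a str> {
    used.iter().copied().filter(|l| !schema.has_label(l)).collect()
}

fn main() {
    let schema = InMemorySchema {
        labels: ["Person", "Movie"].iter().map(|s| s.to_string()).collect(),
    };
    let unknown = unknown_labels(&schema, &["Person", "Actor"]);
    println!("{:?}", unknown);
}
```

The point of the shape is that the analysis function depends only on the trait, so a graph store, an IDE, or an agent tool can each supply its own schema source without the front-end knowing about it.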
The authoritative crate graph and allowed-edges list live in docs/specs/0001-cypher-frontend.md §3.
- 0001-cypher-frontend.md: architecture, crate graph, testing bar.
- 0002-schema-file-format.md: schema.toml file format + diff.
- 0003-project-manifest.md: cypher-project.toml workspace manifest.
- 0004-interop-surfaces.md: WASM, C FFI, PyO3, LSP-Web, tree-sitter parity.
cyrs has six plausible consumption layers (CST, AST, HIR, sema, Plan,
agent JSON). Which one you should consume depends on what you're
building (graph database vs. IDE vs. rewriter vs. parser-bench).
docs/integration-depth.md is the
decision table + per-layer reference that answers that question
before you cargo add anything.
cargo install cyrs-cli
cypher parse demo/samples/good.cyp
cypher fmt demo/samples/needs_fmt.cyp
cypher check demo/samples/unknown_var.cyp

cyrs-cli ships the cypher binary with parse, check, fmt,
plan, explain, and schema-file operations (schema load,
schema check, schema diff; see
spec 0002). The Rust API is
available as the cyrs-lang meta-crate.
- Lossless CST: every byte preserved, round-trip guaranteed.
- Recovering parser: editor-grade, one bad token does not cascade.
- Typed AST: codegen'd from cypher.ungrammar; zero hand-written accessors.
- Scope graph + name resolution: HIR layer handles WITH, UNWIND, aggregation scopes, and pattern bindings.
- Schema-aware semantic analysis: schema is a trait (cyrs-schema::SchemaProvider); no hard-coded assumptions.
- Stable diagnostic codes: E0001…, W6000…, N8000…. See spec §10. Codes are SemVer: once assigned, meaning never changes.
- Idempotent formatter: fmt(fmt(x)) == fmt(x), round-trips through the parser.
- Salsa-backed incremental DB: cyrs-db re-computes only the affected queries on every edit.
- LSP server + JSON agent API: both share a single cyrs-lang-services engine layer; zero logic duplication.
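Two of the properties above, the lossless round-trip and the formatter's idempotence, can be stated as executable checks. The sketch below uses a toy whitespace-only lexer and formatter to show the shape of those checks; Kind, lex, unparse, and fmt are hypothetical and unrelated to the real cyrs-syntax or cyrs-fmt APIs, and the toy formatter ignores string literals entirely.

```rust
/// Toy token kinds; hypothetical, not the real `SyntaxKind`.
#[derive(Debug, PartialEq)]
enum Kind {
    Whitespace,
    Text,
}

/// Split the input into alternating whitespace / non-whitespace tokens.
/// No byte is dropped, so concatenating the tokens rebuilds the input.
fn lex(src: &str) -> Vec<(Kind, &str)> {
    let mut tokens = Vec::new();
    let mut start = 0;
    let mut prev: Option<bool> = None;
    for (i, ch) in src.char_indices() {
        let ws = ch.is_whitespace();
        if let Some(p) = prev {
            if p != ws {
                let kind = if p { Kind::Whitespace } else { Kind::Text };
                tokens.push((kind, &src[start..i]));
                start = i;
            }
        }
        prev = Some(ws);
    }
    if let Some(p) = prev {
        let kind = if p { Kind::Whitespace } else { Kind::Text };
        tokens.push((kind, &src[start..]));
    }
    tokens
}

/// Lossless round-trip: the token texts concatenated in order.
fn unparse(tokens: &[(Kind, &str)]) -> String {
    tokens.iter().map(|(_, text)| *text).collect()
}

/// Toy formatter: drop leading/trailing whitespace and collapse
/// every inner whitespace run to a single space.
fn fmt(src: &str) -> String {
    lex(src)
        .into_iter()
        .filter(|(kind, _)| *kind == Kind::Text)
        .map(|(_, text)| text)
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let src = "MATCH   (n)\n  RETURN\tn ";
    // Round-trip: every byte of the input survives lexing.
    assert_eq!(unparse(&lex(src)), src);
    // Idempotence: formatting an already formatted query changes nothing.
    let once = fmt(src);
    assert_eq!(fmt(&once), once);
    println!("{}", once);
}
```

Checks of this shape lend themselves to property testing or fuzzing, since both invariants should hold for any input, including malformed queries.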
cyrs is a Cypher front-end first; GQL ISO/IEC 39075:2024 support is a deliberate, in-progress second track. The numbers below are rolling measurements written by the TCK harness (spec §17.5), not aspirations.
| Surface | Corpus | Result | Source |
|---|---|---|---|
| openCypher v9 | upstream openCypher TCK 2024.3 (220 feature files, 3 897 expanded scenarios) | 3 632 / 3 897 accepted (93.2 %) | crates/cyrs-tck/tck/full-baseline.md |
| GQL ISO/IEC 39075:2024 | hand-authored bootstrap (7 feature files, 18 expanded scenarios) | 2 / 18 accepted (11.1 %) | crates/cyrs-tck/tck/gql-iso-39075/baseline.md |
Caveat on the openCypher number. "Accepted" means the parser emits
zero syntax errors for the scenario's When executing query: step. It
is not an end-to-end pass-rate: the front-end does no execution
(spec §1.3 N1), and Expected::Error scenarios are still untriaged
(Expected::Ignored) — see the baseline file's preamble. Treat 93.2 %
as parser acceptance, not semantic conformance.
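For reference, the percentages in the table are plain accepted-over-total ratios rounded to one decimal place. The helper below is a hypothetical illustration of that arithmetic, not the TCK harness's own code.

```rust
/// Acceptance rate as a percentage string, rounded to one decimal place.
/// Hypothetical helper; the TCK harness may compute and format this differently.
fn acceptance_pct(accepted: u32, total: u32) -> String {
    format!("{:.1}", 100.0 * f64::from(accepted) / f64::from(total))
}

fn main() {
    // Counts taken from the coverage table.
    println!("openCypher v9: {} %", acceptance_pct(3632, 3897));
    println!("GQL bootstrap: {} %", acceptance_pct(2, 18));
}
```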
State of GQL. The GQL track is a parser bootstrap (bead cy-0hj): the corpus pins the GQL-distinct surface so future beads can land parser changes against a stable set of scenarios. The following GQL-only constructs are explicitly not yet implemented:
- INSERT NODE (GQL insert syntax distinct from Cypher CREATE)
- FILTER clause
- REPEATABLE ELEMENTS
- IS TYPED predicate
- ANY SHORTEST path selector
If you need GQL parity today, this is not it. If you want a production-track Cypher front-end with a credible path to GQL alignment, read on.
Pre-0.1. The spec is accepted and locked; implementation is in progress. Expect breakage. Do not use in production yet.
Start with the spec:
docs/specs/0001-cypher-frontend.md.
Twenty-three numbered sections from scope through testing. Before adding
features, touching architecture, or filing issues, open the spec and
reference section numbers.
A no-plugin Neovim walkthrough that spins up the language server, publishes diagnostics, and runs format-on-save against real queries:
cargo build --release -p cyrs-lsp
nvim -u demo/nvim/init.lua demo/samples/unclosed_paren.cyp

See demo/README.md for the full tour (samples,
format-on-save, CLI comparison) and demo/demo.gif
for the recording.
For VS Code / VSCodium, the language client lives at
editors/vscode/ — see its
README for dev-install instructions
(marketplace publishing is a manual maintainer step).
| Crate | Purpose |
|---|---|
| cyrs-syntax | Lexer, recovering parser, lossless CST, SyntaxKind |
| cyrs-ast | Typed AST wrappers over the CST |
| cyrs-hir | Lowered HIR, name resolution, scope graph, desugaring |
| cyrs-sema | Semantic analysis + type system |
| cyrs-schema | SchemaProvider trait + supporting types |
| cyrs-diag | Diagnostic type, stable code registry, rendering backends |
| cyrs-plan | Logical read / write plan IR |
| cyrs-fmt | CST-driven formatter |
| cyrs-db | Salsa-based incremental analysis database |
| cyrs-lang-services | Shared completion / hover / rewrite engines |
| cyrs-lsp | Language server binary |
| cyrs-agent | JSON-over-stdio agent API binary |
| cyrs-cli | cypher {parse,check,fmt,explain,plan} |
| cyrs-tck | openCypher TCK harness |
| cyrs-testkit | Shared test fixtures, compiletest runner (dev only) |
| cyrs-lang | Meta-crate re-exporting the library surface |
- No execution engine, runtime, or storage. Consumers own that (spec §1.3 N1, §12.5).
- No domain concepts. The workspace is deliberately free of application vocabulary — CI greps for it (spec §2.C2).
- No overlay crate host. Domain extensions live in consumer repositories and plug in via the traits in spec §8 (spec §2.C3).
- No Neo4jCurrent dialect in v1: no APOC, no EXISTS {} subqueries, no CALL { ... }, no LOAD CSV, no SHOW, no CYPHER prefixes (spec §9.3).
A parallel tree-sitter grammar for Cypher / GQL lives at
tree-sitter-cypher/ for editor integrations
(Neovim, Helix, GitHub highlighter). The Rust parser in
cyrs-syntax is authoritative; the tree-sitter grammar is a
hand-maintained artefact kept in lock-step by the
cargo xtask tree-sitter-parity gate.
Parity claim: the grammar parses the same TCK v1 surface as the Rust
parser — every outcome = "ok" scenario in
crates/cyrs-tck/tck/v1.toml parses without (ERROR) nodes, every
outcome = "error" scenario produces at least one. Regressions fail CI.
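The parity rule above amounts to a per-scenario predicate: an "ok" scenario must parse with no (ERROR) node, and an "error" scenario must produce at least one. The sketch below shows that predicate in isolation. The Scenario struct and its has_error_node flag are assumptions made for the example; the real cargo xtask tree-sitter-parity gate would derive that flag by actually running the tree-sitter parser.

```rust
/// Expected outcome recorded for a scenario (mirrors `outcome = "ok" | "error"`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Ok,
    Error,
}

/// Hypothetical scenario record. The query text is omitted, and
/// `has_error_node` stands in for "the tree-sitter tree contains an (ERROR) node".
struct Scenario {
    name: &'static str,
    outcome: Outcome,
    has_error_node: bool,
}

/// Names of the scenarios that break parity.
fn parity_violations(scenarios: &[Scenario]) -> Vec<&'static str> {
    scenarios
        .iter()
        .filter(|s| match s.outcome {
            // An accepted query must parse cleanly.
            Outcome::Ok => s.has_error_node,
            // A rejected query must surface at least one error node.
            Outcome::Error => !s.has_error_node,
        })
        .map(|s| s.name)
        .collect()
}

fn main() {
    let scenarios = [
        Scenario { name: "match-return", outcome: Outcome::Ok, has_error_node: false },
        Scenario { name: "unclosed-paren", outcome: Outcome::Error, has_error_node: true },
        Scenario { name: "silently-accepted", outcome: Outcome::Error, has_error_node: false },
    ];
    let bad = parity_violations(&scenarios);
    println!("{:?}", bad);
    // A CI gate would exit non-zero whenever `bad` is non-empty.
}
```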
Neovim (nvim-treesitter):

local parser_config = require("nvim-treesitter.parsers").get_parser_configs()
parser_config.cypher = {
  install_info = {
    url = "https://github.com/phall1/cyrs",
    location = "tree-sitter-cypher",
    files = { "src/parser.c" },
    branch = "main",
    generate_requires_npm = true,
    requires_generate_from_grammar = true,
  },
  filetype = "cypher",
}

Then run :TSInstall cypher.
Helix (languages.toml):

[[language]]
name = "cypher"
scope = "source.cypher"
file-types = ["cyp", "cypher"]
roots = []
comment-token = "//"

[[grammar]]
name = "cypher"
source = { git = "https://github.com/phall1/cyrs", subpath = "tree-sitter-cypher" }

Then run hx --grammar fetch && hx --grammar build.
See tree-sitter-cypher/README.md for
the full scope list and developer workflow.
Spec §17 grades testing to the rust-compiler standard:
cargo test --workspace # unit + integration + snapshots
cargo insta review # snapshot review
cargo llvm-cov --workspace --html # coverage
cargo fuzz run fuzz_parser -- -max_total_time=300 # fuzz (nightly only)
cargo mutants -- -p cyrs-sema # mutation testing
cargo bench --workspace # criterion benchmarks
AGENTS.md is the canonical context an agent reads before
working on the front-end. Commits cite the spec section and the
corresponding bead ID (cy-{3char}). Beads live at br and track
ongoing work.
Cyrs is pre-1.0; see docs/stability.md for the
surface-by-surface stability contract (diagnostic codes, agent wire
protocol, schema file format, HIR / Plan IR shape, 1.0 cutover plan).
PRs are gated by cargo-semver-checks.
After cloning, install the pre-commit hook so cargo xtask gate
runs automatically on every commit:
bash scripts/install-hooks.sh

The gate runs cargo fmt --check, cargo clippy -D warnings,
cargo test, and cargo deny check against the workspace.
Dual-licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT license (LICENSE-MIT)
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual-licensed as above, without any additional terms or conditions.
