Skip to content

phall1/cyrs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

67 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cyrs — a Rust front-end for Cypher / GQL

crates.io docs.rs CI MSRV License: Apache-2.0 OR MIT

Lexer, recovering parser, lossless CST, typed AST, HIR with name resolution, schema-aware semantic analysis, diagnostics engine, formatter, incremental analysis database, language server, agent-facing JSON API, CLI. openCypher v9 front-end (93.2 % TCK acceptance) with in-progress GQL ISO/IEC 39075:2024 support (parser bootstrap landed — 11.1 % on a 7-feature seed corpus; see coverage).

Rust-compiler-grade. No execution. No domain coupling.

cyrs-lsp + nvim demo


What this is

A compiler front-end for a query language — not a database. Downstream consumers execute the typed Plan IR against their own storage.

  Cypher / GQL text
        │
        ▼
  cyrs-syntax      lexer, recovering parser, lossless CST
        │
        ▼
  cyrs-ast         typed AST wrappers over the CST
        │
        ▼
  cyrs-hir         lowered HIR, name resolution, scope graph
        │
        ▼
  cyrs-sema        type system + semantic analysis (schema-aware)
        │
        ▼
  cyrs-plan        logical read / write Plan IR
        │
        ▼
  consumer executes against its own storage

Schema, custom functions, and write-clause semantics are plugged in through trait implementations — not baked into this workspace. Graph stores, analytic engines, IDE integrations, and agent tooling are equal-weight downstream consumers.

The authoritative crate graph and allowed-edges list lives in docs/specs/0001-cypher-frontend.md §3.

Specs

Embedders: choosing your integration depth

cyrs has six plausible consumption layers (CST, AST, HIR, sema, Plan, agent JSON). Which one you should consume depends on what you're building (graph database vs. IDE vs. rewriter vs. parser-bench). docs/integration-depth.md is the decision table + per-layer reference that answers that question before you cargo add anything.


Quickstart

cargo install cyrs-cli
cypher parse demo/samples/good.cyp
cypher fmt   demo/samples/needs_fmt.cyp
cypher check demo/samples/unknown_var.cyp

cyrs-cli ships the cypher binary with parse, check, fmt, plan, explain, and schema-file operations (schema load, schema check, schema diff; see spec 0002). The Rust API is available as the cyrs-lang meta-crate.


Features

  • Lossless CST — every byte preserved, round-trip guaranteed.
  • Recovering parser — editor-grade: one bad token does not cascade.
  • Typed AST — codegen'd from cypher.ungrammar; zero hand-written accessors.
  • Scope graph + name resolution — HIR layer handles WITH, UNWIND, aggregation scopes, and pattern bindings.
  • Schema-aware semantic analysis — schema is a trait (cyrs-schema::SchemaProvider); no hard-coded assumptions.
  • Stable diagnostic codesE0001…, W6000…, N8000…. See spec §10. Codes are SemVer — once assigned, meaning never changes.
  • Idempotent formatterfmt(fmt(x)) == fmt(x), round-trips through the parser.
  • Salsa-backed incremental DBcyrs-db re-computes only the affected queries on every edit.
  • LSP server + JSON agent API — share a single cyrs-lang-services engine layer; zero logic duplication.

Coverage

cyrs is a Cypher front-end first; GQL ISO/IEC 39075:2024 support is a deliberate, in-progress second track. The numbers below are rolling measurements written by the TCK harness (spec §17.5), not aspirations.

Surface Corpus Result Source
openCypher v9 upstream openCypher TCK 2024.3 (220 feature files, 3 897 expanded scenarios) 3 632 / 3 897 accepted (93.2 %) crates/cyrs-tck/tck/full-baseline.md
GQL ISO/IEC 39075:2024 hand-authored bootstrap (7 feature files, 18 expanded scenarios) 2 / 18 accepted (11.1 %) crates/cyrs-tck/tck/gql-iso-39075/baseline.md

Caveat on the openCypher number. "Accepted" means the parser emits zero syntax errors for the scenario's When executing query: step. It is not an end-to-end pass-rate: the front-end does no execution (spec §1.3 N1), and Expected::Error scenarios are still untriaged (Expected::Ignored) — see the baseline file's preamble. Treat 93.2 % as parser acceptance, not semantic conformance.

State of GQL. The GQL track is a parser bootstrap (bead cy-0hj): the corpus pins the GQL-distinct surface so future beads can land parser changes against a stable set of scenarios. The following GQL-only constructs are explicitly not yet implemented:

  • INSERT NODE (GQL insert syntax distinct from Cypher CREATE)
  • FILTER clause
  • REPEATABLE ELEMENTS
  • IS TYPED predicate
  • ANY SHORTEST path selector

If you need GQL parity today, this is not it. If you want a production-track Cypher front-end with a credible path to GQL alignment, read on.


Status

Pre-0.1. The spec is accepted and locked; implementation is in progress. Expect breakage. Do not use in production yet.

Start with the spec: docs/specs/0001-cypher-frontend.md. Twenty-three numbered sections from scope through testing. Before adding features, touching architecture, or filing issues, open the spec and reference section numbers.


Try it

A no-plugin Neovim walkthrough that spins up the language server, publishes diagnostics, and runs format-on-save against real queries:

cargo build --release -p cyrs-lsp
nvim -u demo/nvim/init.lua demo/samples/unclosed_paren.cyp

See demo/README.md for the full tour (samples, format-on-save, CLI comparison) and demo/demo.gif for the recording.

For VS Code / VSCodium, the language client lives at editors/vscode/ — see its README for dev-install instructions (marketplace publishing is a manual maintainer step).


Crates

Crate Purpose
cyrs-syntax Lexer, recovering parser, lossless CST, SyntaxKind
cyrs-ast Typed AST wrappers over the CST
cyrs-hir Lowered HIR, name resolution, scope graph, desugaring
cyrs-sema Semantic analysis + type system
cyrs-schema SchemaProvider trait + supporting types
cyrs-diag Diagnostic type, stable code registry, rendering backends
cyrs-plan Logical read / write plan IR
cyrs-fmt CST-driven formatter
cyrs-db Salsa-based incremental analysis database
cyrs-lang-services Shared completion / hover / rewrite engines
cyrs-lsp Language server binary
cyrs-agent JSON-over-stdio agent API binary
cyrs-cli cypher {parse,check,fmt,explain,plan}
cyrs-tck openCypher TCK harness
cyrs-testkit Shared test fixtures, compiletest runner (dev only)
cyrs-lang Meta-crate re-exporting the library surface

Non-goals

  • No execution engine, runtime, or storage. Consumers own that (spec §1.3 N1, §12.5).
  • No domain concepts. The workspace is deliberately free of application vocabulary — CI greps for it (spec §2.C2).
  • No overlay crate host. Domain extensions live in consumer repositories and plug in via the traits in spec §8 (spec §2.C3).
  • No Neo4jCurrent dialect in v1 — no APOC, no EXISTS {} subqueries, no CALL { ... }, no LOAD CSV, no SHOW, no CYPHER prefixes (spec §9.3).

Tree-sitter grammar

A parallel tree-sitter grammar for Cypher / GQL lives at tree-sitter-cypher/ for editor integrations (Neovim, Helix, GitHub highlighter). The Rust parser in cyrs-syntax is authoritative; the tree-sitter grammar is a hand-maintained artefact kept in lock-step by the cargo xtask tree-sitter-parity gate.

Parity claim: the grammar parses the same TCK v1 surface as the Rust parser — every outcome = "ok" scenario in crates/cyrs-tck/tck/v1.toml parses without (ERROR) nodes, every outcome = "error" scenario produces at least one. Regressions fail CI.

Neovim (nvim-treesitter)

local parser_config = require("nvim-treesitter.parsers").get_parser_configs()
parser_config.cypher = {
  install_info = {
    url = "https://github.com/phall1/cyrs",
    location = "tree-sitter-cypher",
    files = { "src/parser.c" },
    branch = "main",
    generate_requires_npm = true,
    requires_generate_from_grammar = true,
  },
  filetype = "cypher",
}

Then :TSInstall cypher.

Helix (~/.config/helix/languages.toml)

[[language]]
name = "cypher"
scope = "source.cypher"
file-types = ["cyp", "cypher"]
roots = []
comment-token = "//"

[[grammar]]
name = "cypher"
source = { git = "https://github.com/phall1/cyrs", subpath = "tree-sitter-cypher" }

Then hx --grammar fetch && hx --grammar build.

See tree-sitter-cypher/README.md for the full scope list and developer workflow.


Testing

Spec §17 grades testing to the rust-compiler standard:

cargo test --workspace                              # unit + integration + snapshots
cargo insta review                                  # snapshot review
cargo llvm-cov --workspace --html                   # coverage
cargo fuzz run fuzz_parser -- -max_total_time=300   # fuzz (nightly only)
cargo mutants -- -p cyrs-sema                     # mutation testing
cargo bench --workspace                             # criterion benchmarks

Agent context

AGENTS.md is the canonical context an agent reads before working on the front-end. Commits cite the spec section and the corresponding bead ID (cy-{3char}). Beads live at br and track ongoing work.

Stability

Cyrs is pre-1.0; see docs/stability.md for the surface-by-surface stability contract (diagnostic codes, agent wire protocol, schema file format, HIR / Plan IR shape, 1.0 cutover plan). PRs are gated by cargo-semver-checks.


Development

After cloning, install the pre-commit hook so cargo xtask gate runs automatically on every commit:

bash scripts/install-hooks.sh

The gate runs cargo fmt --check, cargo clippy -D warnings, cargo test, and cargo deny check against the workspace.


License

Dual-licensed under either of

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual-licensed as above, without any additional terms or conditions.

About

No description, website, or topics provided.

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Packages

 
 
 

Contributors