Add a rel module implementing a relational algebra layer for Morel, modelled on Apache Calcite's RelNode / RelBuilder API. The module provides a clean intermediate representation for query plans that is independent of Morel's surface syntax.
Motivation
Morel already compiles from/where/yield expressions into an internal step-based representation (core::StepKind) and performs structural optimizations in from_builder.rs. This works, but the representation is tightly coupled to the surface syntax. A proper relational algebra layer would:
- Give us a clean, SQL-agnostic IR that can be optimized independently of the surface syntax.
- Make it straightforward to target Morel from SQL front-ends, or to emit SQL.
- Provide the foundation for a cost-based query optimizer in the future.
- Align Morel's architecture with the well-understood vocabulary of Apache Calcite.
What we're building
Core operators (Rel enum)
A Rel enum whose variants correspond to the standard logical relational operators, mirroring Calcite's RelNode hierarchy:
| Variant |
Calcite equivalent |
Description |
Rel::TableScan |
LogicalTableScan |
Scan a named table from a schema |
Rel::Filter |
LogicalFilter |
Select rows matching a boolean condition |
Rel::Project |
LogicalProject |
Compute a new row from expressions over the input |
Rel::Aggregate |
LogicalAggregate |
Group by key fields and apply aggregate functions |
Rel::Sort |
LogicalSort |
Order rows; also used for LIMIT/OFFSET |
Rel::Join |
LogicalJoin |
Inner, left, right, full, semi, and anti joins |
Rel::Union |
LogicalUnion |
Union of two or more relations (all or distinct) |
Rel::Intersect |
LogicalIntersect |
Intersection |
Rel::Minus |
LogicalMinus |
Set difference |
Rel::Values |
LogicalValues |
Inline constant rows |
Rel is a Rust enum, not a trait object hierarchy. The set of operators is closed, so exhaustive match is more idiomatic than Box<dyn RelNode> and avoids vtable overhead.
RelBuilder
A fluent builder that maintains an internal stack of Rel nodes, mirroring Calcite's RelBuilder:
let mut b = RelBuilder::new(scott_schema());
b.scan(&["scott", "EMP"])
.filter(b.gt(b.field("SAL"), b.literal(Val::Int(1000), int())))
.project(&[b.field("ENAME"), b.field("SAL")])
.sort(&[b.desc(b.field("SAL"))]);
let plan = b.build();
Builder methods apply structural simplifications eagerly (e.g. filter(false) → empty Values, identity projections are eliminated, duplicate sort keys are dropped).
Expressions
Scalar expressions inside Rel nodes reuse compile::core::Expr — Morel's existing typed expression tree. There is no separate RexNode hierarchy. The one new variant needed is Expr::Scalar(Box<Rel>) for scalar subqueries.
Types
Row types reuse compile::types::Type. A relation's row type is Type::Record(false, …) and the relation's own type is Type::Bag(Box<row_type>) (unordered) or Type::List(Box<row_type>) after a Sort. No new SqlType, RowType, or TypeFactory is introduced.
Acceptance criteria
Port Calcite's RelBuilderTest to Rust (in tests/rel_builder_test.rs) and pass ≥ 80 % of the ~230 test methods. The tests cover all core operators, the builder's simplification rules, alias handling, error conditions, and plan-display output.
Out of scope
- Row-level execution (planning layer only)
- Cost-based optimisation / rule engine
- Physical conventions (
EnumerableConvention, etc.)
- Window functions,
MATCH_RECOGNIZE, PIVOT, UNPIVOT
- Table functions, temporal tables, dynamic parameters
- Correlated subqueries (
Correlate node)
- Integration with the Morel compiler (separate issue)
Add a
relmodule implementing a relational algebra layer for Morel, modelled on Apache Calcite'sRelNode/RelBuilderAPI. The module provides a clean intermediate representation for query plans that is independent of Morel's surface syntax.Motivation
Morel already compiles
from/where/yieldexpressions into an internal step-based representation (core::StepKind) and performs structural optimizations infrom_builder.rs. This works, but the representation is tightly coupled to the surface syntax. A proper relational algebra layer would:What we're building
Core operators (
Relenum)A
Relenum whose variants correspond to the standard logical relational operators, mirroring Calcite'sRelNodehierarchy:Rel::TableScanLogicalTableScanRel::FilterLogicalFilterRel::ProjectLogicalProjectRel::AggregateLogicalAggregateRel::SortLogicalSortLIMIT/OFFSETRel::JoinLogicalJoinRel::UnionLogicalUnionRel::IntersectLogicalIntersectRel::MinusLogicalMinusRel::ValuesLogicalValuesRelis a Rust enum, not a trait object hierarchy. The set of operators is closed, so exhaustivematchis more idiomatic thanBox<dyn RelNode>and avoids vtable overhead.RelBuilderA fluent builder that maintains an internal stack of
Relnodes, mirroring Calcite'sRelBuilder:Builder methods apply structural simplifications eagerly (e.g.
filter(false)→ emptyValues, identity projections are eliminated, duplicate sort keys are dropped).Expressions
Scalar expressions inside
Relnodes reusecompile::core::Expr— Morel's existing typed expression tree. There is no separateRexNodehierarchy. The one new variant needed isExpr::Scalar(Box<Rel>)for scalar subqueries.Types
Row types reuse
compile::types::Type. A relation's row type isType::Record(false, …)and the relation's own type isType::Bag(Box<row_type>)(unordered) orType::List(Box<row_type>)after aSort. No newSqlType,RowType, orTypeFactoryis introduced.Acceptance criteria
Port Calcite's
RelBuilderTestto Rust (intests/rel_builder_test.rs) and pass ≥ 80 % of the ~230 test methods. The tests cover all core operators, the builder's simplification rules, alias handling, error conditions, and plan-display output.Out of scope
EnumerableConvention, etc.)MATCH_RECOGNIZE,PIVOT,UNPIVOTCorrelatenode)