Split DSRs runtime into layered crates #87
Open
darinkishore wants to merge 15 commits into
Twelve-crate layer-aligned split of dspy-rs: dsrs-{core, lm, trace, cache,
predict, evaluate, gepa, data, leaven} on top of the existing bamltype /
bamltype-derive / dsrs-macros foundation. No facade.
Key shape:
- dsrs-core owns the abstract bridge traits (DynPredictor, TraceSink,
CacheBackend, LmClient) and the Facet walker — the surface leaven drives.
- dsrs-leaven implements leaven_core::Artifact / leaven_surface::EditSurface /
leaven_engine::Evaluator for DSRs programs directly, replacing the empty
leaven-dsrs stub crate. DSRs is a leaven-compatible target rather than a
third party that needs a bridge owned by leaven.
- GEPA-only optimizer: COPRO and MIPROv2 deleted. dsrs-gepa is a sunset
candidate, dropped once leaven-gepa is a runnable optimizer and dsrs-leaven
ships real impls.
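A minimal sketch of the "abstract bridge traits in core, concrete impls in leaf crates" shape described above. Only the trait names LmClient and CacheBackend come from this PR; the method signatures, EchoLm, MemoryCache, and cached_complete are assumptions made for illustration and are not the actual dsrs-core API.

```rust
use std::collections::HashMap;

// Hypothetical method shapes: the PR names these traits but does not show
// their signatures.
pub trait LmClient {
    fn complete(&self, prompt: &str) -> String;
}

pub trait CacheBackend {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&mut self, key: &str, value: String);
}

// An upper layer written only against the abstract traits. It never names a
// concrete LM or cache type.
pub fn cached_complete(
    lm: &dyn LmClient,
    cache: &mut dyn CacheBackend,
    prompt: &str,
) -> String {
    if let Some(hit) = cache.get(prompt) {
        return hit;
    }
    let out = lm.complete(prompt);
    cache.put(prompt, out.clone());
    out
}

// Stand-ins for what leaf crates (dsrs-lm, dsrs-cache) would provide.
pub struct EchoLm;

impl LmClient for EchoLm {
    fn complete(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

#[derive(Default)]
pub struct MemoryCache(HashMap<String, String>);

impl CacheBackend for MemoryCache {
    fn get(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }

    fn put(&mut self, key: &str, value: String) {
        self.0.insert(key.to_string(), value);
    }
}

fn main() {
    let mut cache = MemoryCache::default();
    let first = cached_complete(&EchoLm, &mut cache, "hi");
    // The second call is served from the cache and returns the same text.
    let second = cached_complete(&EchoLm, &mut cache, "hi");
    assert_eq!(first, second);
    println!("{first}");
}
```

The point of the shape is that a driver such as leaven can depend on the trait surface alone, while the Foyer-backed cache and the real LM client stay swappable.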
Companion HTML view in the same directory.
Used Undermind deep search plus Report Writer instead of a hand-rolled literature scan because the request was specifically for the report/research flow and recent literature coverage. The report distills the search into DSRs implementation requirements: typed IR, trace and blame capture, MIPRO-like offline compilation, Pareto selection, and bounded online adaptation. Export note: the CLI files export path requested citation style plain and the API rejected it, so the markdown was exported through the same API with APA style. Left undone: Kha24 was not uploaded as an extra source; Report Writer cited the search-result anchor. A follow-up can tighten DSPy-specific details if that paper is uploaded.
cargo check --workspace succeeds before moving code. cargo test --workspace --no-run succeeds; this gives the split a green compile baseline. Scaffolding: full runtime test execution and coverage measurement happen in the next snapshots so the crate boundary changes can keep their own evidence.
Adds the nine planned crates as real workspace members: dsrs-core, dsrs-lm, dsrs-trace, dsrs-cache, dsrs-predict, dsrs-evaluate, dsrs-gepa, dsrs-data, and dsrs-leaven. No code moved yet; this isolates workspace topology from the later extraction churn. cargo metadata sees all dsrs-* crates and cargo check --workspace succeeds, including the dsrs-leaven path deps against ../leaven. Scaffolding: every lib.rs is intentionally empty until its extraction task owns the crate surface.
Moves Signature, Module, ModuleExt, SignatureSchema, Predicted, error types, augmentation primitives, raw Example/Prediction, DynPredictor discovery, and LmUsage out of dspy-rs into dsrs-core. Why this shape: LmUsage had to move with Predicted/Prediction/errors because otherwise the old LM crate and new core crate produced distinct usage types at call boundaries. LM client state, adapters, and global settings stay in dspy-rs for the later dsrs-lm extraction so core does not learn concrete LM behavior. Kept temporary internal dspy-rs module aliases only so the workspace remains compiling between extraction steps; final hard cutover still deletes the dspy-rs facade. Verification: cargo check --workspace; cargo test -p dsrs-core; cargo test --workspace --no-run. Scaffolding: dsrs-core owns the discovery trait but Predict itself still lives in dspy-rs until the dsrs-predict extraction, so two real Predict-specific dyn walker tests remain deferred to that crate.
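To illustrate why LmUsage had to move together with Predicted, here is a sketch in which modules stand in for crates. The field names and function names are hypothetical; only the type names LmUsage and Predicted come from this PR. Because a single definition of LmUsage lives in the core layer, the LM layer and the predict layer exchange it without any conversion.

```rust
// Modules stand in for crates. Field names are assumptions for this sketch.
mod dsrs_core {
    #[derive(Debug, Default, Clone, Copy, PartialEq)]
    pub struct LmUsage {
        pub prompt_tokens: u64,
        pub completion_tokens: u64,
    }

    impl std::ops::Add for LmUsage {
        type Output = LmUsage;

        fn add(self, rhs: LmUsage) -> LmUsage {
            LmUsage {
                prompt_tokens: self.prompt_tokens + rhs.prompt_tokens,
                completion_tokens: self.completion_tokens + rhs.completion_tokens,
            }
        }
    }

    pub struct Predicted<T> {
        pub value: T,
        pub usage: LmUsage,
    }
}

mod dsrs_lm {
    use super::dsrs_core::LmUsage;

    // The LM layer reports usage with the core-owned type. Whitespace word
    // counts stand in for real token counts.
    pub fn call(prompt: &str) -> (String, LmUsage) {
        let reply = String::from("ok");
        let usage = LmUsage {
            prompt_tokens: prompt.split_whitespace().count() as u64,
            completion_tokens: 1,
        };
        (reply, usage)
    }
}

mod dsrs_predict {
    use super::dsrs_core::Predicted;

    // No conversion is needed: lm and predict agree on one LmUsage type.
    pub fn run(prompt: &str) -> Predicted<String> {
        let (value, usage) = super::dsrs_lm::call(prompt);
        Predicted { value, usage }
    }
}

fn main() {
    let a = dsrs_predict::run("what is two plus two");
    let b = dsrs_predict::run("and three plus three");
    let total = a.usage + b.usage;
    println!("{} / {} -> {:?}", a.value, b.value, total);
}
```

If each layer declared its own LmUsage, the struct literal in run would fail to type-check even with identical fields, which is the call-boundary mismatch the commit message describes.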
Moves trace context, DAG, executor, and tracked-value conversion into dsrs-trace with a direct dependency on dsrs-core row and prediction types. Why now: trace is a concrete leaf over core data, so extracting it before cache/LM prevents observability code from staying entangled with the monolith while later crates move. The old dspy-rs trace module is temporarily a pass-through to keep intermediate workspace tests compiling; the final split still removes dspy-rs. Verification: cargo check --workspace; cargo test -p dsrs-trace; cargo test --workspace --no-run. Scaffolding: dsrs-trace has no dedicated tests yet because the existing trace coverage is still exercised through dspy-rs examples/tests until dsrs-predict owns call recording.
Moves the Foyer-backed response cache, Cache trait, and CacheEntry into dsrs-cache. The crate now depends on dsrs-core for RawExample and Prediction instead of reaching through dspy-rs. Why this, not waiting for LM: cache is a concrete leaf capability over core row types. Extracting it first lets the later dsrs-lm move depend on cache by name instead of dragging utils along. The old dspy-rs utils::cache module is temporarily a pass-through to keep intermediate imports compiling; final hard cutover removes it with dspy-rs. Verification: cargo check --workspace; cargo test -p dsrs-cache; cargo test --workspace --no-run. Scaffolding: no cache-specific tests moved yet; existing dspy-rs LM cache tests still exercise this code through the temporary pass-through.
Moves core/lm, adapter, and global settings together into dsrs-lm. They move as one unit because Settings stores Arc<dyn Adapter>, Predict needs both ChatAdapter and LM, and the adapter owns typed parse/format behavior over dsrs-core schemas. Also lifts typed Example<S> into dsrs-core so dsrs-lm can format demos without depending on dsrs-predict. That keeps the crate DAG pointed upward: predict depends on lm/core, lm depends on core/cache, not the reverse. The dspy-rs adapter/core modules are temporary pass-throughs while later extractions remove the old crate entirely. Verification: cargo check --workspace; cargo test -p dsrs-lm; cargo test --workspace --no-run. Scaffolding: ChatAdapter docs still mention Predict conceptually, but no code dependency on dsrs-predict is introduced.
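The dependency direction described here can be sketched with modules standing in for crates. Settings holding an Arc<dyn Adapter>, the typed Example<S>, and ChatAdapter are named in this PR; the fields, the format_demo method, and render_prompt are hypothetical and chosen only to show that the lm layer can format demos without importing anything from the predict layer.

```rust
use std::sync::Arc;

mod dsrs_core {
    // Typed demo row owned by the core layer. Field names are assumptions.
    pub struct Example<S> {
        pub input: S,
        pub output: String,
    }
}

mod dsrs_lm {
    use super::dsrs_core::Example;
    use std::sync::Arc;

    // Hypothetical adapter surface; the real trait is richer.
    pub trait Adapter: Send + Sync {
        fn format_demo(&self, demo: &Example<String>) -> String;
    }

    pub struct ChatAdapter;

    impl Adapter for ChatAdapter {
        fn format_demo(&self, demo: &Example<String>) -> String {
            format!("Q: {}\nA: {}", demo.input, demo.output)
        }
    }

    // Settings stores the adapter behind a trait object, as the PR notes.
    pub struct Settings {
        pub adapter: Arc<dyn Adapter>,
    }
}

mod dsrs_predict {
    use super::{dsrs_core::Example, dsrs_lm::Settings};

    // predict depends on lm and core; nothing in lm or core points back here.
    pub fn render_prompt(
        settings: &Settings,
        demos: &[Example<String>],
        question: &str,
    ) -> String {
        let mut out = String::new();
        for demo in demos {
            out.push_str(&settings.adapter.format_demo(demo));
            out.push('\n');
        }
        out.push_str(&format!("Q: {question}\nA:"));
        out
    }
}

fn main() {
    let settings = dsrs_lm::Settings {
        adapter: Arc::new(dsrs_lm::ChatAdapter),
    };
    let demos = vec![dsrs_core::Example {
        input: "2+2".to_string(),
        output: "4".to_string(),
    }];
    println!("{}", dsrs_predict::render_prompt(&settings, &demos, "3+3"));
}
```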
Moves evaluator, MetricOutcome/TypedMetric, feedback metrics, execution traces, and feedback helper functions into dsrs-evaluate. Boundary check: dsrs-evaluate depends only on dsrs-core plus serde/anyhow. The typed Example<S> move from the LM extraction lets evaluation stay independent of dsrs-predict and LM, matching the design's permanent pure metric surface. Fixed the moved doctest import from dspy_rs to dsrs_evaluate and cleared a rebuild-only target/ disk-full blocker before rerunning verification. Verification: cargo check --workspace; cargo test -p dsrs-evaluate; cargo test --workspace --no-run. Scaffolding: dspy-rs::evaluate is a temporary pass-through until the final dspy-rs deletion.
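A sketch of what a pure metric surface can look like when it depends on nothing but plain data types. The names MetricOutcome and TypedMetric come from this PR; their fields, the evaluate signature, ExactMatch, and mean_score are assumptions for illustration and may not match dsrs-evaluate.

```rust
// Hypothetical shapes: only the names MetricOutcome and TypedMetric appear
// in the PR text.
#[derive(Debug, Clone, PartialEq)]
pub struct MetricOutcome {
    pub score: f32,
    pub feedback: Option<String>,
}

pub trait TypedMetric<Expected, Predicted> {
    fn evaluate(&self, expected: &Expected, predicted: &Predicted) -> MetricOutcome;
}

// A metric with no LM or predictor dependency: it only compares two values.
pub struct ExactMatch;

impl TypedMetric<String, String> for ExactMatch {
    fn evaluate(&self, expected: &String, predicted: &String) -> MetricOutcome {
        if expected.trim().eq_ignore_ascii_case(predicted.trim()) {
            MetricOutcome { score: 1.0, feedback: None }
        } else {
            MetricOutcome {
                score: 0.0,
                feedback: Some(format!("expected {expected:?}, got {predicted:?}")),
            }
        }
    }
}

// Averages a metric over (expected, predicted) rows; empty input scores 0.0.
pub fn mean_score<E, P>(metric: &impl TypedMetric<E, P>, rows: &[(E, P)]) -> f32 {
    if rows.is_empty() {
        return 0.0;
    }
    let total: f32 = rows.iter().map(|(e, p)| metric.evaluate(e, p).score).sum();
    total / rows.len() as f32
}

fn main() {
    let rows = vec![
        ("Paris".to_string(), " paris ".to_string()),
        ("Rome".to_string(), "Milan".to_string()),
    ];
    println!("mean score = {}", mean_score(&ExactMatch, &rows));
}
```

Textual feedback travels next to the score so that a feedback-driven optimizer can consume it without the metric knowing which optimizer is running.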
Predict, ChainOfThought, and ReAct now live in dsrs-predict instead of the dspy-rs source tree. The old crate keeps only pass-through module exports for this intermediate checkpoint so the rest of the workspace can keep compiling while tests and examples are redistributed later.
The obvious split hit the macro runtime first: derives inside dsrs-predict could not resolve dspy-rs because the new crate rightly does not depend on the old facade. This commit cuts dsrs-macros over to dsrs-core paths and moves macro support exports into dsrs-core, so generated Signature/Augmentation impls no longer require the aggregator.
Verification:
- cargo check --workspace
- cargo test -p dsrs-predict
- cargo test --workspace --no-run
Scaffolding: dspy-rs still has temporary re-export modules; the final hard cutover deletes that crate after GEPA/data/tests/examples move.
GEPA and the Pareto frontier now live in dsrs-gepa. COPRO and MIPROv2 source files are deleted instead of kept behind compatibility exports, matching the hard-cutover design.
The non-obvious break was predictor discovery: the Facet walker still recognized only the old dspy_rs::predictors::predict::Predict shape. Updating that identity to dsrs_predict::predict makes GEPA usable after the split, and the extracted GEPA test now exercises state restoration through the new crate boundary.
Deleted optimizer-only coverage and examples:
- test_miprov2.rs
- test_optimizer_named_parameters_integration.rs
- test_optimizer_typed_metric.rs
- examples/02-module-iteration-and-updation.rs
- examples/04-optimize-hotpotqa.rs
- examples/08-optimize-mipro.rs
- examples/94-smoke-slice5-optimizer-interface.rs
Verification:
- cargo test -p dsrs-gepa
- cargo check --workspace
- cargo test --workspace --no-run
Scaffolding: dspy-rs still has a temporary optimizer.rs re-export for GEPA while the final tests/examples relocation and crate deletion are pending.
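For readers unfamiliar with the Pareto frontier mentioned in this commit, here is a generic sketch of the textbook definition: a candidate stays on the frontier when no other candidate is at least as good on every objective and strictly better on one. This is not the dsrs-gepa implementation, whose bookkeeping may differ; the function names and the "higher is better" convention are assumptions.

```rust
/// True when `a` is at least as good as `b` on every objective and strictly
/// better on at least one. Higher scores are better in this sketch.
fn dominates(a: &[f64], b: &[f64]) -> bool {
    a.iter().zip(b).all(|(x, y)| x >= y) && a.iter().zip(b).any(|(x, y)| x > y)
}

/// Returns the indices of candidates that no other candidate dominates.
fn pareto_frontier(scores: &[Vec<f64>]) -> Vec<usize> {
    (0..scores.len())
        .filter(|&i| {
            !scores
                .iter()
                .enumerate()
                .any(|(j, other)| j != i && dominates(other, &scores[i]))
        })
        .collect()
}

fn main() {
    // Each row is one candidate program scored on two objectives.
    let scores = vec![
        vec![0.9, 0.2],
        vec![0.5, 0.5],
        vec![0.4, 0.4], // dominated by the candidate above
        vec![0.2, 0.9],
    ];
    println!("{:?}", pareto_frontier(&scores));
}
```

The quadratic scan is fine for the small candidate pools typical of prompt optimization; a larger pool would want a sort-based method.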
DataLoader, typed row mapping, JSONL serialization, and URL detection now live in dsrs-data. The new crate owns the heavy data stack through explicit features so light users do not pay for CSV/Parquet/HuggingFace unless they opt in.
The old dspy-rs data module is reduced to a temporary pass-through, because final test/example relocation still needs a compilable bridge before the aggregator crate is deleted.
Verification:
- cargo check -p dsrs-data --features all
- cargo check --workspace
- cargo test -p dsrs-data --features all
- cargo test --workspace --no-run
Scaffolding: feature gates are in place, but tests still live under dspy-rs until the final hard cutover redistributes them.
Adds dsrs-leaven modules for artifact, change, surface, evaluator/problem, and evidence against the current local leaven crates. The bodies intentionally stay unimplemented; the value here is signature pressure against leaven-core/leaven-surface/leaven-engine/leaven-evidence while the real leaven optimizer path is still pending.
The plan sketch expected older trait shapes. Current leaven has Artifact::ApplyError, EditSurface as a separate capability, and OptimizationProblem as the evaluator binding point, so the scaffold follows those actual APIs instead of stale names.
Verification:
- cargo check -p dsrs-leaven
- cargo check --workspace
Scaffolding: no runtime bridge yet; this is only a compile-time contract for the follow-up leaven implementation.
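The "signature pressure" idea can be sketched with a stand-in trait. The Artifact trait below is NOT the real leaven_core::Artifact: this PR confirms only that an associated ApplyError exists, so every other item (Change, apply, DsrsProgram, InstructionEdit, DsrsApplyError, assert_artifact) is hypothetical. What the sketch shows is that an impl with unimplemented bodies still has to match the trait's signatures to compile.

```rust
// Stand-in trait, not the real leaven API. Only the existence of an
// associated `ApplyError` type is taken from the PR text.
pub trait Artifact: Sized {
    type Change;
    type ApplyError: std::fmt::Debug;

    fn apply(&self, change: &Self::Change) -> Result<Self, Self::ApplyError>;
}

// Hypothetical DSRs-side types used to satisfy the trait.
pub struct DsrsProgram {
    pub instruction: String,
}

pub struct InstructionEdit {
    pub new_instruction: String,
}

#[derive(Debug)]
pub enum DsrsApplyError {
    EmptyInstruction,
}

impl Artifact for DsrsProgram {
    type Change = InstructionEdit;
    type ApplyError = DsrsApplyError;

    fn apply(&self, _change: &Self::Change) -> Result<Self, Self::ApplyError> {
        // Scaffold body: type-checks against the trait, panics if called.
        unimplemented!("runtime bridge lands in a follow-up")
    }
}

// Compile-time contract check: this call only compiles if the impl above
// satisfies every item the trait requires.
fn assert_artifact<A: Artifact>() {}

fn main() {
    assert_artifact::<DsrsProgram>();
    println!("scaffold satisfies the trait contract");
}
```

If the upstream trait renames or reshapes an item, cargo check on the scaffold fails immediately, which is the early warning the commit is after.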
The split crates now own their tests and examples directly, so the old dspy-rs shell is gone instead of staying as a facade. cargo metadata no longer reports dspy-rs, and cargo check --workspace passes against the new crate graph.
Moved the remaining example programs to their owning crates and rewired imports to dsrs-core, dsrs-lm, dsrs-predict, dsrs-evaluate, dsrs-gepa, dsrs-data, and dsrs-trace. The missing telemetry helper now lives in dsrs-trace because examples and runtime tracing need one canonical owner.
Tried leaving dsrs-data with narrow defaults, but cargo check --workspace exposed that the current dataloader module still compiles every loader path together. This snapshot defaults dsrs-data to all loader features so the workspace gate is honest; scaffolding remains to split the dataloader module behind per-format cfgs later.
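The per-format gating that remains to be done could look roughly like the following sketch. The feature names "csv" and "parquet" and the enabled_loaders function are hypothetical, and treating JSONL as the always-on baseline is an assumption; the PR only states that loader formats sit behind explicit Cargo features and that the module currently compiles every path together.

```rust
// Hypothetical feature names. With per-format cfgs, a loader path is compiled
// only when its Cargo feature is enabled.
#[allow(unused_mut)]
pub fn enabled_loaders() -> Vec<&'static str> {
    // Assumed always-on baseline for this sketch.
    let mut loaders = vec!["jsonl"];

    #[cfg(feature = "csv")]
    loaders.push("csv");

    #[cfg(feature = "parquet")]
    loaders.push("parquet");

    loaders
}

fn main() {
    // Built without any features, only the baseline loader is present.
    println!("{:?}", enabled_loaders());
}
```

Once each loader sits behind its own cfg like this, the crate could return to narrow defaults without breaking cargo check --workspace.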
Verification: cargo test -p dsrs-predict --no-run; cargo test -p dsrs-evaluate --no-run; cargo test -p dsrs-data --features all --no-run; cargo test -p dsrs-gepa --no-run; cargo test -p dsrs_macros --no-run; cargo check -p dsrs-{predict,evaluate,gepa,lm,trace} --examples; cargo metadata --format-version 1 --no-deps | rg '"name":"dspy-rs"|dspy-rs' (no matches); cargo check --workspace.
After dissolving dspy-rs, the low-coverage split crates had no durable way to measure progress independently. A monolithic cargo llvm-cov workspace report also proved fragile: branch report generation can segfault in llvm-cov, and full branch instrumentation on this Mac hit disk pressure when run across data/leaven dependencies.
This adds tools/coverage-runtime.mjs. It records line coverage per split crate, supports --package for focused runs, and makes branch coverage explicit via --branch / --strict-branch instead of hiding reporter failures. The harness writes JSON summaries plus target/llvm-cov/runtime-coverage-summary.md.
Coverage evidence from this slice:
- full line run completed: dsrs-cache 43/43 lines, dsrs-trace 170/272 lines, dsrs-leaven 48/81 lines; the prior baseline for those crates was 0-line coverage.
- focused branch run completed for dsrs-core: 22/77 branches, 28.57%.
Test harness improvements:
- cache insertion/history/noop channel tests.
- trace graph/context/value tests.
- leaven scaffold contract and serde payload tests.
- removed obsolete macro compile-fail expectation that DynPredictor is private; dsrs-core now deliberately exports it.
Hard cutover cleanup:
- bamltype-derive no longer falls back to dspy-rs macro support.
- active README/Mintlify docs now teach split crate imports.
- COPRO/MIPROv2 docs are deleted from active navigation because those optimizers were deleted with the split.
Verification:
- cargo fmt
- node --check tools/coverage-runtime.mjs
- cargo test -p dsrs-cache
- cargo test -p dsrs-trace
- cargo test -p dsrs-leaven
- cargo test -p bamltype -p bamltype-derive -p dsrs_macros --test test_public_api_compile_fail --test test_bamltype_attr_contract --test test_field_macro
- tools/coverage-runtime.mjs
- tools/coverage-runtime.mjs --branch --package dsrs-core
- cargo test --workspace --no-run
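As a quick arithmetic check on the coverage figures quoted in this commit, the sketch below recomputes each percentage from its covered/total counts. It is not part of tools/coverage-runtime.mjs; the function name, the two-decimal rounding, and the choice to report 0.0 for an empty total are assumptions.

```rust
// Percentage rounded to two decimals. An empty total is reported as 0.0,
// which is a choice made for this sketch.
fn coverage_percent(covered: u32, total: u32) -> f64 {
    if total == 0 {
        return 0.0;
    }
    let pct = f64::from(covered) / f64::from(total) * 100.0;
    (pct * 100.0).round() / 100.0
}

fn main() {
    // Counts are taken from the coverage evidence in this commit message.
    let rows = [
        ("dsrs-cache lines", 43, 43),
        ("dsrs-trace lines", 170, 272),
        ("dsrs-leaven lines", 48, 81),
        ("dsrs-core branches", 22, 77),
    ];
    for (name, covered, total) in rows {
        println!("{name}: {covered}/{total} = {:.2}%", coverage_percent(covered, total));
    }
}
```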
Summary
- Split the dspy-rs runtime into layer-aligned crates: dsrs-core, dsrs-trace, dsrs-cache, dsrs-lm, dsrs-evaluate, dsrs-predict, dsrs-gepa, dsrs-data, and dsrs-leaven
- Delete the dspy-rs facade, remove COPRO/MIPROv2 active surfaces, and update active README/Mintlify docs to the new crate layout
- Add tools/coverage-runtime.mjs for per-crate line coverage plus opt-in branch coverage, and add targeted tests for cache, trace, and leaven scaffolding
Verification
- cargo fmt
- node --check tools/coverage-runtime.mjs
- cargo test --workspace --no-run
- cargo test --workspace
- cargo build --release --workspace
- tools/coverage-runtime.mjs
- tools/coverage-runtime.mjs --branch --package dsrs-core
Coverage Notes
- dsrs-cache: 43/43 lines, 100%
- dsrs-trace: 170/272 lines, 62.50%
- dsrs-leaven: 48/81 lines, 59.26%
- dsrs-core: 22/77 branches, 28.57%
Companion Leaven Cleanup
The DSRs dsrs-leaven crate depends on the sibling live Leaven checkout. I also created a local Leaven commit, uqyunwtl fa37a15d chore: drop leaven-dsrs bridge crate, to remove the stale Leaven-owned bridge skeleton; that is not part of this PR because it lives in the sibling repo.