diff --git a/AGENTS.md b/AGENTS.md index ac85d79ea..80645ade8 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,113 +1,84 @@ # Repository Guidelines -## Project Structure & Module Organization - -This is a Rust workspace with JS/WASM packaging around the core CRDT library. -Key crates live under `crates/`: `loro` is the public Rust API, `loro-internal` -contains core CRDT logic, `loro-wasm` exposes the WASM/TypeScript package, and -`delta`, `rle`, `kv-store`, and `fractional_index` hold shared primitives. -Integration and regression tests are mostly in `crates/loro/tests` and -`crates/loro-internal/tests`; WASM tests and package files are in -`crates/loro-wasm`. Examples live in `examples/` and `crates/examples`. - -## Build, Test, and Development Commands - -- `cargo build`: build the Rust workspace. -- `cargo check -p loro-internal`: quickly validate core internals. -- `cargo test -p loro-internal --doc`: run Rust doctests for internal APIs. -- `pnpm test`: run the main Rust test suite via nextest plus doctests. -- `pnpm check`: run clippy with all features and deny warnings. -- `pnpm release-wasm`: sync versions and build the release WASM package. -- `pnpm test-loom`: run loom concurrency tests for `crates/loro/tests/multi_thread_test.rs`. - -## Coding Style & Naming Conventions - -Use standard Rust formatting with `rustfmt`; keep imports and chained calls formatted -by the tool. Prefer explicit, small APIs and existing crate-local helpers over new -abstractions. Rust items use `snake_case` for functions/modules and `CamelCase` for -types. JS/TS bindings in `loro-wasm` should preserve the established exported API -names used by tests and docs. - -## Testing Guidelines - -Add regression tests near the behavior being fixed: Rust API tests in -`crates/loro/tests`, internal tests in `crates/loro-internal/tests` or module tests, -and WASM behavior in `crates/loro-wasm/tests`. For import/encoding bugs, prefer -fixture-based tests with small binary fixtures. 
Run the narrow package test first, -then `pnpm test` when the change affects shared behavior. For changes touching -internal diff calculation, checkout, import, or state-replay logic, also consider -the fuzz targets in `crates/fuzz`; ask whether to run the broader `fuzz all` -target before spending the extra time. - -## Commit & Pull Request Guidelines - -History uses short imperative commits, often prefixed by scope such as `fix:`, -`test:`, `chore:`, or `refactor:`. Keep commits focused and include fixtures or -tests with fixes. PRs should describe what changed, why, validation commands, and -linked issues or production traces when relevant. Add a changeset when publishing -behavior or package output changes. - -## Agent-Specific Notes - -### Principle: Avoid Breaking Changes Unless Absolutely Necessary - -The `loro` crate is a public library with downstream users. When fixing panics or bugs, -prefer non-breaking solutions: - -- Add `try_*` methods that return `Option` or `Result` instead of changing existing - method signatures. -- Replace `assert!` / `unwrap()` / `unreachable!()` with descriptive `expect()` messages - when the method must remain panicking for backward compatibility. -- Only introduce breaking signature changes (e.g., changing a return type from `T` to - `Option`) when there is no safe backward-compatible alternative and the breakage - is justified by a critical correctness or safety issue. - -### Principle: Internal Invariant Preservation Over Graceful Degradation - -When an internal invariant is violated (e.g., a state lookup that should always succeed -returns `None`, an event batch has an unexpected structure, or a diff cannot be composed), -the priority is: - -1. **Do not let the system continue in a corrupted or inconsistent state.** - Prefer `panic!` / `unwrap()` / `expect()` over silently skipping, returning a default, - or returning success when the internal state is known to be wrong. -2. 
**Preserve the correctness of public API contracts.** - A public method should not return a value that violates its documented contract - (e.g., returning an empty list when nodes actually exist). -3. **Avoid panics on valid user input.** - Malformed external input (decode errors, invalid JSON schema, out-of-bounds indices) - should return `Err`. But do not replace internal-safety panics with silent skips - just to avoid crashing. - -In short: internal corruption → fail-fast (panic); invalid user input → `Result::Err`; -returning wrong data is worse than panicking. - -### Invariant: Flush Pending Events In `loro-wasm` - -In `crates/loro-wasm/src/lib.rs`, subscription callbacks (`subscribe*`, -container `subscribe`, etc.) do not call user JS immediately. The binding -enqueues JS calls into a global pending queue and schedules a microtask check. -If the microtask runs before `callPendingEvents()` flushes the queue, it logs: - -- `[LORO_INTERNAL_ERROR] Event not called` - -Any WASM-exposed API that can enqueue subscription events must flush pending -events before returning control to JS. To avoid adding overhead to every op, only -a small JS-side allowlist is wrapped; the wrapper calls `callPendingEvents()` in -a `finally` block. - -When adding or changing a `#[wasm_bindgen]` API in `crates/loro-wasm/src/lib.rs` -that can mutate document state, check whether it can trigger an implicit commit -or barrier (`commit`, `with_barrier`, `implicit_commit_then_stop`), emit events -(`emit_events`), or apply diffs (`revertTo`, `applyDiff`). If so, add its JS -name to the allowlist near the bottom of `crates/loro-wasm/index.ts`: -`decorateMethods(LoroDoc.prototype, [...])` or the relevant prototype allowlist. -Pure read/query APIs should not be decorated. - -Quick check with active subscriptions (`doc.subscribe(...)` or container -`subscribe(...)`): mutating APIs should not produce the error above. 
A useful -local check is: - -```sh -pnpm -C crates/loro-wasm build-release -``` +## Project Snapshot + +Loro is a Rust CRDT workspace with JS/WASM packaging and a MoonBit codec. + +- `crates/loro`: public Rust API; avoid breaking downstream users. +- `crates/loro-internal`: core CRDT logic. Read its + [AGENTS.md](crates/loro-internal/AGENTS.md) before changing import/export, + encoding, state, diff, checkout, or replay behavior. +- `crates/loro-wasm`: `loro-crdt` WASM/TypeScript package. Read its + [AGENTS.md](crates/loro-wasm/AGENTS.md) before changing bindings, exports, + wrappers, or build scripts. +- `crates/delta`, `crates/rle`, `crates/kv-store`, `crates/fractional_index`, + `crates/loro-common`, and `packages/fractional-index`: shared primitives and + packages. +- `moon/`: MoonBit Loro binary codec; use [skills/moonbit/SKILL.md](skills/moonbit/SKILL.md). + +## Context Index + +- Encoding/import/export modes, current vs outdated formats, shallow snapshots: + [context/internal-encoding.md](context/internal-encoding.md). +- Mergeable container model, marker/cid rules, tests, and common pitfalls: + [context/mergeable-containers.md](context/mergeable-containers.md). +- User-facing Loro usage, sync, editor integration, and performance guidance: + [skills/loro/SKILL.md](skills/loro/SKILL.md). +- Context backlog: [context/CONTEXT-GAPS.md](context/CONTEXT-GAPS.md). + +## Commands + +- JS deps: `pnpm install --frozen-lockfile`. +- Rust build/check/format/lint: `cargo build`, `cargo check -p loro-internal`, + `cargo fmt --all`, `pnpm check`. +- Rust tests: `pnpm test`; internal doctests: `cargo test -p loro-internal --doc`. +- Loom: `pnpm test-loom`. +- WASM: `pnpm release-wasm`, or `pnpm -C crates/loro-wasm build-dev`. +- Bundlers after WASM packaging changes: `pnpm test-bundlers`; browser runtime: + `pnpm --dir examples/bundler-smoke-tests run test:browser`. +- Fractional-index TS: `pnpm test-fractional-index`. +- Fuzz smoke: `pnpm run-fuzz-corpus`. 
+- MoonBit codec, when `moon` is available: from `moon/`, run `moon check`, + `moon test`, `moon fmt`. + +Use narrow checks first. Ask before broad fuzzing or long browser matrices. + +## Working Rules + +- Start with `git status --short --branch`; treat uncommitted changes as user + work unless you made them. +- Before editing, read every `AGENTS.md` from root to target directory. Keep + `CLAUDE.md` as a symlink to the nearest `AGENTS.md`. +- Use `rg` / `rg --files` for search. +- Public API changes in `loro` or `loro-crdt` should be backward-compatible when + possible. Prefer new `try_*` APIs over breaking signatures. +- Internal corruption should fail fast; invalid external input should return + `Err`. Returning wrong state is worse than panicking on an impossible internal + invariant. +- Add regression tests near behavior: `crates/loro/tests`, + `crates/loro-internal/tests`, module tests, `crates/loro-wasm/tests`, or + `moon/loro_codec/*_test.mbt`. +- Add a changeset for publishing behavior or package output changes. +- Do not hand-edit generated WASM package output; regenerate it with package + scripts. + +## Self-Maintained Agent Context + +- Treat "why was that hard to find?" as a context bug. Add a nearby + `AGENTS.md` pointer or a `context/` article, or append a line to + [context/CONTEXT-GAPS.md](context/CONTEXT-GAPS.md). +- Keep root context short. If an `AGENTS.md` grows past about 4000 characters, move + detail into a linked `context/` article. +- Header context articles with `Verified against code YYYY-MM-DD`, anchor claims + to files/symbols, and link them from root plus the nearest per-directory + `AGENTS.md`. +- If code changes make an `AGENTS.md` or context article stale, update the docs + in the same change. +- When a commit needs non-obvious rationale, land that rationale in the nearest + context file and keep the commit message as a pointer. 
+ +## Commit And PR Notes + +History uses short imperative commits, often prefixed by `fix:`, `test:`, +`chore:`, or `refactor:`. PRs should include summary, rationale, validation, and +linked issues or traces when relevant. diff --git a/CLAUDE.md b/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/context/CONTEXT-GAPS.md b/context/CONTEXT-GAPS.md new file mode 100644 index 000000000..9b92d88f0 --- /dev/null +++ b/context/CONTEXT-GAPS.md @@ -0,0 +1,7 @@ +# Context Discoverability Gaps (backlog) + +Append a line when you discovered something important the hard way but could not +fix the docs in that change. + +Format: +`YYYY-MM-DD | | | why it was hard | suggested home` diff --git a/context/internal-encoding.md b/context/internal-encoding.md new file mode 100644 index 000000000..65ecb0975 --- /dev/null +++ b/context/internal-encoding.md @@ -0,0 +1,142 @@ +# Internal Encoding Context + +Verified against code 2026-06-16. + +Loro has one binary blob envelope, two current binary body formats, two +recognized-but-unsupported legacy top-level modes, and a separate JSON updates +schema. The most common mistake is to treat `outdated_encode_reordered.rs` as an +obsolete file; only top-level blob modes 1 and 2 are obsolete. Several helpers in +that file are still used by current fast paths. + +## Two-Hop Answer + +If an agent asks "how does Loro encoding work?", start here: + +- [crates/loro-internal/src/encoding.rs](../crates/loro-internal/src/encoding.rs): + `ExportMode`, `EncodeMode`, `parse_header_and_body`, `encode_with`, + `decode_oplog_changes`, `decode_snapshot`, `decode_import_blob_meta`. +- [crates/loro-internal/src/loro.rs](../crates/loro-internal/src/loro.rs): + `LoroDoc::_import_with` chooses snapshot-vs-updates application behavior. 
+- [crates/loro-internal/src/encoding/fast_snapshot.rs](../crates/loro-internal/src/encoding/fast_snapshot.rs): + `Snapshot`, `encode_snapshot_inner`, `decode_snapshot_inner`, `encode_updates`, + `decode_updates`. +- [crates/loro-internal/src/encoding/shallow_snapshot.rs](../crates/loro-internal/src/encoding/shallow_snapshot.rs): + `export_shallow_snapshot_inner`, `export_state_only_snapshot`, + `encode_snapshot_at`. +- [crates/loro-internal/src/encoding/json_schema.rs](../crates/loro-internal/src/encoding/json_schema.rs): + `JsonSchema`, `export_json`, `decode_changes`, `redact`. +- [docs/encoding.md](../docs/encoding.md) and + [docs/encoding-container-states.md](../docs/encoding-container-states.md): + external binary format references. Verify against code before changing them. + +## Binary Envelope + +Every binary export starts with: + +- magic bytes `loro` from `encoding.rs:MAGIC_BYTES`, +- a 16-byte checksum field, +- a big-endian `u16` `EncodeMode`, +- mode-specific body bytes. + +For current `FastSnapshot` and `FastUpdates` blobs, `ParsedHeaderAndBody::check_checksum` +uses `xxhash32` over bytes starting at offset 20, which includes the mode bytes +and body. Legacy modes use the older MD5 check path only for detection. + +## Supported And Outdated Modes + +Current modes: + +- `EncodeMode::FastSnapshot = 3`: used by `ExportMode::Snapshot`, + `ShallowSnapshot`, `StateOnly`, and `SnapshotAt`. +- `EncodeMode::FastUpdates = 4`: used by `ExportMode::Updates` and + `UpdatesInRange`. + +Recognized but unsupported top-level modes: + +- `EncodeMode::OutdatedRle = 1` +- `EncodeMode::OutdatedSnapshot = 2` + +`encoding.rs:decode_oplog_changes`, `encoding.rs:decode_snapshot`, and +`LoroDoc::decode_import_blob_meta` return `ImportUnsupportedEncodingMode` for +these outdated top-level modes. Do not extend them without compatibility +fixtures and a migration plan. 
+ +Important nuance: [outdated_encode_reordered.rs](../crates/loro-internal/src/encoding/outdated_encode_reordered.rs) +still contains current helpers including `import_changes_to_oplog`, `encode_op`, +`decode_op`, and `ValueRegister`. + +## FastSnapshot + +`fast_snapshot.rs:Snapshot` has three body sections: + +1. `oplog_bytes`: KV-store encoded change history. +2. `state_bytes`: KV-store encoded materialized state, or `EMPTY_MARK` when + omitted and state must be recalculated. +3. `shallow_root_state_bytes`: KV-store encoded shallow root state; empty for a + non-shallow snapshot. + +`decode_snapshot_inner` only initializes directly when importing into an empty +document. If a snapshot is imported into a non-empty document, +`LoroDoc::_import_with` routes through decoded oplog changes instead. Failed +direct snapshot import must reset both state and oplog. + +## FastUpdates + +`FastUpdates` is a sequence of LEB128 length-prefixed change blocks. +`fast_snapshot.rs:decode_updates` rejects invalid block lengths, length +overflow, and truncated block payloads, then sorts decoded changes by lamport. +`encoding.rs:apply_decoded_changes_to_oplog` imports changes, separates pending +changes, applies newly-unlocked pending changes, and rejects dependencies before +a shallow root. + +## Shallow, State-Only, And SnapshotAt + +All three use `FastSnapshot` mode: + +- `ShallowSnapshot` retains history since a calculated shallow start frontier. +- `StateOnly` is a shallow snapshot with minimal history at the target version. +- `SnapshotAt` exports full history up to target frontiers plus state at that + version. + +`shallow_snapshot.rs` temporarily checks out versions and must restore the +document's original state and attached/detached status. It must not split rich +text style start/end ops across the shallow root. Unknown container types block +shallow/state snapshot export through `LoroEncodeError::UnknownContainer`. 
+ +Pre-shallow frontier safety lives in `loro.rs`: `checkout`, `diff`, and +`revert_to` must return `SwitchToVersionBeforeShallowRoot` instead of traversing +history before the shallow root. + +## JSON Updates + +`json_schema.rs` is not wrapped in the binary `loro` envelope. Its +`JsonSchema` carries: + +- `schema_version = 1`, +- `start_version`, +- optional peer compression table, +- JSON changes and ops. + +Malformed JSON schema should return `Err` without partial import. Look at +[crates/loro-internal/src/tests/import_atomicity.rs](../crates/loro-internal/src/tests/import_atomicity.rs) +when changing JSON import validation or rollback behavior. + +## Validation Shortcuts + +- Binary malformed input or rollback: `cargo test -p loro-internal import_atomicity` +- Truncated fast updates: `cargo test -p loro-internal decode_updates_rejects_truncated_block` +- Pre-shallow checkout/diff/revert behavior: + `cargo test -p loro --test issue issue_928` and + `cargo test -p loro --test contracts shallow` +- Snapshot retention that might involve mergeable containers: + `cargo test -p loro-internal --test mergeable_container` +- Shared behavior: root `pnpm test` + +## Common Misconceptions + +- "Outdated modes are still supported because `LoroDoc::_import_with` branches on + them." They are detected, then route to decode paths that return unsupported. +- "`outdated_encode_reordered.rs` is dead." It is legacy-named but still contains + active op/value helpers. +- "Snapshot import always initializes state directly." Only empty docs can reset + from snapshot; non-empty imports use oplog-change application. diff --git a/context/mergeable-containers.md b/context/mergeable-containers.md new file mode 100644 index 000000000..f98f6b69b --- /dev/null +++ b/context/mergeable-containers.md @@ -0,0 +1,121 @@ +# Mergeable Container Context + +Verified against code 2026-06-16. 
+ +Mergeable containers let two peers independently create the same child container +under a map key and converge to one deterministic container id. The source of +truth for visibility is a binary marker in the parent map slot, not whether the +child already has direct operations. + +## Two-Hop Answer + +If an agent asks "how do mergeable containers work?", start here: + +- [crates/loro-common/src/lib.rs](../crates/loro-common/src/lib.rs): + `MERGEABLE_NAMESPACE_PREFIX`, `ContainerID::new_mergeable`, + `ContainerID::parse_mergeable`, `mergeable_marker`, + `parse_mergeable_marker`, `translate_mergeable_marker_value`. +- [crates/loro-internal/src/handler.rs](../crates/loro-internal/src/handler.rs): + `MapHandler::ensure_mergeable_container` and public + `ensure_mergeable_*` helpers. +- [crates/loro-internal/src/state/mergeable.rs](../crates/loro-internal/src/state/mergeable.rs): + logical child edge resolution from deterministic cid plus parent marker. +- [crates/loro-internal/src/state/map_state.rs](../crates/loro-internal/src/state/map_state.rs) + and [crates/loro-internal/src/txn.rs](../crates/loro-internal/src/txn.rs): + marker-to-container translation at read, diff, and event boundaries. +- [crates/loro-internal/docs/mergeable-container-id.md](../crates/loro-internal/docs/mergeable-container-id.md): + current mergeable cid encoding. +- [crates/loro-internal/tests/mergeable_container/](../crates/loro-internal/tests/mergeable_container/) + and [crates/loro-internal/tests/mergeable_cid_encoding.rs](../crates/loro-internal/tests/mergeable_cid_encoding.rs): + regression coverage. + +## Model + +`MapHandler::ensure_mergeable_*(key)` does two things: + +1. Derives a deterministic `ContainerID::Root` with + `ContainerID::new_mergeable(parent, key, kind)`. +2. Writes `mergeable_marker(parent, key, kind)` into the parent map slot. + +The deterministic cid uses the reserved `🤝:` namespace. Its payload encodes the +nearest non-mergeable map ancestor and escaped key path. 
The child kind is stored +in `ContainerID::Root.container_type`, not duplicated in the root-name payload. + +The marker is compact binary storage: + +- magic bytes from `MERGEABLE_MARKER_MAGIC`, +- one byte for container kind, +- a 24-bit digest bound to `(parent, key, kind)`. + +Copying marker bytes to another key or parent does not activate a mergeable child +there. + +## Visibility And Conflicts + +The parent map's current value decides visibility: + +- no marker: child is hidden, though state may still exist at its deterministic cid; +- same-kind marker: child is active and read surfaces translate it to + `LoroValue::Container`; +- different-kind marker: parent map last-writer-wins (LWW) resolution picks the visible kind. + +Concurrent same-kind creation writes identical markers and merges into the same +child. Concurrent different-kind creation writes different markers; regular map +LWW chooses one visible kind. Losing-kind state must remain addressable by +deterministic cid and can resurface if a later `ensure_mergeable_*` call +rewrites the marker. + +## Boundaries + +- User strings, arbitrary binary values, scalars, and regular child containers + are not mergeable markers. `ensure_mergeable_*` must return `ArgErr` rather + than overwrite them. +- Repeating same-kind `ensure_mergeable_*` over the same marker is idempotent and + should not emit another op. +- Calling a different-kind `ensure_mergeable_*` over an existing mergeable marker + is a deliberate local kind change. +- Deleting the map key clears the marker and hides the child; re-ensuring writes + a new marker and resurfaces preserved state. +- Detached map handlers cannot ensure mergeable children, because the + deterministic child cid depends on the attached parent cid. + +## Snapshot And Retention Rules + +Snapshot and shallow snapshot alive-container walks must preserve mergeable child +state even when that child is hidden by a different winning marker. 
This is +covered by `tests/mergeable_container/snapshot.rs`, including shallow snapshot +tests for losing-kind state. + +Raw marker bytes are the wire/storage representation. Public read and diff +surfaces should translate an active marker to a container value. APIs that expose +raw/shallow storage may still show the binary marker for forward compatibility. + +## Tests By Question + +- Deterministic cid and malformed parser cases: + `cargo test -p loro-internal --test mergeable_cid_encoding` +- Marker layout, idempotency, kind changes, and non-mergeable occupant guards: + `cargo test -p loro-internal --test mergeable_container discriminator` +- Same-kind convergence and nested chains: + `cargo test -p loro-internal --test mergeable_container convergence` +- Delete/hide/reactivate behavior: + `cargo test -p loro-internal --test mergeable_container delete` +- Different-kind conflicts: + `cargo test -p loro-internal --test mergeable_container type_conflict` +- Snapshot and shallow snapshot retention: + `cargo test -p loro-internal --test mergeable_container snapshot` +- Pending import ordering: + `cargo test -p loro-internal --test mergeable_container pending` +- Events and paths: + `cargo test -p loro-internal --test mergeable_container events_and_paths` + +## Common Misconceptions + +- "A mergeable child is visible once it has ops." False; visibility is controlled + by the parent marker. +- "Deleting the key deletes the child state." False; it hides the child by + removing the marker. +- "Kind conflict discards the loser." False; the loser is hidden but should stay + recoverable by deterministic cid. +- "The marker is the child cid." False; the marker activates a kind at a + `(parent, key)`, while the cid is derived independently. 
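
Review note on `context/mergeable-containers.md`: the visibility rules in that file can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Loro's implementation. `ToyParent`, `Slot`, `Kind`, and every method below are hypothetical names invented for illustration; the `(key, kind)` tuple merely stands in for the deterministic cid, and CRDT merging, LWW between peers, and the marker digest are all left out.

```rust
use std::collections::HashMap;

/// Hypothetical container kinds for this toy model.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Kind {
    Map,
    List,
    Text,
}

/// What a parent map slot can hold in this toy model.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Slot {
    /// Stand-in for the binary mergeable marker.
    Marker(Kind),
    /// Any non-mergeable occupant (string, scalar, regular child, ...).
    Other(String),
}

#[derive(Default)]
pub struct ToyParent {
    slots: HashMap<String, Slot>,
    /// Child state keyed by (key, kind); it survives marker changes and deletes.
    children: HashMap<(String, Kind), Vec<String>>,
}

impl ToyParent {
    /// Returns Ok(true) when a marker is written, Ok(false) for an idempotent repeat,
    /// and Err when a non-mergeable value occupies the slot.
    pub fn ensure_mergeable(&mut self, key: &str, kind: Kind) -> Result<bool, String> {
        match self.slots.get(key).cloned() {
            Some(Slot::Other(_)) => Err(format!("slot {:?} holds a non-mergeable value", key)),
            Some(Slot::Marker(current)) if current == kind => Ok(false),
            _ => {
                // Missing slot or different-kind marker: write (or rewrite) the marker.
                self.slots.insert(key.to_string(), Slot::Marker(kind));
                self.children.entry((key.to_string(), kind)).or_default();
                Ok(true)
            }
        }
    }

    /// Appends to child state regardless of whether the child is currently visible.
    pub fn push(&mut self, key: &str, kind: Kind, item: &str) {
        self.children
            .entry((key.to_string(), kind))
            .or_default()
            .push(item.to_string());
    }

    pub fn set_value(&mut self, key: &str, value: &str) {
        self.slots.insert(key.to_string(), Slot::Other(value.to_string()));
    }

    /// Clears the marker only; child state is kept.
    pub fn delete_key(&mut self, key: &str) {
        self.slots.remove(key);
    }

    /// The marker, not the presence of child state, decides what is visible.
    pub fn visible(&self, key: &str) -> Option<(Kind, &[String])> {
        match self.slots.get(key)? {
            Slot::Marker(kind) => {
                let state = self.children.get(&(key.to_string(), *kind))?;
                Some((*kind, state.as_slice()))
            }
            Slot::Other(_) => None,
        }
    }
}
```

In this model, switching a slot from `List` to `Text` hides the list state without discarding it, and a later `ensure_mergeable(key, Kind::List)` makes the same items visible again, which mirrors the "hidden but recoverable" rule the notes describe.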
diff --git a/crates/loro-internal/AGENTS.md b/crates/loro-internal/AGENTS.md new file mode 100644 index 000000000..a4c7af4f0 --- /dev/null +++ b/crates/loro-internal/AGENTS.md @@ -0,0 +1,61 @@ +# loro-internal Guidelines + +This crate contains Loro's unstable internal CRDT implementation. Public API +compatibility concerns still matter because `crates/loro` and `crates/loro-wasm` +wrap this crate directly, but the internal priority is preserving invariants +over graceful degradation. + +## Internal Map + +- `src/loro.rs`: document-level orchestration for commit, import/export, + checkout, barriers, state/oplog coordination, and event emission. +- `src/encoding.rs`: public/internal `ExportMode`, binary header parsing, + checksum verification, `EncodeMode` dispatch, import metadata, and the bridge + from decoded changes into `OpLog`. +- `src/encoding/`: concrete binary and JSON encoding implementations. Read + `src/encoding/AGENTS.md` and + [../../context/internal-encoding.md](../../context/internal-encoding.md) + before changing binary layout, JSON schema, import metadata, shallow snapshot, + or op/value encoding. +- `src/oplog/` and `src/dag/`: change storage, dependency ordering, pending + changes, version vectors/frontiers, shallow roots, and history traversal. +- `src/state.rs` and `src/state/`: materialized document state, container stores, + diff application, checkout/replay, deep value, dead-container tracking, and + mergeable container visibility. Read `src/state/AGENTS.md` and + [../../context/mergeable-containers.md](../../context/mergeable-containers.md) + before changing mergeable containers. +- `src/handler.rs`: typed container handlers, local operation creation, and + `MapHandler::ensure_mergeable_*`. +- `src/diff_calc/`: diff calculation when moving between versions. +- `docs/diff_calc.md`: design notes for diff calculation. +- `docs/mergeable-container-id.md`: current mergeable container id encoding. 
+- `tests/mergeable_container/` and `tests/mergeable_cid_encoding.rs`: focused + mergeable container regression tests. +- `src/tests/import_atomicity.rs`: import rollback and malformed-input + regressions. + +## Commands + +Use narrow checks first: + +- `cargo check -p loro-internal` +- `cargo test -p loro-internal --doc` +- `cargo test -p loro-internal --test mergeable_container` +- `cargo test -p loro-internal --test mergeable_cid_encoding` +- `cargo test -p loro-internal import_atomicity` + +For broad shared behavior, run the root commands from `AGENTS.md`. For changes +to import, checkout, encoding, state replay, or diff calculation, consider fuzz +coverage under `crates/fuzz` and ask before running long fuzz targets. + +## Working Rules + +- Internal invariant violation should fail fast. Invalid external bytes or JSON + should return `Err`. +- Do not silently skip ops, containers, state entries, diffs, or pending changes. +- Snapshot/import paths must be atomic: if decode or state application fails, + rollback must leave the document usable. +- Preserve attached/detached document state when export paths temporarily + checkout another version. +- If a change affects `crates/loro` or `crates/loro-wasm` behavior, add or update + tests at the wrapper layer as well as the internal layer when practical. diff --git a/crates/loro-internal/CLAUDE.md b/crates/loro-internal/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/crates/loro-internal/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/crates/loro-internal/src/encoding/AGENTS.md b/crates/loro-internal/src/encoding/AGENTS.md new file mode 100644 index 000000000..5a50ce70d --- /dev/null +++ b/crates/loro-internal/src/encoding/AGENTS.md @@ -0,0 +1,38 @@ +# Encoding Guidelines + +This module owns Loro import/export formats. 
Read +[../../../../context/internal-encoding.md](../../../../context/internal-encoding.md) +for the verified map of supported modes, outdated modes, shallow snapshots, JSON +schema, and validation entry points. + +## Local Entry Points + +- `../encoding.rs`: `ExportMode`, `EncodeMode`, 22-byte `loro` header, checksum + validation, top-level dispatch, and `decode_import_blob_meta`. +- `fast_snapshot.rs`: current `FastSnapshot` and `FastUpdates` body layouts. +- `shallow_snapshot.rs`: `ShallowSnapshot`, `StateOnly`, and `SnapshotAt`. +- `json_schema.rs`: JSON updates, peer compression, validation, import/export, + and redaction. +- `outdated_encode_reordered.rs`: legacy-named op/value columnar helpers still + used by current fast paths; do not confuse this with unsupported top-level + outdated blob modes. +- `value.rs`, `value_register.rs`, `arena.rs`: op/value encoding support. + +## Rules + +- Current binary modes are `FastSnapshot = 3` and `FastUpdates = 4`. +- Top-level `OutdatedRle = 1` and `OutdatedSnapshot = 2` are compatibility + detections, not formats to extend. +- Malformed bytes or JSON schema should return `Err`, not partially import. +- Snapshot import/export must preserve rollback and attached/detached state + invariants. +- Unknown container types must block shallow/state snapshot export rather than + producing a blob that cannot be decoded correctly. + +## Validation + +- `cargo test -p loro-internal import_atomicity` +- `cargo test -p loro-internal decode_updates_rejects_truncated_block` +- `cargo test -p loro-internal --test mergeable_container` when snapshot changes + can affect mergeable child retention. +- Root `pnpm test` for shared import/export semantic changes. 
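
Review note on the encoding guidelines: the 22-byte header they mention (4 magic bytes, a 16-byte checksum field, a big-endian `u16` mode) can be sketched as a standalone parser. This is a minimal illustration, assuming only the layout and mode numbers stated in these notes. `Mode`, `EnvelopeError`, and `split_envelope` are hypothetical names, not the functions in `encoding.rs`, and the checksum is deliberately not verified here.

```rust
/// Top-level modes, with the numeric values given in the encoding notes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    OutdatedRle = 1,
    OutdatedSnapshot = 2,
    FastSnapshot = 3,
    FastUpdates = 4,
}

/// Hypothetical error type for this sketch; not Loro's real error enum.
#[derive(Debug, PartialEq, Eq)]
pub enum EnvelopeError {
    TooShort,
    BadMagic,
    UnknownMode(u16),
    /// Recognized legacy mode that is detected but not decoded.
    UnsupportedMode(Mode),
}

/// 4 magic bytes + 16-byte checksum field + 2-byte big-endian mode.
pub const HEADER_LEN: usize = 4 + 16 + 2;

/// Splits a blob into (mode, checksum field, body) without verifying the checksum.
pub fn split_envelope(blob: &[u8]) -> Result<(Mode, &[u8], &[u8]), EnvelopeError> {
    if blob.len() < HEADER_LEN {
        return Err(EnvelopeError::TooShort);
    }
    if !blob.starts_with(b"loro") {
        return Err(EnvelopeError::BadMagic);
    }
    let checksum_field = &blob[4..20];
    // The mode bytes sit at offsets 20 and 21, right after the checksum field.
    let mode = match u16::from_be_bytes([blob[20], blob[21]]) {
        1 => Mode::OutdatedRle,
        2 => Mode::OutdatedSnapshot,
        3 => Mode::FastSnapshot,
        4 => Mode::FastUpdates,
        other => return Err(EnvelopeError::UnknownMode(other)),
    };
    match mode {
        // Legacy top-level modes are detected and then rejected.
        Mode::OutdatedRle | Mode::OutdatedSnapshot => Err(EnvelopeError::UnsupportedMode(mode)),
        Mode::FastSnapshot | Mode::FastUpdates => {
            Ok((mode, checksum_field, &blob[HEADER_LEN..]))
        }
    }
}
```

Returning a distinct `UnsupportedMode` error rather than folding legacy modes into `UnknownMode` keeps the "recognized but unsupported" distinction visible to callers, which is the behavior the rules above ask to preserve.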
diff --git a/crates/loro-internal/src/encoding/CLAUDE.md b/crates/loro-internal/src/encoding/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/crates/loro-internal/src/encoding/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/crates/loro-internal/src/state/AGENTS.md b/crates/loro-internal/src/state/AGENTS.md new file mode 100644 index 000000000..5c653e67a --- /dev/null +++ b/crates/loro-internal/src/state/AGENTS.md @@ -0,0 +1,40 @@ +# State Guidelines + +This module owns materialized document state, container stores, diff application, +checkout/replay behavior, deep/shallow values, and mergeable container +visibility. Read +[../../../../context/mergeable-containers.md](../../../../context/mergeable-containers.md) +before changing mergeable child behavior. + +## Local Entry Points + +- `../state.rs`: `DocState`, checkout/path/deep-value traversal, state replay, + lifecycle, and alive-container discovery. +- `container_store/`: persisted KV-backed container snapshots and + `ContainerWrapper` encoding. +- `map_state.rs`, `list_state.rs`, `richtext_state.rs`, `tree_state.rs`, + `movable_list_state.rs`, `counter_state.rs`: per-container state and snapshot + codecs. +- `mergeable.rs`: logical child edge resolution for mergeable containers. +- `dead_containers_cache.rs`: dead/alive tracking and marker-driven mergeable + reactivation. +- `unknown_state.rs` and `../diff_calc/unknown.rs`: forward compatibility for + unknown container types. + +## Mergeable Rules + +- `MapHandler::ensure_mergeable_*` writes a compact marker into the parent map + and returns a handler for a deterministic `ContainerID`. +- The parent map marker, not "child has ops", decides whether a mergeable child + is visible. +- Non-mergeable occupants must block `ensure_mergeable_*`; same-kind marker + writes are idempotent; different-kind marker writes are deliberate kind + changes. 
+- Snapshot and shallow snapshot retention must preserve hidden losing-kind + mergeable state. + +## Validation + +- `cargo test -p loro-internal --test mergeable_cid_encoding` +- `cargo test -p loro-internal --test mergeable_container` +- `cargo test -p loro-internal import_atomicity` if import or rollback is involved. diff --git a/crates/loro-internal/src/state/CLAUDE.md b/crates/loro-internal/src/state/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/crates/loro-internal/src/state/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/crates/loro-wasm/AGENTS.md b/crates/loro-wasm/AGENTS.md index 81e5a1bf4..287a8b51d 100644 --- a/crates/loro-wasm/AGENTS.md +++ b/crates/loro-wasm/AGENTS.md @@ -1 +1,70 @@ -If you change WASM packaging or bundler-facing entrypoints, run `pnpm --dir examples/bundler-smoke-tests run test:browser` from the repo root to verify the package still builds and executes in real browsers. +# WASM Package Guidelines + +This subtree builds the `loro-crdt` JS/WASM package. It contains the Rust +`#[wasm_bindgen]` bindings in `src/lib.rs`, the JS/TS wrapper in `index.ts`, +Rollup/build scripts, package exports, and WASM-specific tests. + +## Commands + +Run commands from the repository root unless noted. + +- Build/test release package: `pnpm release-wasm`. +- Local dev package build: `pnpm -C crates/loro-wasm build-dev`. +- Package-local release build: `pnpm -C crates/loro-wasm build-release`. +- Package-local tests after an existing build: `pnpm -C crates/loro-wasm test`. +- Fast bundler smoke tests after entrypoint/export changes: + `pnpm test-bundlers`. +- Browser runtime smoke tests for packaging changes: + `pnpm --dir examples/bundler-smoke-tests run test:browser`. + +`pnpm release-wasm` runs version sync, installs this package's dependencies, and +builds the release artifacts. 
Use it for final validation when changing +`src/lib.rs`, `index.ts`, package exports, Rollup config, or build scripts. + +## Pending Events Invariant + +Subscription callbacks (`subscribe*`, container `subscribe`, and related APIs) +do not call user JS immediately. Rust queues JS calls into a global pending queue +and schedules a microtask check. If the microtask runs before +`callPendingEvents()` flushes the queue, the package logs: + +```text +[LORO_INTERNAL_ERROR] Event not called +``` + +Any WASM-exposed API that can enqueue subscription events must flush pending +events before returning control to JS. This is intentionally implemented as a +small JS-side allowlist in `index.ts` rather than wrapping every method. + +When adding or changing a `#[wasm_bindgen]` API in `src/lib.rs`, check whether it +can: + +- mutate document or container state, +- trigger an implicit commit or barrier (`commit`, `with_barrier`, + `implicit_commit_then_stop`), +- emit events, +- apply diffs (`revertTo`, `applyDiff`), or +- change ephemeral store state that has JS subscribers. + +If yes, add the JS method name to the relevant installed `decorateMethods(...)` +allowlist near the bottom of `index.ts`. Today those wrappers cover +`LoroDoc.prototype`, `EphemeralStoreWasm.prototype`, and `UndoManager.prototype`; +add another prototype only when the wrapper is wired there. Pure read/query APIs +should not be decorated. + +A quick behavioral check is to run with an active `doc.subscribe(...)` or +container `subscribe(...)` and confirm the mutation does not produce the internal +error above. Keep or add a regression test when the issue is observable from JS. + +## Packaging Rules + +- Preserve the public `loro-crdt` API names and package export paths used by + tests and docs. +- Do not hand-edit generated package output. Regenerate with `build-dev`, + `build-release`, or `pnpm release-wasm`. 
+- Package entrypoint changes must consider `bundler`, `browser`, `nodejs`, + `web`, and `base64` outputs. +- Vite and Webpack can emit the `.wasm` asset from `new URL(...)`; plain esbuild + and Rollup need either the `base64` entry or an explicit asset copy. Keep the + bundler smoke tests aligned with these expectations. +- If package output or published behavior changes, add a changeset. diff --git a/crates/loro-wasm/CLAUDE.md b/crates/loro-wasm/CLAUDE.md new file mode 120000 index 000000000..47dc3e3d8 --- /dev/null +++ b/crates/loro-wasm/CLAUDE.md @@ -0,0 +1 @@ +AGENTS.md \ No newline at end of file diff --git a/examples/bundler-smoke-tests/README.md b/examples/bundler-smoke-tests/README.md index 28f1562bb..c476812cc 100644 --- a/examples/bundler-smoke-tests/README.md +++ b/examples/bundler-smoke-tests/README.md @@ -29,7 +29,7 @@ pnpm --dir examples/bundler-smoke-tests run test:next ``` To also launch each production-built app in Chromium and verify `doc.toJSON()` -returns `{ t: "hi" }` in a real browser: +returns `{ map: { text: "mergeable-smoke" } }` in a real browser: ```sh pnpm --dir examples/bundler-smoke-tests run test:browser
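
Review note on the pending-events invariant in `crates/loro-wasm/AGENTS.md`: the real wrapper lives on the JS side and flushes in a `finally` block. The idea of flushing on every exit path can be modeled in Rust with a drop guard. This is an analogy only, assuming a single-threaded queue; `PendingQueue`, `FlushOnExit`, and `with_flush` are hypothetical names and do not correspond to code in `src/lib.rs` or `index.ts`.

```rust
use std::cell::RefCell;

/// Toy stand-in for the global pending-event queue (hypothetical).
#[derive(Default)]
pub struct PendingQueue {
    pending: RefCell<Vec<String>>,
    delivered: RefCell<Vec<String>>,
}

impl PendingQueue {
    /// A mutation queues an event instead of calling the subscriber immediately.
    pub fn enqueue(&self, event: &str) {
        self.pending.borrow_mut().push(event.to_string());
    }

    /// Delivers everything queued so far, in order.
    pub fn call_pending_events(&self) {
        let drained: Vec<String> = self.pending.borrow_mut().drain(..).collect();
        self.delivered.borrow_mut().extend(drained);
    }

    pub fn pending_len(&self) -> usize {
        self.pending.borrow().len()
    }

    pub fn delivered(&self) -> Vec<String> {
        self.delivered.borrow().clone()
    }
}

/// Guard whose Drop plays the role of the `finally` block.
struct FlushOnExit<'a>(&'a PendingQueue);

impl Drop for FlushOnExit<'_> {
    fn drop(&mut self) {
        self.0.call_pending_events();
    }
}

/// Runs `op` and flushes pending events before returning, whether `op` succeeds or fails.
pub fn with_flush<T>(queue: &PendingQueue, op: impl FnOnce(&PendingQueue) -> T) -> T {
    let _guard = FlushOnExit(queue);
    op(queue)
}
```

Only operations that can enqueue events would go through `with_flush` in this model, which corresponds to keeping the allowlist small instead of wrapping every method.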