fix: two O(n²) editing slowdowns (UTF-16 boundary check + event compose) by zxch3n · Pull Request #1019 · loro-dev/loro

zxch3n · 2026-06-16T07:04:36Z

Summary

Fixes two independent O(n²) slowdowns in editing. Both surfaced from the same investigation; the first matches the user-visible "JS text insert is quadratic" report, the second matches the "slow when a subscriber is attached" framing.

Bug 1 — UTF-16 / byte boundary check materialized the whole prefix (regression since 1.12.0)

1.12.0 (#954) added validate_text_boundary() to the text edit hot paths. For UTF-16/byte positions it round-trips the position through convert_pos, whose to-UTF-16/bytes branch did:

let prefix = state.get_text_slice_by_event_index(0, event_index)?; // whole [0,pos) string
count_utf16_len(prefix.as_bytes())

— O(pos) per edit → O(n²) for a run of edits. The JS binding always uses UTF-16 indices, so it was hit hardest. Fix: map the position via a single cursor query reading the rope's prefix caches (event_index_to_index → get_index_from_cursor), O(log n). Unicode-indexed editing was never affected (it early-returns before convert_pos).

Bug 2 — event building cloned the growing accumulated diff (subscriber-gated)

When a subscriber is attached, diffs_to_event composes all diff fragments targeting the same container in one batch via last_container_diff.clone().compose(...) — cloning the growing accumulator on every fragment → O(K²) for K fragments. Triggered by random-position text inserts (whose event hints don't merge) and by many distinct map-key writes. Fix: compose in place (mem::replace), make ResolvedMapDelta::compose consume self, and drop the needless accumulator clones in InternalDiff::compose / Diff::compose_ref. Affects text, map, and list events.

Benchmarks (random single-char edits)

scenario	before	after	scaling before→after
JS `loro-crdt@1.13.3`, utf16 insert, n=48k	161284 ms	482 ms (local build)	O(n²) → O(n)
Rust utf16 / utf8 / delete-utf16 insert	quadratic	flat per-op	O(n²) → O(n)
Rust text insert + subscriber, n=48k	51149 ms	269 ms	O(n²) → O(n)
Rust map distinct-key + subscriber, n=48k	47248 ms	112 ms	O(n²) → O(n)

Verification

New tests: convert_pos_matches_prefix_reference_all_coords (every coord pair vs prefix reference + mid-char rejection), batched_same_container_edits_emit_correct_event (composed event content equals final doc), perf_text_insert_utf16_is_linear (#[ignore] perf guard, mirrors the existing perf_import_quadratic convention).
All loro text + contract tests, loro-internal lib tests (335), wasm-feature richtext tests, and undo tests pass. Clippy clean.

Related finding (not in this PR)

A wider audit found a third, confirmed O(n²): walking a tree backward through history (doc.checkout over many versions) is quadratic because the tree diff-calculator's retreat scans every node per step (crates/loro-internal/src/diff_calc/tree.rs:177), while the forward path uses a bounded lamport range. Measured x3.8–4.7 per doubling for tree, vs ~x2 for an equivalent text checkout walk (confirming it's the tree calculator, not general checkout). It is a distinct, correctness-sensitive diff-calc change and is left for a dedicated follow-up PR with fuzzing rather than bundled here.

🤖 Generated with Claude Code

Since 1.12.0 (#954), `validate_text_boundary` was added to the text insert/delete/splice/mark hot paths. For UTF-16 and byte positions it round-trips the position through `convert_pos`, whose to-UTF-16/bytes branch materialized the entire `[0, pos)` prefix string via `get_text_slice_by_event_index(0, pos)` and counted its length. That made every edit O(n) and a run of edits O(n^2). The JS binding always uses UTF-16 coordinates, so it was hit hardest. Replace the prefix materialization with `event_index_to_index`, which maps an event index onto the target coordinate by reading the rope's prefix caches from a single cursor query (O(log n)). Behavior is unchanged: invalid boundaries (mid-surrogate / mid-codepoint) are still rejected by the round-trip comparison in `validate_text_boundary`. Benchmarks (random single-char inserts, n=48000): - JS loro-crdt@1.13.3 -> local fix: 161284ms -> 482ms (~334x) - Rust insert_utf16/insert_utf8/delete_utf16: now linear (flat per-op) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-16T07:09:19Z

WASM Size Report

Original size: 3031.65 KB
Gzipped size: 1000.11 KB
Brotli size: 702.39 KB

When a subscriber is attached, `diffs_to_event` composes all diff fragments that target the same container within one event batch. It did `last_container_diff.clone().compose(...)`, cloning the *growing* accumulated diff on every fragment — O(K^2) for K fragments. This is hit by random-position text inserts (whose event hints don't merge) and by many distinct map-key writes, etc. - Compose in place via `mem::replace` instead of cloning the accumulator. - `ResolvedMapDelta::compose` now consumes `self` instead of cloning `self.updated` (the map event path). - `InternalDiff::compose` no longer clones the accumulated RichtextRaw delta; `Diff::compose_ref` (undo path) no longer clones the Map/Tree accumulator. Benchmarks (random single-char inserts, subscriber attached, n=48000): - text: 51149ms -> 269ms (~190x), now linear - map distinct-key writes: 47248ms -> 112ms (~420x), now linear Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Converting a UTF-16 or UTF-8 (byte) offset to a unicode offset within a text chunk scanned the chunk char-by-char (`utf16_to_unicode_index` / `utf8_to_unicode_index` and their inverses). The b-tree locates the leaf in O(log n), but the within-leaf scan is O(offset-in-chunk). Since the state rope does not bound chunk size — a large insert, a loaded document, or a run of sequentially-typed text that merges into one chunk all produce big chunks — every UTF-16/UTF-8 position op (insert, delete, slice, splice, mark, convert_pos) was O(chunk length), so a sequence of them on a large chunk was O(n^2). A chunk whose `utf16_len == unicode_len` contains no astral-plane chars, so UTF-16 offsets equal unicode offsets; a chunk whose byte length equals `unicode_len` is pure ASCII, so byte offsets equal unicode offsets. In both cases the conversion is O(1). This covers essentially all real-world text (ASCII / Latin / CJK and the JS UTF-16 path). Chunks that actually contain astral chars (e.g. a huge all-emoji block) still scan, but that is rare and unchanged in correctness. Benchmarks (n=32000): - random slice_utf16(p,p+1) on a big ASCII chunk: O(n^2) -> O(n) (flat ~0.46us/op) - sequential utf16 append (typing): superlinear -> O(n) (flat ~3.2us/op) Validated by the loro/loro-internal test suites (default + wasm feature) and the `all` cargo-fuzz target over the shared corpus. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- event_index_to_index: drop the unreachable PosType::Event guard; both callers only pass Bytes/Utf16, and get_index_from_cursor already handles Event identically. - TextChunk::_slice: reuse the new is_all_ascii() helper instead of an inline duplicate of the same predicate. - diffs_to_event: use mem::take (DiffVariant already derives Default with None) instead of mem::replace(.., DiffVariant::None). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

zxch3n changed the title ~~fix: O(n²) text editing with UTF-16/byte positions~~ fix: two O(n²) editing slowdowns (UTF-16 boundary check + event compose) Jun 16, 2026

zxch3n and others added 3 commits June 16, 2026 08:22

style: rustfmt the changed regions

bf588ec

zxch3n merged commit 4d577ad into main Jun 16, 2026
1 check passed

zxch3n deleted the feat/loro-text-insert-perf branch June 16, 2026 12:21

zxch3n mentioned this pull request Jun 16, 2026

fix: recover two per-op editing slowdowns regressed since 1.11 #1021

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: two O(n²) editing slowdowns (UTF-16 boundary check + event compose)#1019

fix: two O(n²) editing slowdowns (UTF-16 boundary check + event compose)#1019
zxch3n merged 5 commits into
mainfrom
feat/loro-text-insert-perf

zxch3n commented Jun 16, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

zxch3n commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bug 1 — UTF-16 / byte boundary check materialized the whole prefix (regression since 1.12.0)

Bug 2 — event building cloned the growing accumulated diff (subscriber-gated)

Benchmarks (random single-char edits)

Verification

Related finding (not in this PR)

Uh oh!

github-actions Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

WASM Size Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

zxch3n commented Jun 16, 2026 •

edited

Loading

github-actions Bot commented Jun 16, 2026 •

edited

Loading