fix(token): make GreenTokenData a DST to widen retag past slice tail#212
fix(token): make GreenTokenData a DST to widen retag past slice tail#212phall1 wants to merge 4 commits into
Conversation
Comprehensive fixes for Miri UB under both stacked and tree borrows: Arc: - clone/drop/is_unique: access refcount via ptr::addr_of! on raw pointer instead of through &ArcInner reference (avoids provenance narrowing) - ThinArc::clone/drop: operate on refcount directly via raw pointer instead of going through with_arc → transient Arc Green types: - GreenNodeData wraps fat Repr (unsized HeaderSlice<H, [T]>) so &GreenNodeData has provenance covering the full slice - GreenNode/GreenToken::deref transmute from fat refs (correct provenance) - GreenNode/GreenToken::into_raw extract pointer from ThinArc directly instead of going through Deref → &Data → NonNull::from - GreenTokenData::text() uses raw pointer arithmetic for slice access - thin_to_thick made pub(crate), HeaderSlice fields made pub(crate) Cursor: - Cell<NonNull<GreenNodeData>> accessed via as_ptr().read() instead of get() to preserve allocation provenance through the Cell Status: ALL tests pass under -Zmiri-tree-borrows (4/4 non-mutable tests). Under stacked borrows, only the mutable tree path (clone_for_update) still fails due to Cell provenance limitations inherent to that model. Upstream: rust-analyzer#192, rust-analyzer#163, rust-analyzer#108
Make elided lifetimes explicit in return types where &self borrows: - GreenChild::as_ref() -> GreenElementRef<'_> - NodeData::green_siblings() -> slice::Iter<'_, GreenChild> These warnings caused CI failure with RUSTFLAGS="-D warnings".
Stacks on top of rust-analyzer#211. After rust-analyzer#211, `GreenNodeData` is a DST (`HeaderSlice<H, [GreenChild]>`) and `GreenNode::deref` returns a fat reference whose retag covers the whole slice. `GreenTokenData` was left as `HeaderSlice<H, [u8; 0]>`, so `GreenToken::deref` still hands out a narrow `&GreenTokenData` whose SharedReadOnly tag covers only `sizeof::<ReprThin>()`. Reaching the slice tail through that reference (via `ptr::addr_of!(self.data.slice)` + `from_raw_parts`, as rust-analyzer#211 currently does) is rejected by Stacked Borrows because the zero-size retag has no permission past the thin extent. Mirrors what rust-analyzer#211 did for `GreenNodeData`: change `GreenTokenData`'s inner field to `Repr = HeaderSlice<H, [u8]>`, route `GreenToken::deref` through `&self.ptr` (fat) + transmute, and have `into_raw`/`from_raw` carry/strip the slice metadata via `thin_to_thick` (same shape as the existing `GreenNode` impls). `text()` collapses back to `from_utf8_unchecked(self.data.slice())`. Repro: hashbrown rehash inside `NodeCache::token` calls `token_hash` -> `GreenTokenData::text` from a `&NoHash<GreenToken>` that has been narrowed by hashbrown's internal retag. Before this patch, that hits SB UB at `src/green/token.rs:104`: error: Undefined Behavior: trying to retag from <N> for SharedReadOnly permission at allocM[0x18], but that tag does not exist in the borrow stack for this location --> src/green/token.rs:104:25 let bytes = std::slice::from_raw_parts(slice_start, len); Caught by cyrs CI: - https://github.com/phall1/cyrs/actions/runs/25638947905 (job 75256239232) - https://github.com/phall1/cyrs/actions/runs/25639698954 (job 75257867300) Regression test: `tests/miri_node_cache_rehash.rs` forces hashbrown to grow the token map (200 distinct tokens) and asserts no SB violation under `-Zmiri-strict-provenance`. Fails on PR rust-analyzer#211 alone, passes after this patch. Refs rust-analyzer#163, rust-analyzer#192, rust-analyzer#108. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#211) PR #211 alone doesn't fix the cyrs repro — promoted GreenNodeData to DST but left GreenTokenData narrow. Filed rust-analyzer/rowan#212 with the matching DST promotion for tokens (1 file, ~28 lines). Stacks on #211. Comments posted on PR #211 and issue #163. phall1/rowan:fix/sb-node-cache-rehash has both fixes stacked, ready for [patch.crates-io] use (cy-yrz). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Sorry for you investing time in this, but we'll rewrite rowan anyway as soon as we finish with the rust-analyzer assist migration, a rewrite that will make it much more performant and also likely to fix this issue (and even if not, the code this PR changes will be stale). So I'm inclined to close this PR, as well as #211. Also note that rowan should pass Tree Borrows, and that this specific rule from Stacked Borrows is unlikely to make it into the final aliasing model for Rust. |
|
no worries! I'm happy to close this. I should be following rust analyzer more closely -- Where's the best spot to subscribe if you don't mind pointing me in the right direction? |
|
#t-compiler/rust-analyzer > Future Rowan, rust-lang/rust-analyzer#15710, rust-lang/rust-analyzer#18285, we also have a GSoC project for that. |
…itten ChayimFriedman2 commented on rust-analyzer/rowan#212: rewrite coming after RA assist migration, inclined to close PRs #211 and our #212. Also noted rowan passes Tree Borrows; the SB rule we're hitting is unlikely to be in Rust's final aliasing model = our miri failures are academic, not real UB. New strategy: maintain phall1/rowan fork as patch source. No more upstream PRs. cy-208 (cursor::free SB) becomes a fork-only patch. Track the eventual upstream rewrite via the rust-analyzer Zulip 'Future Rowan' thread + issues rust-lang/rust-analyzer#15710 and #18285 + the GSoC project. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stacks on #211. tokens still need the same DST treatment you gave nodes there.
with #211 alone,
GreenToken::derefreturns a narrow&GreenTokenDatawhose retag covers onlysizeof::<ReprThin>().text()then widens past that withaddr_of!(self.data.slice) + from_raw_parts, which SB rejects:real-world path:
hashbrown::RawTablerehash inNodeCache::token->token_hash->text().caught in a cypher frontend project im doing.
what changes: same shape as your node fix in #211.
GreenTokenData::databecomesHeaderSlice<H, [u8]>,derefgoes through&self.ptr,into_raw/from_rawusethin_to_thick,text()collapses tofrom_utf8_unchecked(self.data.slice()). one file, ~28 lines.test:
tests/miri_node_cache_rehash.rsbuilds 200 distinct tokens to force a rehash. fails on #211 alone, passes here.cargo testandcargo +nightly miri test --test miri_node_cache_rehashboth green.one more SB hit out of scope here for the record:
ast::tests::ensure_mut_panic_on_create->ThinArc::clone->count_ptr.fetch_addderefs through narrow&GreenNodeDataand loses the count's provenance. separate fix.refs #163, #192, #108