refactor: reduce unsafe, fix Miri violations, and deduplicate code#219
refactor: reduce unsafe, fix Miri violations, and deduplicate code#219
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #219 +/- ##
==========================================
+ Coverage 69.97% 70.42% +0.45%
==========================================
Files 42 42
Lines 9917 9524 -393
==========================================
- Hits 6939 6707 -232
+ Misses 2978 2817 -161 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR refactors core parsing/value/serde code to reduce unsafe usage, address Miri/Stacked Borrows provenance issues, and deduplicate repeated parser/serializer logic across the crate.
Changes:
- Reduced
unsafesurface area and adjusted pointer provenance handling in DOM/value construction paths. - Deduplicated parser and serializer code via shared helpers/macros, consolidating checked/unchecked and owned/inplace variants.
- Replaced some manual pointer/slice manipulation with safe standard-library equivalents.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/value/visitor.rs | Adjusts visitor trait defaults and reserves key-visitor hook. |
| src/value/tls_buffer.rs | Refactors TLS buffer allocation to avoid raw-box pointer creation. |
| src/value/ser.rs | Deduplicates integer serializer forwarding; narrows unsafe use for floats. |
| src/value/partial_eq.rs | Deduplicates Value numeric/bool equality helpers via macro. |
| src/value/node.rs | Large refactor: Meta union→struct, narrower union-field unsafe, provenance fixes in DOM visitor/allocation flow. |
| src/value/from.rs | Uses the (now safe) new_f64_unchecked constructor. |
| src/value/array.rs | Replaces unsafe swaps/slice creation with safe slice APIs. |
| src/serde/ser.rs | Extracts repeated serialization patterns into macros; adds safety comments for UTF-8 unchecked conversions. |
| src/serde/de.rs | Adds fix_position() helper; exposes Arc provenance for pack_shared use. |
| src/reader.rs | Uses safe slicing for peek_n; adjusts provenance handling for padded reads. |
| src/parser.rs | Consolidates owned/inplace paths via dispatch_value, parse_string_visit, skip_one_checked, and shared checked/unchecked traversal helpers. |
| src/lazyvalue/owned.rs | Removes an unnecessary unsafe around as_faststr(). |
| src/lazyvalue/get.rs | Routes get APIs through unified get_from_with_iter(..., checked) entrypoint. |
| src/input.rs | Makes JsonSlice::as_faststr safe (noted issue: potential UB if bytes aren’t UTF-8). |
Comments suppressed due to low confidence (2)
src/input.rs:21
JsonSlice::as_faststris now a safe function but callsas_str(sub), which usesfrom_utf8_uncheckedinternally. SinceJsonSlice::Rawcan be constructed from an arbitrary&[u8](viaFrom<&[u8]>), callingas_faststron non‑UTF‑8 bytes can trigger UB. Either makeas_faststrunsafe again, or validate UTF‑8 (e.g., viastd::str::from_utf8) before constructing theFastStr, or change the API to return aResult/fallback behavior on invalid UTF‑8.
pub(crate) fn as_faststr(&self) -> FastStr {
match self {
JsonSlice::Raw(sub) => FastStr::new(as_str(sub)),
JsonSlice::FastStr(f) => f.clone(),
}
src/value/node.rs:972
new_f64_uncheckedwas changed fromunsafe fnto a safe function but still only guards finiteness withdebug_assert!. In release builds this permits constructingValuewith NaN/±Inf, which can break internal assumptions (e.g., laterNumber::try_from(f64).unwrap()paths) and makes the "unchecked" naming misleading. Consider either making this functionunsafeagain, or enforcing the invariant in all builds (e.g.,assert!(is_finite)/returningOption/usingnew_f64).
pub(crate) fn new_f64_unchecked(fval: f64) -> Self {
debug_assert!(fval.is_finite(), "f64 must be finite");
Value {
meta: Meta::new(Meta::F64),
data: Data { fval },
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
## Unsafe Reduction - Convert Meta from union to struct (eliminates 6 unsafe field accesses) - Narrow unsafe scopes in unpack_ref/Drop/as_mut to per-arm union access - Replace NonNull::new_unchecked with safe NonNull::new().expect() - Replace unsafe ptr::swap with safe slice::swap in Array::swap_remove - Replace unsafe from_raw_parts with safe slice indexing in reader/array - Use safe NonNull construction in TlsBuf ## Miri Stacked Borrows Fixes - Change DocumentVisitor.shared from &mut Shared to *mut Shared to prevent tag invalidation across mutable reborrows - Use integer arithmetic in forward_find_shared to avoid narrow provenance when navigating to MetaNode in bump allocations - Expose Arc allocation provenance for pack_shared's Arc::increment_strong_count - Use ptr::copy_nonoverlapping in visit_root to preserve pointer provenance in Data union - Fix PaddedSliceRead::new provenance (as_mut_ptr vs &mut [0]) ## Parser Deduplication (-232 lines) - Merge 10 checked/unchecked function pairs via bool parameter: skip_one, get_from_object, get_from_array, get_from_with_iter, get_many_keys, get_many_index - Merge inplace/owned parsing pairs via Option<&mut Vec<u8>>: parse_string_visit, parse_array, parse_object, parse_dom, dispatch_value - Unify get_escaped_branchless_u32/u64 via macro - Remove wrapper functions, clean up _inner naming ## Serializer/Deserializer Deduplication (-188 lines) - Extract 21 identical RawValue error methods into reject_raw_value! macro - Extract 10+10+6 integer serialize methods into impl_serialize_int!/ impl_serialize_key_int!/forward_sorted_key! macros - Extract 12 fix_position error wrappers into helper method - Extract serialize_f32/f64 key into impl_serialize_float_key! macro - Deduplicate eq_* helpers in partial_eq.rs via impl_eq_fn! macro - Deduplicate serialize forwarding in value/ser.rs via forward_to! macro ## Other Cleanup - Remove incorrect #[allow(dead_code)] from JsonVisitor trait methods - Remove dead parse_value/skip_string_unchecked2 functions All changes verified: 146 tests pass across default/sort_keys/ arbitrary_precision features, clippy zero warnings, Miri 98/99 pass (1 skipped due to large file timeout), benchmark shows no regression (most cases 1-5% faster). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
59d992a to
fce4c3a
Compare
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
|
Codex Review: Didn't find any major issues. Hooray! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Change Meta from a plain u64 struct to a union of u64 and *const Shared. This preserves pointer provenance through storage for root_node variants, making pack_shared/unpack_root strict-provenance compatible without needing expose/recover roundtrips. - Meta union stores tagged pointer via ptr field (root_node) or integer via val field (all other kinds) - Replace implicit `as usize` casts with explicit expose_provenance() / with_exposed_provenance() for remaining exposed provenance sites - Remove redundant double expose/recover in unpack_shared (MetaNode already stores pointer with full provenance) Miri results: - Default (permissive provenance): all tests pass - Strict provenance: all tests pass except those exercising forward_find_shared (bump allocation navigation requires exposed provenance by design) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2efba2da93
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
On 32-bit targets (e.g. wasm32-wasip1), *const Shared is 4 bytes but u64 is 8 bytes. Writing through the union ptr field would only initialize the lower 4 bytes, leaving the upper half uninitialized. Subsequent reads through read_val() would then be UB. Fix: use cfg(target_pointer_width = "64") to gate the union approach. On 32-bit targets, Meta falls back to a plain u64 struct with exposed provenance for the root_node round-trip. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace self-hosted runner with ubuntu-latest to avoid queue delays when no self-hosted runner is available. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
*mut Shared, exposed provenance for Arc/bump allocations, provenance-preserving copiesreject_raw_value!,impl_serialize_int!, etc.)Changes by Area
value/node.rsunpack_ref/Drop, Miri provenance fixes inDocumentVisitor/forward_find_shared/visit_rootparser.rsget_escaped_branchlessvia macroserde/ser.rsreject_raw_value!,impl_serialize_int!,impl_serialize_key_int!,forward_sorted_key!,impl_serialize_float_key!macrosserde/de.rsfix_position()helper replacing 12 identical match blocksvalue/array.rsslice::swapreplacing unsafeptr::swapreader.rsPaddedSliceRead::newTest plan
cargo test— 146 passed (default, sort_keys, arbitrary_precision)cargo clippy --all-targets --all-features— zero warnings🤖 Generated with Claude Code