Add lossless Decimal encoding and decoding #20
Resolves #19. Read numbers as raw text so Decimal and large integers decode with full precision instead of going through Double.
Pull request overview
This PR adds lossless Decimal support by preserving JSON number text during decoding (avoiding intermediate Double conversion) and by encoding Decimal values directly as JSON numbers rather than using Decimal’s default keyed-container encoding.
Changes:
- Add `Decimal`-specific encoding paths (top-level + keyed/unkeyed/single-value containers) that emit raw JSON numeric text.
- Update decoder to read numbers as raw text (`.numberAsRaw`) and add `Decimal` decoding that parses from the preserved numeric text.
- Add extensive `Decimal` encoding/decoding test coverage, including precision and boundary/overflow scenarios.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| Tests/YYJSONTests/EncoderTests.swift | Adds YYJSONEncoder decimal encoding and round-trip precision tests. |
| Tests/YYJSONTests/DecoderTests.swift | Adds YYJSONDecoder decimal decoding tests (including precision, exponents, and error cases). |
| Sources/YYJSON/Encoder.swift | Implements Decimal encoding as raw JSON numbers (including top-level interception). |
| Sources/YYJSON/Decoder.swift | Enables raw-number reading and implements Decimal parsing from raw numeric text; updates numeric decoding paths accordingly. |
This branch enables Benchmarking the performance cost:
The two double-heavy workloads (synthetic 10k and real-world 200k) agree on ~45 ns per double. Ints are cheaper at ~15 ns because vs. Foundation (absolute, after the change)
Let's see if we can bring this down by using the fast-path when precision isn't needed.
Replace the `String(decoding:as:) + Swift.Double(_:)` text round-trip in the number-decoding helpers with direct `strtod`/`strtoll`/`strtoull` calls on yyjson's null-terminated raw buffer. Cuts per-number Codable overhead from ~47 ns to ~9 ns for doubles and from ~15 ns to ~12 ns for ints, recovering most of the wall-clock cost introduced by `YYJSON_READ_NUMBER_AS_RAW`. On a 200k-double real-world payload the regression vs the native-getter path drops from +21% to +7%. `yyNumberText` is retained for `Decimal` decoding and for the number-to-string coercion path, which both genuinely need a `String`.
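As a rough illustration of the direct-parse idea, here is a minimal sketch that hands a null-terminated buffer straight to `strtod` and requires the whole text to be consumed. The helper name `parseDouble(cString:)` is hypothetical and is not taken from the PR; the real helpers operate on yyjson's raw buffer rather than a Swift `String`.

```swift
import Foundation

/// Hypothetical helper (not the PR's code): parses a null-terminated numeric
/// buffer with `strtod` and returns nil unless the entire text was consumed.
func parseDouble(cString ptr: UnsafePointer<CChar>) -> Double? {
    var end: UnsafeMutablePointer<CChar>? = nil
    let value = strtod(ptr, &end)
    // `end` must have advanced past the start and must sit on the terminator;
    // otherwise the input was empty or had trailing characters.
    guard let end = end, UnsafePointer(end) != ptr, end.pointee == 0 else {
        return nil
    }
    return value
}

// A Swift String is used here only to obtain a C buffer for the demo.
let parsed = "3.25e2".withCString { parseDouble(cString: $0) }
let rejected = "1.5x".withCString { parseDouble(cString: $0) }
```

Checking the end pointer is what makes the call safe to use on untrusted text, since `strtod` on its own happily stops at the first character it cannot parse.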
Introduce `YYJSONDecoder.NumberDecodingStrategy` with two cases:
- `.lossless` (default) preserves the original input text of every number via `YYJSON_READ_NUMBER_AS_RAW`, keeping `Decimal` and large-integer decoding exact. This matches `JSONDecoder`'s precision contract.
- `.fast` skips the raw-number flag and lets yyjson parse numbers as native `Int64`/`UInt64`/`Double`, restoring the library's pre-fix throughput at the cost of fractional `Decimal` precision and arbitrary integer range.

The existing extraction helpers (`yyParseDouble`, `yyParseSignedInt`, `yyParseUnsignedInt`, `yyNumberText`) already branch on whether the value is stored as raw text or as a parsed number, so no further plumbing is required: the strategy simply toggles whether the raw flag is set.

`.fast` recovers within ~2% of pre-fix throughput on number-heavy payloads (10k double array: 2179 µs lossless → 2024 µs fast; 200k double GeoJSON coordinate decode: 43.8 ms lossless → 41.0 ms fast).
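To make the trade-off concrete, the following sketch uses only Foundation (no yyjson calls) to compare building a `Decimal` from the original digits with building it from a value that has already been rounded to `Double`. The variable names are illustrative and do not come from the PR.

```swift
import Foundation

let posix = Locale(identifier: "en_US_POSIX")
let text = "9007199254740993"  // 2^53 + 1, which no Double can represent

// Lossless route: the Decimal is built from the original digits.
let lossless = Decimal(string: text, locale: posix)!

// Fast route: the number has already been rounded to the nearest Double
// (9007199254740992) before a Decimal is made from it.
let rounded = Double(text)!
let fast = Decimal(rounded)

let exactInteger = Decimal(9007199254740993 as UInt64)
let losslessIsExact = (lossless == exactInteger)
let fastIsExact = (fast == exactInteger)
```

Here `losslessIsExact` should be true and `fastIsExact` false, which is the kind of silent rounding the default strategy is meant to avoid.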
Alright, I think we can have our cake and eat it too! With these changes, users now get correct behavior by default at a 6–10% penalty. Users with number-heavy workloads have a one-line opt-out ( Working through Copilot feedback and updating the README.
Range-check Double fallbacks with `T(exactly: d.rounded(.towardZero))` instead of comparing against `Double(T.min)`/`Double(T.max)`, which round for 64-bit bounds and could admit out-of-range values that then trap on `T(d)`. Switch `strtoll`/`strtoull` to base 0 so JSON5 hex literals like `0xFF`, preserved as raw text under `YYJSON_READ_NUMBER_AS_RAW`, decode as integers instead of failing.
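The range-check change can be sketched as follows. The generic helper `truncatedInteger` is a hypothetical name used for illustration only; the point is that `T(exactly:)` rejects out-of-range values, while a comparison against `Double(T.max)` does not for 64-bit types.

```swift
/// Hypothetical helper (not the PR's code): truncates the fraction and
/// returns nil when the result does not fit in `T` (also nil for NaN/infinity).
func truncatedInteger<T: FixedWidthInteger>(_ d: Double) -> T? {
    return T(exactly: d.rounded(.towardZero))
}

let inRange: Int64? = truncatedInteger(42.9)
let tooBig: Int64? = truncatedInteger(9223372036854775808.0)  // 2^63

// The naive bound check is misleading: Double(Int64.max) rounds up to 2^63,
// so 2^63 passes the comparison even though Int64(2^63) would trap.
let naiveCheckPasses = 9223372036854775808.0 <= Double(Int64.max)
```

With this approach `inRange` is 42 and `tooBig` is nil, whereas `naiveCheckPasses` is true, which is exactly the case that could previously reach a trapping `T(d)`.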
`Decimal(string:)` uses the host's current locale by default, which can mis-parse JSON numbers under locales whose decimal separator is `,`. Pin every JSON-text-to-`Decimal` conversion (decoder, value accessor, serialization) to `en_US_POSIX` so parsing matches JSON's locale-independent `.` separator.
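A small sketch of the locale pinning, using Foundation's `Decimal(string:locale:)`. The `de_DE` locale stands in for a host whose decimal separator is `,`; how it treats the `.` in the input is not asserted here, since that depends on the Foundation implementation.

```swift
import Foundation

let posix = Locale(identifier: "en_US_POSIX")
let german = Locale(identifier: "de_DE")
let jsonNumber = "1.5"

// Pinned parse: "." is always treated as the decimal separator.
let pinned = Decimal(string: jsonNumber, locale: posix)

// Parsing under a comma-separator locale may stop at the "." and yield a
// different value; this is the failure mode the pinning avoids.
let hostLike = Decimal(string: jsonNumber, locale: german)
print(String(describing: pinned), String(describing: hostLike))
```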
`Decimal.description` uses the host's current locale and can emit `,` as the decimal separator, producing invalid JSON in locales like de_DE. Render through `NSDecimalNumber.description(withLocale:)` pinned to `en_US_POSIX` so the encoded number always uses `.`.
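The rendering side can be sketched the same way with `NSDecimalNumber.description(withLocale:)`, which is the call named in the commit message. The variable names are illustrative.

```swift
import Foundation

let posix = Locale(identifier: "en_US_POSIX")
let value = Decimal(string: "1234.5", locale: posix)!

// Render through NSDecimalNumber with the locale pinned, so the separator
// is "." regardless of the host's current locale.
let jsonText = NSDecimalNumber(decimal: value).description(withLocale: posix)
print(jsonText)
```

Parsing `jsonText` back with the same pinned locale should return the original value, which is what makes the encoder and decoder round-trip cleanly.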
`Decimal(string:)` returns an optional, but the assertions compared the non-optional decoded value against it via implicit promotion, which silently passes if construction ever returns nil. Force-unwrap the expected value so a nil construction now fails the test with a clear trap instead of masking the regression.
The non-raw fallback used `yyjson_val_write` to format parsed numbers, which broke linking under the `noWriter` trait. Format the value through the typed getters (`yyjson_get_sint`/`yyjson_get_uint`/`yyjson_get_real`) and Swift's locale-independent `String` initializers, which already produce shortest round-trippable representations. This also removes the `malloc`/`free` from the hot path.
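A minimal sketch of formatting already-parsed numbers through Swift's own initializers. The `ParsedNumber` enum is a hypothetical stand-in for the values returned by the typed getters (`yyjson_get_sint`, `yyjson_get_uint`, `yyjson_get_real`); it is not a type from the library.

```swift
/// Hypothetical stand-in for the three typed getter results.
enum ParsedNumber {
    case signed(Int64)
    case unsigned(UInt64)
    case real(Double)
}

/// Formats a parsed number without any C writer or heap buffer management.
func numberText(_ number: ParsedNumber) -> String {
    switch number {
    case .signed(let v): return String(v)
    case .unsigned(let v): return String(v)
    // Double's description is locale-independent and round-trips through Double(_:).
    case .real(let v): return v.description
    }
}

let third = 1.0 / 3.0
let roundTripped = Double(numberText(.real(third)))
```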
Three near-identical `decodeDecimal(from:path:)` methods lived on the keyed, unkeyed, and single-value containers, each routing numeric values through `yyNumberText` and `Decimal(string:locale:)` with the same error text. Extract a file-scope `yyDecodeDecimal` (and a shared `yyTypeString` for diagnostics) and have every container delegate to it, eliminating the duplication so future tweaks land in one place.
Under `.lossless` decoding (or `YYJSONSerialization`, which always preserves number text), JSON5 hex literals like `0xFF` arrive as raw text. The existing raw-number parsers handled decimal-only forms, so hex (and `Infinity`/`NaN`) decoded into `Double`/`Foundation` returned nil and silently became a type mismatch or `NSNull`.
- `yyParseDouble` now tries `strtoll`/`strtoull` with base 0 first so hex integers convert to `Double`; `strtod` continues to handle the fractional, exponential, and non-finite forms.
- `YYJSONValue.number` routes its `.numberRaw` case through `yyParseDouble`, picking up hex and non-finite spellings.
- `YYJSONSerialization` adds `yyParseStrictInteger` (whole-text `strtoll`/`strtoull` with base 0) for the `NSNumber` integer paths so hex maps to `NSNumber(Int)` while fractional text still falls through to `Decimal`/`Double` instead of being truncated to `0`.
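The "integer first, then `strtod`" ordering described above might look roughly like this. It is a simplified sketch under stated assumptions, not the PR's `yyParseDouble`: the name `parseNumberText` is hypothetical, it takes a Swift `String` rather than yyjson's buffer, and it only tries the signed integer parser.

```swift
import Foundation

/// Hypothetical sketch: try a whole-text base-0 integer parse (covers hex such
/// as "0xFF"), then fall back to strtod for fractional, exponential, and
/// non-finite spellings. Returns nil unless the entire text is consumed.
func parseNumberText(_ text: String) -> Double? {
    return text.withCString { ptr -> Double? in
        let length = strlen(ptr)
        guard length > 0 else { return nil }

        var end: UnsafeMutablePointer<CChar>? = nil
        errno = 0
        let integer = strtoll(ptr, &end, 0)
        if errno == 0, let e = end, ptr.distance(to: UnsafePointer(e)) == Int(length) {
            return Double(integer)
        }

        end = nil
        let real = strtod(ptr, &end)
        guard let e = end, ptr.distance(to: UnsafePointer(e)) == Int(length) else {
            return nil
        }
        return real
    }
}

let hex = parseNumberText("0xFF")
let exponent = parseNumberText("1.5e1")
let garbage = parseNumberText("abc")
```

One caveat with base 0 is that a leading `0` selects octal, which is harmless for standard JSON because it forbids leading zeros.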
Several Decimal precision tests interpolated `\(decimal)` (or compared against `Decimal.description`) when building or asserting on JSON, both of which use the host's current locale and emit `,` as the decimal separator on locales like de_DE. That produces invalid JSON / mismatched expectations and made these tests locale-flaky. Render Decimals through `NSDecimalNumber(decimal:).description(withLocale:)` pinned to `en_US_POSIX` in the decoder, encoder, value, and serialization precision tests so the generated text always uses `.`, matching JSON's locale-independent format and the encoder's own POSIX output.
The original test iterated 0.00 → 99.99 by 0.01, performing ~10000 encode+decode roundtrips per run. Replace the sweep with a small curated sample (zero, fractional, signed, and high-precision boundary values) that still exercises the precision guarantee without paying the 10k iteration cost on every CI run.
The doc claimed integers outside the `Int64`/`UInt64` range "fail to decode, even into `Decimal`", but `.fast` actually parses them through `Double` and decodes into `Decimal` with `Double` precision rather than throwing. Rewrite the bullet to describe the precision loss so callers aren't surprised when an oversized integer silently rounds instead of raising an error.
import Cyyjson
import Foundation

#if !YYJSON_DISABLE_READER

// MARK: - Helper Functions

/// Locale used to parse JSON numbers into `Decimal`. JSON numbers always use
/// `.` as the decimal separator regardless of the host's user locale, so we
/// pin parsing to POSIX to avoid mis-decoding under locales that use `,`.
private let yyPOSIXLocale = Locale(identifier: "en_US_POSIX")

func yyDecodeDecimal(from value: UnsafeMutablePointer<yyjson_val>?, path: String) throws -> Decimal {
    guard let value = value else {
        throw YYJSONError.missingValue(path: path)
    }
    guard yyIsNumeric(value) else {
        throw YYJSONError.typeMismatch(
            expected: "number",
            actual: yyTypeString(value),
            path: path
        )
    }
    guard let string = yyNumberText(value),
        let decimal = Decimal(string: string, locale: yyPOSIXLocale)
    else {
        throw YYJSONError.invalidData(
            "Could not parse number as Decimal",
            path: path
        )
    }
    return decimal
}
import Cyyjson
import Foundation

/// Locale used to parse JSON numbers into `Decimal`. JSON numbers always use
/// `.` as the decimal separator regardless of the host's user locale, so we
/// pin parsing to POSIX to avoid mis-decoding under locales that use `,`.
private let yyPOSIXLocale = Locale(identifier: "en_US_POSIX")

#if !YYJSON_DISABLE_READER

/// Parses a JSON numeric literal as a fixed-width integer.
///
/// Accepts plain decimal integers (including a leading sign) and the
/// JSON5 hex spellings (`0xFF`, `-0X10`, `+0x2A`) preserved as raw text
/// under `YYJSON_READ_NUMBER_AS_RAW`. Returns `nil` for fractional or
/// exponential text so callers can fall through to `Decimal`/`Double`
/// instead of silently truncating through `Double`.
@inline(__always)
fileprivate func yyParseStrictInteger<T: FixedWidthInteger>(_ text: String) -> T? {
    return text.withCString { ptr -> T? in
        let len = strlen(ptr)
        guard len > 0 else { return nil }
        var end: UnsafeMutablePointer<CChar>?
        errno = 0
        if T.isSigned {
            let v = strtoll(ptr, &end, 0)
            guard errno == 0, let e = end, ptr.distance(to: UnsafePointer(e)) == Int(len)
            else { return nil }
            return T(exactly: v)
        }
        if ptr.pointee == 0x2D /* '-' */ { return nil }
        let v = strtoull(ptr, &end, 0)
        guard errno == 0, let e = end, ptr.distance(to: UnsafePointer(e)) == Int(len)
        else { return nil }
        return T(exactly: v)
    }
}
Resolves #19.
This PR updates decoder logic to read numbers as raw text so `Decimal` and large integers decode with full precision instead of going through `Double`.