Fix cross-platform test-suite failures on macOS/Linux#106
Merged
ggratte merged 4 commits intoItalyToast:masterfrom Apr 26, 2026
Merged
Fix cross-platform test-suite failures on macOS/Linux#106ggratte merged 4 commits intoItalyToast:masterfrom
ggratte merged 4 commits intoItalyToast:masterfrom
Conversation
The old Split(new[] { "\r\n" }, ...) produced a single unsplit blob
on macOS/Linux checkouts where git normalizes line endings to LF,
so blind- and showdown-action tests fed the parser one giant line
and threw. Splitting on the individual '\r' and '\n' characters (as
HandHistoryParserFastImpl already does) handles both CRLF and LF
checkouts identically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test harness previously read every sample file with a hard-coded Encoding.UTF8, so hands captured by the Winamax and MicroGaming clients (Windows-only, cp1252) had byte 0x80 silently replaced with U+FFFD and ~20 tests failed on non-Windows hosts. Prefer strict UTF-8 (catches BOM and correctly-encoded UTF-8 files, including the two UTF-8-no-BOM PokerStars/Upoker samples with yuan and other non-ASCII chars), fall back to Windows-1252 on invalid UTF-8. Fixtures remain byte-for-byte copies of real client output, and new cp1252 captures can be dropped in without silently mangling. System.Text.Encoding.CodePages is required because .NET Core stripped Windows-1252 from the default set of built-in encodings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The source file was written in Windows-1252: three currency-symbol literals (EURO, GBP, CNY) were stored as raw high bytes 0x80, 0xA3 and 0xA5 without a BOM. Roslyn on Windows happens to fall back to the system code page (cp1252) and compiles them correctly; on macOS and Linux the compiler treats the file as UTF-8, hits invalid start bytes, and replaces each one with U+FFFD. The resulting assembly emits "�" everywhere it should emit "€", "£" or "¥", which breaks every limit / currency-symbol test on non-Windows hosts. Decode the existing bytes as cp1252 and re-save as UTF-8 with BOM so every toolchain reads the same three code points regardless of the host code page. No semantic change: U+20AC, U+00A3 and U+00A5 are the characters the tests and downstream consumers already expect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HandHistoryParserJSONImpl.SplitUpMultipleHands split on the literal "\r\n\r\n", which means an LF-only multi-hand JSON file (produced on macOS/Linux, or by a tool that doesn't emit CRLF) was treated as one giant hand and failed to parse. Relax the pattern to "\r?\n\r?\n" so both CRLF and LF separators work, matching what HandHistoryParserFastImpl already does. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cf3bf8e to
ec99491
Compare
Owner
|
lgtm |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Four fixes that together let the .NET test suite build and run cleanly on macOS/Linux without touching sample-file contents. On Windows (CRLF checkout, system code page 1252) behavior is unchanged.
680195b—ThreeStateParserTestsline-ending split.GetTestsplit on the literal"\r\n", so on LF checkouts the blind- and showdown-action tests fed the parser one unsplittable blob. Now splits on the'\r'and'\n'characters, matching whatHandHistoryParserFastImplalready does.1e8a08b— Encoding-aware sample-file reader.SampleHandHistoryRepositoryFileBasedImplpreviously hard-codedEncoding.UTF8, so hand-history files produced by the Windows-only Winamax and MicroGaming clients (cp1252) were silently mangled. Prefer strict UTF-8, fall back to Windows-1252 on invalid-UTF-8 bytes. Fixtures stay byte-for-byte copies of real client output; future cp1252 captures no longer need manual conversion. AddsSystem.Text.Encoding.CodePagesbecause .NET Core dropped Windows-1252 from the default encoding set.76fd549— Re-encodeHandHistories.Objects/GameDescription/Limit.csas UTF-8 with BOM. The source file held three currency symbols as raw Windows-1252 bytes (0x80€,0xA3£,0xA5¥) with no BOM. Roslyn on Windows falls back to the system code page and compiles them correctly; on macOS/Linux the compiler treats the file as UTF-8, hits invalid start bytes, and replaces each withU+FFFD. The shipped assembly emits�everywhere it should emit€/£/¥, which is the actual root cause of the limit / currency-symbol test failures on non-Windows hosts. No semantic change — the encoded code points (U+20AC, U+00A3, U+00A5) are what the tests and downstream consumers already expect.ec99491— JSON hand splitter LF tolerance.HandHistoryParserJSONImpl.SplitUpMultipleHandssplit on the literal"\r\n\r\n", so an LF-only multi-hand JSON dump (produced on macOS/Linux, or by any tool that doesn't emit CRLF) was treated as one giant hand. Relaxed to"\r?\n\r?\n", matching the fast-parser base class.Test plan
dotnet test HandHistories.Parser.UnitTestson macOS (LF checkout, net8.0): Passed: 1021, Failed: 0, Skipped: 386.Limit.cscorrectly via the cp1252 fallback, so re-encoding to UTF-8 with BOM is a no-op for that toolchain.🤖 Generated with Claude Code