Fix cross-platform test-suite failures on macOS/Linux #106

Merged
ggratte merged 4 commits into ItalyToast:master from ggratte:fix-cross-platform-tests
Apr 26, 2026

Conversation

@ggratte
Collaborator

@ggratte ggratte commented Apr 20, 2026

Summary

Four fixes that together let the .NET test suite build and run cleanly on macOS/Linux without touching sample-file contents. On Windows (CRLF checkout, system code page 1252) behavior is unchanged.

  • 680195b: ThreeStateParserTests line-ending split. GetTest split on the literal "\r\n", so on LF checkouts the blind- and showdown-action tests fed the parser one unsplittable blob. It now splits on the '\r' and '\n' characters, matching what HandHistoryParserFastImpl already does.
  • 1e8a08b: Encoding-aware sample-file reader. SampleHandHistoryRepositoryFileBasedImpl previously hard-coded Encoding.UTF8, so hand-history files produced by the Windows-only Winamax and MicroGaming clients (cp1252) were silently mangled. The reader now prefers strict UTF-8 and falls back to Windows-1252 when it hits invalid UTF-8 bytes. Fixtures stay byte-for-byte copies of real client output; future cp1252 captures no longer need manual conversion. Adds System.Text.Encoding.CodePages because .NET Core dropped Windows-1252 from the default encoding set.
  • 76fd549: Re-encode HandHistories.Objects/GameDescription/Limit.cs as UTF-8 with BOM. The source file held three currency symbols as raw Windows-1252 bytes (0x80 €, 0xA3 £, 0xA5 ¥) with no BOM. Roslyn on Windows falls back to the system code page and compiles them correctly; on macOS/Linux the compiler treats the file as UTF-8, hits invalid start bytes, and replaces each with U+FFFD. The shipped assembly then emits "�" (U+FFFD) everywhere it should emit €/£/¥, which is the actual root cause of the limit / currency-symbol test failures on non-Windows hosts. No semantic change: the encoded code points (U+20AC, U+00A3, U+00A5) are what the tests and downstream consumers already expect.
  • ec99491: JSON hand splitter LF tolerance. HandHistoryParserJSONImpl.SplitUpMultipleHands split on the literal "\r\n\r\n", so an LF-only multi-hand JSON dump (produced on macOS/Linux, or by any tool that doesn't emit CRLF) was treated as one giant hand. Relaxed to "\r?\n\r?\n", matching the fast-parser base class.

Test plan

  • dotnet test HandHistories.Parser.UnitTests on macOS (LF checkout, net8.0): Passed: 1021, Failed: 0, Skipped: 386.
  • Run the same on Windows (CRLF checkout) — expect no regressions. The Windows compiler already read Limit.cs correctly via the cp1252 fallback, so re-encoding to UTF-8 with BOM is a no-op for that toolchain.
  • Spot-check that one Winamax and one MicroGaming sample still parse: the cp1252 bytes are unchanged on disk, and the harness now decodes them correctly.

🤖 Generated with Claude Code

ggratte and others added 4 commits April 20, 2026 20:21
The old Split(new[] { "\r\n" }, ...) produced a single unsplit blob
on macOS/Linux checkouts where git normalizes line endings to LF,
so blind- and showdown-action tests fed the parser one giant line
and threw. Splitting on the individual '\r' and '\n' characters (as
HandHistoryParserFastImpl already does) handles both CRLF and LF
checkouts identically.
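
The difference between the two splits can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not taken from ThreeStateParserTests; the use of StringSplitOptions.RemoveEmptyEntries is also an assumption, since the original call's second argument is elided in the message above.

```csharp
using System;

static class LineSplitSketch
{
    // Hypothetical helper; the real GetTest method may differ.
    // RemoveEmptyEntries is an assumption: it drops the empty string that
    // appears between '\r' and '\n' when a CRLF pair is split on both chars.
    public static string[] SplitLines(string text) =>
        text.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

    // The old behavior, for comparison: only a literal CRLF pair separates lines.
    public static string[] SplitOnCrLfOnly(string text) =>
        text.Split(new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
}
```

For the LF-only input "a\nb\nc", SplitOnCrLfOnly returns a single element while SplitLines returns three; for "a\r\nb\r\nc" both return three. One side effect of this approach, under the stated assumption, is that genuinely blank lines are dropped as well.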

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test harness previously read every sample file with a hard-coded
Encoding.UTF8, so hands captured by the Winamax and MicroGaming clients
(Windows-only, cp1252) had byte 0x80 silently replaced with U+FFFD and
~20 tests failed on non-Windows hosts.

Prefer strict UTF-8 (catches BOM and correctly-encoded UTF-8 files,
including the two UTF-8-no-BOM PokerStars/Upoker samples with yuan and
other non-ASCII chars), fall back to Windows-1252 on invalid UTF-8.
Fixtures remain byte-for-byte copies of real client output, and new
cp1252 captures can be dropped in without silently mangling.

System.Text.Encoding.CodePages is required because .NET Core stripped
Windows-1252 from the default set of built-in encodings.
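
A rough sketch of the strict-UTF-8-then-cp1252 fallback described above follows. It is not the code in SampleHandHistoryRepositoryFileBasedImpl; the class name, the Decode method, and the manual BOM handling are illustrative assumptions. It also assumes the CodePagesEncodingProvider type is available, either through the System.Text.Encoding.CodePages package or a framework that bundles it.

```csharp
using System;
using System.Text;

static class SampleFileReaderSketch
{
    static SampleFileReaderSketch()
    {
        // Makes Windows-1252 resolvable on .NET Core and later.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: decode raw file bytes, preferring strict UTF-8.
    public static string Decode(byte[] bytes)
    {
        // throwOnInvalidBytes: true makes malformed input raise an exception
        // instead of silently producing U+FFFD.
        var strictUtf8 = new UTF8Encoding(
            encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

        // Skip a UTF-8 BOM (EF BB BF) if present so it does not leak into the text.
        int offset = bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF ? 3 : 0;

        try
        {
            return strictUtf8.GetString(bytes, offset, bytes.Length - offset);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8: assume the file came from a cp1252 client.
            return Encoding.GetEncoding(1252).GetString(bytes);
        }
    }
}
```

With this sketch, the single byte 0x80 decodes to "€" through the fallback, while the valid UTF-8 sequence C2 A5 decodes to "¥" on the first attempt. The trade-off is that a cp1252 file whose high bytes happen to form valid UTF-8 sequences would be read as UTF-8; that is rare in practice but worth knowing.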

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The source file was written in Windows-1252: three currency-symbol
literals (EURO, GBP, CNY) were stored as raw high bytes 0x80, 0xA3 and
0xA5 without a BOM. Roslyn on Windows happens to fall back to the
system code page (cp1252) and compiles them correctly; on macOS and
Linux the compiler treats the file as UTF-8, hits invalid start bytes,
and replaces each one with U+FFFD. The resulting assembly emits "�"
everywhere it should emit "€", "£" or "¥", which breaks every limit /
currency-symbol test on non-Windows hosts.

Decode the existing bytes as cp1252 and re-save as UTF-8 with BOM so
every toolchain reads the same three code points regardless of the
host code page. No semantic change: U+20AC, U+00A3 and U+00A5 are the
characters the tests and downstream consumers already expect.
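
The byte-level effect of the re-encoding can be checked with a small sketch. This is not the procedure actually used to convert Limit.cs (the commit does not say which tool was used); the class and method names are hypothetical, and the sketch assumes CodePagesEncodingProvider is available.

```csharp
using System;
using System.Linq;
using System.Text;

static class ReencodeSketch
{
    static ReencodeSketch()
    {
        // Makes Windows-1252 resolvable on .NET Core and later.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: interpret bytes as cp1252, emit UTF-8 with a BOM.
    public static byte[] Cp1252ToUtf8WithBom(byte[] cp1252Bytes)
    {
        string text = Encoding.GetEncoding(1252).GetString(cp1252Bytes);
        var utf8WithBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);

        // GetPreamble returns EF BB BF; GetBytes alone never writes the BOM.
        return utf8WithBom.GetPreamble().Concat(utf8WithBom.GetBytes(text)).ToArray();
    }

    // What a non-throwing UTF-8 decoder does with the same raw bytes.
    public static string DecodeAsLenientUtf8(byte[] bytes) =>
        Encoding.UTF8.GetString(bytes);
}
```

Feeding the three bytes 0x80, 0xA3, 0xA5 through Cp1252ToUtf8WithBom yields EF BB BF, then E2 82 AC (U+20AC), C2 A3 (U+00A3) and C2 A5 (U+00A5). Decoding the same three raw bytes with DecodeAsLenientUtf8 gives three U+FFFD characters, which is the symptom described above.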

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HandHistoryParserJSONImpl.SplitUpMultipleHands split on the literal
"\r\n\r\n", which means an LF-only multi-hand JSON file (produced on
macOS/Linux, or by a tool that doesn't emit CRLF) was treated as one
giant hand and failed to parse. Relax the pattern to "\r?\n\r?\n" so
both CRLF and LF separators work, matching what HandHistoryParserFastImpl
already does.
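
As a sketch of the relaxed separator, assuming the split is regex-based (the word "pattern" above suggests so, but the actual body of SplitUpMultipleHands is not shown here), the behavior looks roughly like this. The class name, method name, and the filtering of empty fragments are illustrative assumptions.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class JsonHandSplitSketch
{
    // A blank line between hands, with an optional '\r' before each '\n'.
    static readonly Regex BlankLine = new Regex(@"\r?\n\r?\n");

    // Hypothetical helper; the real SplitUpMultipleHands may differ.
    public static string[] SplitHands(string dump) =>
        BlankLine.Split(dump)
                 .Where(hand => hand.Trim().Length > 0) // drop empty fragments
                 .ToArray();
}
```

Both "{\"a\":1}\r\n\r\n{\"b\":2}" and "{\"a\":1}\n\n{\"b\":2}" split into two hands with this pattern, whereas a literal "\r\n\r\n" separator leaves the LF-only form as a single string.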

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ggratte ggratte force-pushed the fix-cross-platform-tests branch from cf3bf8e to ec99491 April 20, 2026 18:45
@ItalyToast
Owner

lgtm

@ggratte ggratte merged commit 7e33136 into ItalyToast:master Apr 26, 2026
1 check passed