[pull] master from BYVoid:master #152

Merged
pull[bot] merged 3 commits into backup999:master from BYVoid:master
Jun 11, 2026

@pull pull Bot commented Jun 11, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

frankslin and others added 3 commits June 10, 2026 12:04
…inline format (#1300)

This script merges configuration files and text dictionaries on the fly. The output is formatted as JSONC (with header comments for version, compile time, and source info), which differs slightly from the original pure JSON files but is fully supported by OpenCC.

The output of this script is not meant to replace the config format; rather, it provides a way to compare changes to dictionary files.
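
The merge step described above could look roughly like the following. This is only a sketch under stated assumptions, not the script from the commit: the function names, the `"inline"` type tag, the `"entries"` key, and the exact header wording are all hypothetical. The only format detail taken as given is that a text dictionary line is a key, a tab, and space-separated values.

```python
import json
from datetime import datetime, timezone


def parse_text_dict(text):
    """Parse text dictionary lines of the form: key<TAB>value1 value2 ..."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, values = line.partition("\t")
        entries[key] = values.split(" ")
    return entries


def inline_config(config, dict_texts, version="unknown", compiled=None):
    """Return JSONC text: header comments plus the config with text dictionaries inlined.

    `dict_texts` maps a dictionary file name to its raw text content.
    The "inline"/"entries" output shape is a hypothetical layout for this sketch.
    """
    def inline(node):
        if isinstance(node, dict):
            if node.get("type") == "text" and node.get("file") in dict_texts:
                return {
                    "type": "inline",  # hypothetical tag, not a confirmed OpenCC type
                    "entries": parse_text_dict(dict_texts[node["file"]]),
                }
            return {key: inline(value) for key, value in node.items()}
        if isinstance(node, list):
            return [inline(value) for value in node]
        return node

    compiled = compiled or datetime.now(timezone.utc).isoformat()
    header = [
        f"// version: {version}",
        f"// compiled: {compiled}",
        f"// sources: {', '.join(sorted(dict_texts))}",
    ]
    body = json.dumps(inline(config), ensure_ascii=False, indent=2)
    return "\n".join(header) + "\n" + body
```

Because every dictionary ends up in one text file with stable indentation, two builds can be compared with an ordinary line diff, which matches the stated purpose of the output.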
…oss all test suites (#1301)

As the test suite grows, test case IDs alone are often insufficient to convey the intent or context behind individual cases. This change adds JSONC support across all test suite parsers (C++, Node.js, and Python) so that contributors can annotate test cases inline — for example, explaining why a particular input/output pair exists, or flagging non-obvious edge cases.

Trailing comma support is included as a minor convenience: when appending a new entry to the cases array, the last existing entry does not need to be modified just to add a comma, keeping diffs minimal and focused.

This is a non-breaking, infrastructure-only change. No existing test case data is modified.
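
For illustration, a JSONC reader of the kind described in this commit might be sketched in Python as below. It is not the parser added by the commit; the helper names and the `id` / `input` / `expected` fields in the usage example are assumptions. The sketch strips `//` and `/* */` comments and trailing commas while leaving string contents untouched, then hands the result to the standard `json` module.

```python
import json


def _strip_comments(text):
    """Drop // line comments and /* */ block comments outside string literals."""
    out, i, n, in_string = [], 0, len(text), False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:  # copy the escaped character verbatim
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            i = text.find("\n", i)  # jump to the newline, which is kept
            if i == -1:
                break
            continue
        elif text.startswith("/*", i):
            end = text.find("*/", i + 2)
            if end == -1:
                break
            i = end + 2
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def _strip_trailing_commas(text):
    """Drop a comma when only whitespace separates it from a closing ] or }."""
    out, i, n, in_string = [], 0, len(text), False
    while i < n:
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < n:
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ",":
            j = i + 1
            while j < n and text[j].isspace():
                j += 1
            if not (j < n and text[j] in "]}"):
                out.append(ch)
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def loads_jsonc(text):
    """Parse JSONC by removing comments and trailing commas, then using json.loads."""
    return json.loads(_strip_trailing_commas(_strip_comments(text)))
```

A test file could then carry inline notes and a trailing comma after the last case:

```python
doc = """{
  "cases": [
    // explains why this pair exists
    {"id": "case_001", "input": "乙肝", "expected": "B肝"},
  ]
}"""
cases = loads_jsonc(doc)["cases"]
```

Comments are removed before trailing commas so that a comma followed by a comment and then a closing bracket is still recognized as trailing.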
* Add Taiwan medical phrase conversions (s2twp / tw2sp)

Add cross-strait medical vocabulary differences that character-level
conversion cannot handle, so s2twp lands on Taiwan usage and tw2sp converts
back. Covers blood tests, hepatitis A-E, neurology/psychiatry, cardiology,
imaging, drugs, and syndrome terms.

- TWPhrases.txt: 70 forward entries (keys are post-s2t standard traditional
  forms, e.g. 白細胞->白血球, 乙肝->B肝, 阻滯劑->阻斷劑, 他汀類->史他汀類,
  代謝綜合徵->代謝症候群)
- TWPhrasesRev.txt: 61 reverse entries
- STPhrases.txt: 19 whole-word segmentation entries so compound terms
  (综合征-suffixed syndromes, 计算机断层) survive segmentation before the
  Taiwan vocabulary stage
- testcases.json: 10 consolidated s2twp / tw2sp cases

Conventions:
- Abbreviation<->abbreviation, full<->full (乙肝<->B肝, 乙型肝炎<->B型肝炎).
- tw2sp keeps the full form rather than emitting an abbreviation (心房顫動
  stays, not 房顫) via self-mappings; 心肌梗塞 reverses to the common mainland
  心肌梗死.
- Ultrasound: 超聲波/B超 -> 超音波; tw2sp 超音波 -> 超声波 (the general term),
  avoiding over-conversion of 超音波清洗機 etc.
- Multiple Taiwan variants are accepted on tw2sp where both are in use
  (妥瑞氏症/妥瑞症, 馬凡氏症候群/馬凡氏症, 阿莫西林/安莫西林).
- Pharmacology terms are scoped to the category level (阻滯劑, 他汀類) to avoid
  corrupting individual drug names.

Reverse mapping and Taiwan phrase segmentation invariants verified.
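
To illustrate why whole-phrase entries matter, here is a minimal sketch of greedy forward longest-match replacement using a few of the forward pairs quoted above. It is a simplified stand-in, not OpenCC's actual segmentation or conversion code, and the names `TW_PHRASES` and `convert` are hypothetical.

```python
# Forward pairs quoted in the commit message (post-s2t form -> Taiwan usage).
TW_PHRASES = {
    "白細胞": "白血球",
    "乙肝": "B肝",
    "阻滯劑": "阻斷劑",
    "他汀類": "史他汀類",
    "代謝綜合徵": "代謝症候群",
}


def convert(text, phrases):
    """Replace the longest dictionary phrase at each position; copy other characters."""
    max_len = max(map(len, phrases), default=1)
    out, i = [], 0
    while i < len(text):
        # Try the longest candidate first so compound terms win over their parts.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in phrases:
                out.append(phrases[chunk])
                i += size
                break
        else:
            out.append(text[i])  # no phrase starts here; keep the character as-is
            i += 1
    return "".join(out)


print(convert("白細胞和乙肝", TW_PHRASES))  # prints 白血球和B肝
```

Matching the longest phrase first is what lets a compound such as 代謝綜合徵 be mapped as a unit instead of being split into shorter pieces that have no entry of their own.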
@pull pull Bot locked and limited conversation to collaborators Jun 11, 2026
@pull pull Bot added the ⤵️ pull label Jun 11, 2026
@pull pull Bot merged commit da76315 into backup999:master Jun 11, 2026