Fix table alignment with Unicode characters by adding proper width ca…#3686
Closed
shyam-ramani wants to merge 1 commit into Textualize:master from shyam-ramani:fix/unicode-table-alignment
Member
Rich already handles double cell characters. Some emoji are never going to work because terminals render them at different widths, no matter what the unicodedata says. If you are using an LLM to write this code, you should know that they tend to produce garbage.
Fix Table Alignment with Unicode Characters
Issue Description
Currently, the Rich library's table rendering doesn't properly handle the visual width of Unicode characters, causing misalignment in tables containing mixed content (ASCII, Unicode, emojis, etc.). This is particularly noticeable when displaying:
Solution
Implemented a new get_unicode_width function in rich/text.py that calculates the visual width of Unicode characters based on their properties. The function:
- Uses unicodedata.east_asian_width() to determine character width
Implementation Details
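The PR's own listing is not reproduced in this description. As a rough illustration of the approach named above, a width estimate built on unicodedata.east_asian_width() might look like the sketch below. The name display_width_sketch and the zero-width handling are assumptions made for illustration, not the PR's code or Rich's API. Per the maintainer's comment, Rich already measures double-width cells itself (rich.cells.cell_len), and no lookup of this kind can predict how a given terminal draws every emoji.

```python
import unicodedata


def display_width_sketch(text: str) -> int:
    """Estimate terminal columns for text (hypothetical helper, stdlib only)."""
    width = 0
    for ch in text:
        # Assumption: combining marks, enclosing marks, and format characters
        # (e.g. variation selectors, zero-width joiner) occupy no column.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # Wide ("W") and Fullwidth ("F") characters take two columns;
        # everything else, including Ambiguous ("A"), is counted as one here.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


print(display_width_sketch("Hello"))       # 5
print(display_width_sketch("こんにちは"))  # 10
print(display_width_sketch("世界"))        # 4
```

Counting Ambiguous characters as one column is a design choice, not a fact about terminals: some East Asian locales render them at two columns, which is one reason a table can still misalign.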
Added get_unicode_width function to rich/text.py:
Modified Table.add_row in rich/table.py to use the new width calculation:
Test Cases
The fix has been tested with various Unicode content:
Result:
┏━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━┓
┃ English ┃ Japanese ┃ Emoji ┃
┡━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━┩
│ Hello │ こんにちは │ 👋 │
│ World │ 世界 │ 🌍 │
│ → Arrow │ → 矢印 │ ➡️ │
│ ★ Star │ ★ 星 │ ⭐ │
└─────────┴────────────┴───────┘
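The script that produced this table is not included here. To show why padding by display width rather than by len() matters for the first two columns, here is a small stdlib-only sketch. It does not use Rich; the helpers cell_width and pad are hypothetical names, and the emoji column is left out because, as the maintainer notes, terminals disagree on emoji widths.

```python
import unicodedata


def cell_width(text: str) -> int:
    # Hypothetical helper: Wide/Fullwidth characters count as two columns.
    return sum(2 if unicodedata.east_asian_width(c) in ("W", "F") else 1 for c in text)


def pad(text: str, columns: int) -> str:
    # Pad with spaces until the text occupies `columns` display columns.
    return text + " " * max(columns - cell_width(text), 0)


rows = [("Hello", "こんにちは"), ("World", "世界")]
lines = ["│ " + pad(a, 7) + " │ " + pad(b, 10) + " │" for a, b in rows]

for line in lines:
    print(line)
```

Both lines occupy the same number of display columns even though their len() values differ, which is the property a table renderer needs to preserve.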
Impact and Considerations
Positive Impact:
Performance:
- Uses the unicodedata module for efficient character property lookup
Backward Compatibility:
Additional Notes