Summary
split_graphemes() enters an infinite loop when the input string begins with a zero-width character (e.g. null byte \x00).
Minimal reproduction
from rich.cells import split_graphemes
split_graphemes("\x00") # hangs forever
The same hang occurs in any Textual app that renders such text:
from textual.app import App, ComposeResult
from textual.widgets import Static
from rich.text import Text
class TestApp(App):
    def compose(self) -> ComposeResult:
        yield Static(Text("\x00" + " " * 200))
TestApp().run() # hangs
Non-leading zero-width characters work fine:
split_graphemes("a\x00") # works
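To check for the hang without freezing a shell or test run, the call can be executed in a child interpreter with a timeout. This is only a sketch: the helper name hangs and the 5-second default are my own choices and are not part of Rich or Textual.

```python
import subprocess
import sys


def hangs(snippet: str, timeout: float = 5.0) -> bool:
    """Run `snippet` in a child Python process.

    Returns True if the child is still running after `timeout` seconds
    (subprocess.run kills it in that case), False if it exits in time.
    """
    try:
        subprocess.run(
            [sys.executable, "-c", snippet],
            timeout=timeout,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

For example, hangs('from rich.cells import split_graphemes; split_graphemes("\\x00")') should return True on an affected version. The backslash is doubled so that the child interpreter, not the parent, turns the escape into a null byte. A False result can also mean the child exited early for another reason, such as rich not being installed.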
Cause
In split_graphemes (rich/cells.py), the main loop handles each character in one of two branches:
if character_width := get_character_cell_size(character, unicode_version):
    # width > 0: create a new span
    spans.append((index, index := index + 1, character_width))
    ...
elif spans:
    # width == 0 AND spans is non-empty: merge into previous span
    start, _end, cell_length = spans[-1]
    spans[-1] = (start, index := index + 1, cell_length)
When a zero-width character appears before any non-zero-width character:
- character_width is 0 → the if branch is skipped
- spans is empty → the elif branch is skipped
- index is never incremented
- The loop processes the same character forever
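The stall described above can be seen in a simplified, standalone model of the loop. This is not the Rich source: toy_width is a hypothetical stand-in for get_character_cell_size that treats only the null byte as zero-width, and max_steps is a guard I added so the model returns instead of spinning.

```python
def toy_width(character: str) -> int:
    # Hypothetical stand-in for get_character_cell_size:
    # the null byte is zero-width, everything else is one cell wide.
    return 0 if character == "\x00" else 1


def buggy_loop(text: str, max_steps: int = 1000):
    """Simplified model of the two-branch loop. Returns (spans, stalled)."""
    spans = []
    index = 0
    steps = 0
    while index < len(text):
        steps += 1
        if steps > max_steps:
            # The guard fired: index stopped advancing.
            return spans, True
        character = text[index]
        if character_width := toy_width(character):
            # width > 0: create a new span and advance
            spans.append((index, index := index + 1, character_width))
        elif spans:
            # width == 0 and a previous span exists: extend it and advance
            start, _end, cell_length = spans[-1]
            spans[-1] = (start, index := index + 1, cell_length)
        # No else branch: width == 0 with empty spans leaves index unchanged.
    return spans, False
```

In this model, buggy_loop("\x00") reports a stall, while buggy_loop("a\x00") finishes with the null byte merged into the span for "a".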
Suggested fix
Handle leading zero-width characters by advancing index even when spans is empty:
if character_width := get_character_cell_size(character, unicode_version):
    last_measured_character = character
    spans.append((index, index := index + 1, character_width))
    total_width += character_width
elif spans:
    start, _end, cell_length = spans[-1]
    spans[-1] = (start, index := index + 1, cell_length)
else:
    index += 1 # skip leading zero-width characters
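A standalone sketch of the loop with the extra else branch shows that it terminates on leading zero-width input. As before, toy_width is a hypothetical stand-in for get_character_cell_size, and fixed_loop is a simplified model that omits the rest of split_graphemes.

```python
def toy_width(character: str) -> int:
    # Hypothetical stand-in for get_character_cell_size:
    # the null byte is zero-width, everything else is one cell wide.
    return 0 if character == "\x00" else 1


def fixed_loop(text: str):
    """Simplified model of the loop with the suggested else branch.

    Returns (spans, total_width).
    """
    spans = []
    index = 0
    total_width = 0
    while index < len(text):
        character = text[index]
        if character_width := toy_width(character):
            # width > 0: create a new span and advance
            spans.append((index, index := index + 1, character_width))
            total_width += character_width
        elif spans:
            # width == 0 and a previous span exists: extend it and advance
            start, _end, cell_length = spans[-1]
            spans[-1] = (start, index := index + 1, cell_length)
        else:
            # width == 0 and no span yet: skip the character so index advances
            index += 1
    return spans, total_width
```

With this branch, fixed_loop("\x00") returns an empty span list and fixed_loop("\x00a") returns a single span starting at index 1. One consequence of skipping is that leading zero-width characters are not covered by any span. Whether callers rely on spans covering the whole string is something the maintainers would need to confirm.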
Environment
- Python 3.12
- macOS (Apple Silicon)