Fix Unicode tab-completion hang for multibyte characters (closes #2) #13
Closed
Kunal-Darekar wants to merge 1 commit into mgubi:main from
- Snap TeXmacs byte cursor to a valid character boundary using thisind() before calling completions(), preventing an infinite hang on multibyte Unicode symbols like ρ (2 bytes in UTF-8)
- Replace broken byte-arithmetic slicing (range.stop+2-range.start) with ncodeunits(prefix) for correct multibyte Unicode handling
- Add early return when completions() returns no results
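The boundary snap named in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `snap_cursor` is a hypothetical helper name, and the clamping step is an added assumption so that out-of-range offsets cannot throw.

```julia
# Hypothetical helper (not from the PR): snap a raw byte offset to a valid
# string index. Clamping to 0:ncodeunits(s) is an assumption added here.
snap_cursor(s::AbstractString, byte_offset::Integer) =
    thisind(s, clamp(byte_offset, 0, ncodeunits(s)))

s = "ρ"                  # U+03C1, two bytes in UTF-8
ncodeunits(s)            # 2
isvalid(s, 2)            # false: byte 2 is a continuation byte
snap_cursor(s, 2)        # 1: start of the character that contains byte 2
snap_cursor("sin", 3)    # 3: ASCII offsets pass through unchanged
```

`thisind` returns the index of the first byte of the character that contains the given byte, so an offset that already sits on a boundary is left alone.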
Problem
Pressing Tab to autocomplete a Unicode variable like `ρ` hangs the session permanently (shows "busy", no further cells can be evaluated).

Root cause: TeXmacs sends the cursor as a byte offset. For `ρ` (U+03C1), that offset is `2`, which lands in the middle of the two-byte UTF-8 sequence rather than on a valid character boundary. Passing that raw offset to `completions()` causes Julia's internal word-boundary scan to loop indefinitely (older Julia) or throw a `StringIndexError` (Julia 1.11+).

A second bug existed in the completion suffix slicing: `range.stop+2-range.start` is byte arithmetic that gives the wrong slice index whenever the already-typed prefix itself contains multibyte characters.

Fix
- Use `thisind(str, cursor)` to snap the incoming byte offset to the nearest valid character boundary before calling `completions()`
- Replace the `range.stop+2-range.start` slice arithmetic with `ncodeunits(prefix)+1`, which is correct for any Unicode prefix

Verified
Added `test/test_issue2_unicode_completion.jl`, with 8 tests run against Julia 1.11.3:
- `isvalid("ρ", 2) == false` (the invalid boundary that caused the hang)
- `thisind` snaps it correctly to byte `1`
- `ρ`, `σ_val`, `αβ_t`
- (`""` instead of correct suffix)
- (`Base.si` → `sin`, `sign`, etc.)
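The suffix slicing based on `ncodeunits(prefix)+1` might look something like the sketch below. `completion_suffix` and the candidate names `ρ_max` and `αβ_total` are hypothetical and chosen only for illustration; the fallback for a non-matching prefix is an assumption, not behaviour taken from the PR.

```julia
# Hypothetical helper (not from the PR): return the part of `completion`
# that follows the already-typed `prefix`.
function completion_suffix(completion::AbstractString, prefix::AbstractString)
    # Assumption: if the candidate does not start with the prefix,
    # return it whole rather than slicing at an unrelated byte.
    startswith(completion, prefix) || return completion
    # ncodeunits counts bytes, so the next byte is always a valid boundary.
    return completion[ncodeunits(prefix)+1:end]
end

completion_suffix("ρ_max", "ρ")         # "_max"
completion_suffix("αβ_total", "αβ_t")   # "otal"
completion_suffix("sin", "si")          # "n"

# For contrast, a character count is not a byte index:
# "ρ_max"[length("ρ")+1:end] indexes byte 2 and throws a StringIndexError.
```

Because Julia `String` indices are byte indices, `ncodeunits(prefix)+1` always lands on the first byte after the prefix, whatever mix of ASCII and multibyte characters the prefix contains.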