This repository was archived by the owner on Feb 13, 2026. It is now read-only.

Non-breaking space character (\u00A0) causes AssertionError #115

@lacymorrow

Here is the problem string: Chatbot\u00a0\u2013

Traceback (most recent call last):
  File "<console>", line 5, in <module>
  File "/usr/local/lib/python3.6/site-packages/budou/parser.py", line 78, in parse
    chunks = self.segmenter.segment(source, language)
  File "/usr/local/lib/python3.6/site-packages/budou/tinysegmentersegmenter.py", line 94, in segment
    assert source[seek] == ' '
AssertionError

assert source[seek] == ' '
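
A possible workaround, sketched under an assumption: the assertion seems to expect every character skipped between tokens to be a plain ASCII space, so a non-breaking space (\u00a0) trips it. If that reading is right, replacing \u00a0 with a regular space before calling parse should avoid the error. The helper name replace_nbsp is hypothetical and not part of budou.

```python
def replace_nbsp(text):
    """Replace non-breaking spaces (U+00A0) with ASCII spaces.

    Hypothetical helper, not part of budou. It assumes the segmenter
    only tolerates U+0020 between tokens, as the failing assertion suggests.
    """
    return text.replace("\u00a0", " ")


# The problem string from this issue.
source = "Chatbot\u00a0\u2013"
cleaned = replace_nbsp(source)

# `cleaned` can then be passed to the parser's parse() call from the
# traceback in place of the original string.
```

This changes the output text (the non-breaking space is lost), so it is a stopgap rather than a fix in the segmenter itself.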
