fix: prevent crashes in jieba text_exists and rate_limit flush_cache #37548

Open
ifer47 wants to merge 1 commit into langgenius:main from ifer47:fix/keyword-table-empty-dict-crash-and-rate-limit-race

ifer47 commented Jun 16, 2026

Summary

  • Jieba keyword table crash: text_exists() crashes with TypeError when _get_dataset_keyword_table() returns an empty dict {}, because set.union(*{}.values()) requires at least one argument. The method only guarded against None but _get_dataset_keyword_table returns {} at line 184. Fixed by changing the guard from keyword_table is None to not keyword_table.
  • Rate limiting TOCTOU race: flush_cache() had a race condition between redis_client.exists() and redis_client.get(). If the Redis key expires between the two calls, get() returns None and calling .decode("utf-8") on it raises AttributeError. Fixed by replacing the exists() + get() pattern with a single get() call and a None check.

Reproduction

Bug 1 - set.union on empty dict:

>>> set.union(*{}.values())
TypeError: unbound method set.union() needs an argument

This is reachable when _get_dataset_keyword_table() returns {}, which happens at line 184 when no keyword table data exists.
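
The guard change can be illustrated with a minimal standalone sketch. The function name `text_exists_in_table` and its parameters are hypothetical stand-ins for illustration, not the actual method in the repository; the keyword table is assumed to map each keyword to a set of node IDs.

```python
def text_exists_in_table(keyword_table, node_id):
    """Hypothetical stand-in for text_exists(), taking the table as an argument.

    Assumes keyword_table maps keyword -> set of node IDs, or is None/{}.
    """
    # `not keyword_table` is True for both None and an empty dict, so the
    # zero-argument set.union() call below can never be reached.
    if not keyword_table:
        return False
    # With at least one value, unpacking always passes one or more sets.
    return node_id in set.union(*keyword_table.values())
```

With the old `keyword_table is None` guard, passing `{}` would fall through to `set.union()` with no arguments and raise the `TypeError` shown above.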

Bug 2 — Redis key expires between exists() and get():

# Key exists during exists() check, but expires before get()
cached = redis_client.get(key)  # Returns None
int(cached.decode("utf-8"))     # AttributeError: 'NoneType' object has no attribute 'decode'
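
A rough sketch of the single-`get()` pattern described in the summary follows. The helper `read_max_requests`, its `fallback` parameter, and the `FakeRedis` class are hypothetical names introduced only for this example; the sketch assumes the client returns `bytes` for a present key and `None` for a missing or expired one.

```python
def read_max_requests(redis_client, key, fallback):
    """Hypothetical helper: read an integer from Redis with one call."""
    # One get() replaces exists() + get(), so there is no window in which
    # the key can expire between a check and the read.
    cached = redis_client.get(key)
    if cached is None:
        # Key missing or expired: keep the local value instead of crashing.
        return fallback
    return int(cached.decode("utf-8"))


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (get() only)."""

    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        # Returns the stored bytes, or None when the key is absent.
        return self._data.get(key)
```

For example, `read_max_requests(FakeRedis({}), "k", 3)` takes the fallback branch rather than calling `.decode()` on `None`.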

Test plan

  • Added test for text_exists() with empty dict {} — verifies it returns False instead of crashing
  • Added test for flush_cache() when Redis key disappears — verifies it falls back to local value instead of crashing
  • Updated existing test_should_sync_max_requests_from_redis_on_subsequent_flush to remove stale exists.return_value mock (no longer needed after fix)

🤖 Generated with Claude Code Best

1. Jieba keyword table: `text_exists()` crashes with TypeError when
   `_get_dataset_keyword_table()` returns an empty dict `{}` because
   `set.union(*{}.values())` requires at least one argument. The method
   only guarded against `None` but `_get_dataset_keyword_table` also
   returns `{}` at line 184. Fixed by changing the guard from
   `keyword_table is None` to `not keyword_table`.

2. Rate limiting: `flush_cache()` has a TOCTOU race condition between
   `redis_client.exists()` and `redis_client.get()`. If the Redis key
   expires between the two calls, `get()` returns `None` and calling
   `.decode("utf-8")` on it raises `AttributeError`. Fixed by replacing
   the `exists()` + `get()` pattern with a single `get()` call and a
   `None` check on the result.

Co-Authored-By: zhipu/glm-5 <zai-org@claude-code-best.win>
dosubot (bot) added the size:XS label (this PR changes 0-9 lines, ignoring generated files) Jun 16, 2026