Bug: UnboundLocalError crash in RAG fallback path + streaming endpoint yields None to clients #243
Description
While reviewing the backend source code, I found 4 bugs, 2 of which can cause runtime crashes or corrupt client output. These affect both the API server and the CLI chatbot.
Bug 1: UnboundLocalError crash in retriever_rag.py
File: backend/src/agents/retriever_rag.py
When inbuilt_tool_calling is False and the LLM response does not contain the key "tool_names", the variable tool_calls is never assigned, but is still referenced on line 177:
```python
# Lines 162-177
if "tool_names" in str(response):
    tool_calls = response.get("tool_names", [])  # ← only assigned inside this if-block
    for tool in tool_calls:
        if tool not in self.tool_names:
            logging.warning(f"Tool {tool} not found in tool list.")
            tool_calls.remove(tool)
else:
    logging.warning(str(response))
    logging.warning("Tool selection failed. Returning empty tool list.")
    # ← tool_calls is NOT assigned here!

if "rephrased_question" in str(response):
    state["messages"][-1].content = response.get("rephrased_question")
else:
    logging.warning("Rephrased question not found in response.")

return {"tools": tool_calls}  # ← 💥 UnboundLocalError if "tool_names" was not in response
```
Impact: Any user query that takes the non-inbuilt-tool-calling path, where the LLM returns a response without `"tool_names"`, crashes the entire request with `UnboundLocalError: local variable 'tool_calls' referenced before assignment`.
Fix:
```python
# Initialize before the if/else block:
tool_calls: list = []
if "tool_names" in str(response):
    tool_calls = response.get("tool_names", [])
...
```
Additionally, the `tool_calls.remove(tool)` on line 167 mutates the list while iterating over it, which can skip elements. This should use a list-comprehension filter instead.
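The remove-while-iterating pitfall is easy to demonstrate in isolation (a standalone sketch with made-up tool names, not the project's code):

```python
# Removing items from a list while iterating over it skips elements:
# after index i is removed, iteration moves on to the old index i+1,
# so the element that shifted into position i is never examined.
known_tools = {"search", "calculator"}
tool_calls = ["bad1", "bad2", "search"]

for tool in tool_calls:  # buggy: mutates the list being iterated
    if tool not in known_tools:
        tool_calls.remove(tool)

print(tool_calls)  # ['bad2', 'search'] -- 'bad2' was skipped, not removed

# Safe version: build a new list instead of mutating in place.
tool_calls = ["bad1", "bad2", "search"]
tool_calls = [t for t in tool_calls if t in known_tools]
print(tool_calls)  # ['search']
```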
Bug 2: Streaming endpoint yields None to clients
File: backend/src/api/routers/conversations.py
```python
if chunk == "on_chat_model_stream" and current_llm_call_count == 2:
    ...
    if isinstance(message_content, AIMessageChunk):
        msg = message_content.content
    else:
        msg = None  # ← msg is set to None
    if msg:
        chunks.append(str(msg))
    yield str(msg) + "\n\n"  # ← yields "None\n\n" to the client!
```
The `yield` on line 408 sits outside the `if msg:` guard, so when `msg` is `None` (non-`AIMessageChunk` events), the string `"None\n\n"` is streamed to the frontend.
Impact: Users see the literal text `None` randomly interspersed in streamed responses.
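The effect can be reproduced with a minimal generator sketch (a hypothetical event list stands in for the LangChain stream; not the actual endpoint code):

```python
def stream_buggy(events):
    """Mimics the buggy endpoint: yield sits outside the `if msg:` guard."""
    for content in events:
        msg = content if isinstance(content, str) else None
        if msg:
            pass  # chunks.append(str(msg)) in the real code
        yield str(msg) + "\n\n"  # str(None) == "None" leaks to the client

def stream_fixed(events):
    """Yield moved inside the guard: non-text events produce nothing."""
    for content in events:
        msg = content if isinstance(content, str) else None
        if msg:
            yield str(msg) + "\n\n"

events = ["Hello", 42, "world"]  # 42 stands in for a non-AIMessageChunk event
print(list(stream_buggy(events)))  # ['Hello\n\n', 'None\n\n', 'world\n\n']
print(list(stream_fixed(events)))  # ['Hello\n\n', 'world\n\n']
```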
Fix:
```python
if msg:
    chunks.append(str(msg))
    yield str(msg) + "\n\n"  # ← move yield inside the guard
```
Bug 3: str.replace() result discarded, tool descriptions never cleaned
File: backend/src/agents/retriever_rag.py
```python
text_desc = render_text_description([tool])
text_desc.replace("(query: str) -> Tuple[str, list[str], list[str]]", " ")  # ← result not saved!
self.tool_descriptions += text_desc + "\n\n"
```
Python strings are immutable: `str.replace()` returns a new string, and here the result is never assigned back. The original `text_desc` (with the noisy type signature) is used as-is, making the tool descriptions sent to the LLM unnecessarily verbose.
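Because the discard is silent, the bug is easy to miss; a short sketch with a made-up description string:

```python
sig = "(query: str) -> Tuple[str, list[str], list[str]]"
text_desc = "my_tool" + sig + ": searches the index"

text_desc.replace(sig, " ")  # returns a cleaned copy, which is discarded
assert sig in text_desc      # the original string is unchanged

text_desc = text_desc.replace(sig, " ")  # assign the result back
assert sig not in text_desc
```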
Fix:
```python
text_desc = text_desc.replace("(query: str) -> Tuple[str, list[str], list[str]]", " ")
```
Bug 4: In-memory chat_history dict grows unboundedly (memory leak)
File: backend/src/api/routers/conversations.py
```python
chat_history: dict[UUID, list[dict[str, str]]] = {}
```
When the database is disabled (`USE_DB=false`), every new conversation UUID is added to this global dict but never evicted. In long-running deployments this causes unbounded memory growth.
Suggestion: cap the number of in-memory conversations, e.g. with a `collections.OrderedDict` that evicts its oldest entry past a maximum size, or a ready-made LRU container such as `cachetools.LRUCache`. (`functools.lru_cache` caches function results rather than acting as a mutable mapping, so it is a poorer fit here.)
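A minimal sketch of the `OrderedDict` approach (`MAX_CONVERSATIONS` and the `remember` helper are assumptions for illustration; the real integration would live in `conversations.py`):

```python
from collections import OrderedDict
from uuid import UUID, uuid4

MAX_CONVERSATIONS = 1000  # assumed cap; tune for the deployment's memory budget

chat_history: OrderedDict[UUID, list[dict[str, str]]] = OrderedDict()

def remember(conversation_id: UUID, message: dict[str, str]) -> None:
    """Append a message, evicting the least-recently-used conversation at the cap."""
    if conversation_id in chat_history:
        chat_history.move_to_end(conversation_id)  # mark as most recently used
    chat_history.setdefault(conversation_id, []).append(message)
    while len(chat_history) > MAX_CONVERSATIONS:
        chat_history.popitem(last=False)  # drop the oldest conversation

# Filling past the cap keeps memory bounded:
for _ in range(MAX_CONVERSATIONS + 50):
    remember(uuid4(), {"role": "user", "content": "hi"})
assert len(chat_history) == MAX_CONVERSATIONS
```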
I am happy to open a PR fixing all four bugs if this issue is confirmed. The fixes are straightforward and can land in a single PR with corresponding test updates. Please assign this issue to me. @luarss