Skip to content

feat(chat): add RAG intent router with per-conversation attachment management#324

Open
abdulrafey1 wants to merge 12 commits into
mainfrom
feat/rag-intent-router
Open

feat(chat): add RAG intent router with per-conversation attachment management#324
abdulrafey1 wants to merge 12 commits into
mainfrom
feat/rag-intent-router

Conversation

@abdulrafey1
Copy link
Copy Markdown
Contributor

What

Automates the RAG retrieval decision in the chat plugin so teachers are no longer forced to manually re-attach Drive files on each message turn. An intent router (a small, fast LLM step) inspects the user's latest message and lightweight section-title metadata from their conversation attachments, then decides whether RAG retrieval should run for that turn (binary: run / skip). RAG now activates automatically based on message intent rather than requiring per-message inline file selection.

Changes

  • feat(chat): add RAGIntentRouter class (app/core_plugins/chat/intent_router.py) with RAGIntentRouterError; fails closed on LangChainException/ValidationError
  • feat(chat): add rag_intent_router_system_prompt.txt prompt asset
  • feat(chat): add ConversationAttachment SQLModel join table + attachments relationship on Conversation; legacy columns untouched
  • feat(chat): add RAGRoutingDecision, ConversationAttachmentCreate, ConversationAttachmentResponse schemas
  • feat(chat): add attach_drive_file, detach_drive_file, list_conversation_attachments service methods
  • feat(chat): wire intent router into chat_completion; emit scanning_attachments / skipping_rag SSE events; add POST/DELETE /conversations/{uuid}/attachments endpoints; 502 fail-closed handler
  • feat(migrations): add Alembic migration for chat_conversation_attachments join table with back-fill from legacy active_drive_file_id(s) columns
  • fix(core): add conditional pool kwargs in app/core/db.py for SQLite test compatibility
  • test(chat): 9 unit tests for intent router, 14 tests for attach/detach service + endpoints, 5 integration tests driving /api/v1/chat/completions

How to Test

  1. Apply the migration: make migrations
  2. Start the stack: make dev.up
  3. Run the full test suite: make test — expect 969 tests passing
  4. Run lint and type-check: make lint && make mypy
  5. In the chat UI, attach a Drive file to a conversation via POST /api/v1/chat/conversations/{uuid}/attachments
  6. Send a message that clearly relates to the attached document — confirm RAG retrieval fires (look for scanning_attachments SSE event followed by retrieved context in the completion)
  7. Send a message that is clearly off-topic from the document — confirm RAG is skipped (look for skipping_rag SSE event and no retrieval overhead)
  8. Remove the attachment via DELETE /api/v1/chat/conversations/{uuid}/attachments/{id} and confirm subsequent messages skip RAG
  9. Test that an intent router failure returns HTTP 502 (simulate by providing an invalid LLM key)

Notes

Migration required: Run make migrations to apply the new chat_conversation_attachments join table. The migration includes a back-fill step from the legacy active_drive_file_id/active_drive_file_ids columns; those columns remain intact so no rollback risk.

Breaking changes: None. Legacy columns are preserved.

No new environment variables required. The intent router reuses the same LLM configured for the main chat.

…nagement

Introduces an LLM-based intent router that inspects the user's message and
attached-file metadata to decide whether RAG retrieval should run for the
current turn. Previously RAG fired only when drive_file blocks were inlined
in the request; now it fires based on persisted per-conversation attachments
and a routing decision, reducing unnecessary MCP roundtrips on chit-chat.

- New chat_conversation_attachments join table (Alembic migration with
  back-fill from active_drive_file_id/s columns)
- POST/DELETE /conversations/{uuid}/attachments endpoints
- RAGIntentRouter: fail-closed on LangChainException/ValidationError → 502
- SSE status events: scanning_attachments / skipping_rag
- 28 new tests (unit + attachment + integration), all passing
@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @abdulrafey1's task in 3m 35s —— View job


PR Review: RAG Intent Router with Per-Conversation Attachment Management

  • Check for previously addressed review comments — none found
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues (6 comments posted)
  • Post final summary

Overview

This PR adds an intelligent RAG intent router that inspects user messages to decide whether document retrieval should run, removing the need for teachers to manually re-attach files each turn. It introduces a ConversationAttachment join table, service methods for attach/detach, REST endpoints, and wires the router into the chat_completion flow with new SSE events (scanning_attachments / skipping_rag).

The overall design is sound — the router is cleanly separated, the fail-closed behavior (502 on router failure) is the right call, and the dual-write to both legacy columns and the new attachment table is a safe migration strategy.

Previous Review Comments

No prior review comments found.


Critical (Blocking)

# File Issue
1 migration Back-fill uses SQLite-only SQLINSERT OR IGNORE and datetime('now') will fail on PostgreSQL in production. Also contains bare except Exception: pass, violating project exception handling rules. Must use dialect-aware SQL.

High

# File Issue
2 models.py:121 datetime.utcnow is deprecated — Returns a naive datetime, inconsistent with timezone-aware created_at/updated_at from TimestampedModel. Use datetime.now(timezone.utc).
3 routes.py:616-621 Missing from e on exception re-raiseRAGIntentRouterError handler raises HTTPException without chaining, losing the original traceback for debugging.

Medium

# File Issue
4 routes.py:472 Accessing private _create_llm() — Called 3 times from outside the provider class. Consider exposing a public interface to avoid fragile coupling.
5 intent_router.py:68-92 Sequential get_document_structure calls — Each attached file triggers a serial DB call. Use asyncio.gather for parallelism when multiple files are attached.

Low

# File Issue
6 intent_router.py:85 Dead else: pass branch — Remove or replace with meaningful fallback text.

What's Good

  • Clean separationRAGIntentRouter is a focused, testable class with well-defined error boundaries
  • Fail-closed design — Router failure surfaces as HTTP 502, never silently skips retrieval
  • Upsert-safe attachattach_drive_file handles race conditions via check-then-insert with IntegrityError fallback
  • Thorough ownership checks — Both attach and detach endpoints verify conversation and file ownership
  • Solid test coverage — 28 new tests: 9 unit (router), 7 service, 7 endpoint, 5 integration covering happy paths, error paths, idempotency, and the 502 fail-closed behavior
  • db.py SQLite compatibility fix — Conditional pool kwargs is a clean approach for test compatibility
  • SSE event designscanning_attachments and skipping_rag events give the frontend clear signals for UX feedback

Test Coverage Assessment

Coverage is strong. Minor gaps:

  • No test for router behavior with empty query string
  • No test verifying the attached_at timestamp uses the correct timezone
  • Integration tests rely heavily on mocking — consider a thinner integration test that exercises the real router with a mocked LLM but real DB queries

Summary

The architecture and intent are excellent. The one blocking issue is the SQLite-specific SQL in the migration back-fill — this must be fixed before merge or the migration will fail on any PostgreSQL environment. The high-priority items (datetime.utcnow deprecation, missing exception chaining) are quick fixes that should be addressed in this PR.

Comment thread app/core_plugins/chat/models.py Outdated
Comment thread app/core_plugins/chat/routes.py Outdated
Comment thread app/core_plugins/chat/routes.py Outdated
Comment thread app/core_plugins/chat/intent_router.py Outdated
Comment thread app/core_plugins/chat/intent_router.py Outdated
Abdul Rafey and others added 2 commits May 13, 2026 17:12
- Use dialect-aware SQL in back-fill migration (PostgreSQL vs SQLite)
- Replace bare except with specific SQLAlchemyError + warning log
- Fix deprecated datetime.utcnow → datetime.now(timezone.utc)
- Chain HTTPException from RAGIntentRouterError to preserve traceback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude Code is working…

I'll analyze this and get back to you.

View job run

Abdul Rafey and others added 2 commits May 13, 2026 17:21
Accessing _create_llm() from outside the class broke encapsulation.
Add a stable public delegate so callers in routes.py are insulated
from internal provider implementation changes.
…ate_llm

- Use asyncio.gather in RAGIntentRouter.decide() so document structure
  calls for multiple attached files run concurrently instead of serially
- Add BaseChatProvider.create_llm() public delegate so callers are
  insulated from the private _create_llm() implementation detail

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @abdulrafey1's task in 2m 24s —— View job


PR Review: RAG Intent Router (Post-Sync)

  • Check previously addressed review comments
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues (1 new comment posted)
  • Post final summary

Previous Review Comments

All 6 comments from the previous review have been addressed and resolved:

# Issue Status
1 Migration back-fill uses SQLite-only SQL ✅ Fixed — dialect-aware branches + SQLAlchemyError catch
2 datetime.utcnow deprecated ✅ Fixed — now datetime.now(timezone.utc)
3 Missing from e on exception re-raise ✅ Fixed — raise HTTPException(...) from e
4 Accessing private _create_llm() ✅ Fixed — public create_llm() added to BaseChatProvider
5 Sequential get_document_structure calls ✅ Fixed — uses asyncio.gather() with return_exceptions=True
6 Dead else: pass branch ✅ Fixed — removed; exception results handled via isinstance check

New Findings

# Priority File Issue
1 Medium migration:35 attached_at column missing timezone=True — inconsistent with created_at/updated_at, will create TIMESTAMP WITHOUT TIME ZONE on PostgreSQL while model defaults to timezone-aware datetime

What's Good (Unchanged)

  • Clean separation of concerns — RAGIntentRouter is a focused, testable class
  • Fail-closed design — router failure surfaces as HTTP 502
  • Upsert-safe attach_drive_file with IntegrityError race-condition handling
  • Ownership verification on all endpoints
  • asyncio.gather() now parallelizes file structure lookups (addressed from prior review)
  • Public create_llm() API gives a stable interface for external callers
  • Solid test suite: 28 tests covering unit, service, endpoint, and integration layers

Summary

All 6 issues from the prior review are addressed cleanly. One new medium-priority finding: the attached_at column should use sa.DateTime(timezone=True) to match the other timestamp columns and the model's timezone-aware default. This is a one-line fix in the migration. Otherwise the PR looks ready.

A single FK violation (drive_file_id referencing a deleted file) aborted
the whole PostgreSQL transaction, causing all subsequent inserts and the
Alembic version update to fail. Wrapping each per-file insert in a nested
savepoint means only that row is rolled back on error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Matches the model's datetime.now(timezone.utc) default and avoids
silent timezone truncation on PostgreSQL (TIMESTAMPTZ vs TIMESTAMP).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @abdulrafey1's task in 2m 15s —— View job


PR Review: RAG Intent Router (Post-Sync)

  • Check previously addressed review comments
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues (3 new comments posted)
  • Post final summary

Previous Review Comments

All 7 comments from previous reviews have been addressed and resolved:

# Issue Status
1 Migration back-fill uses SQLite-only SQL ✅ Fixed — dialect-aware branches + SQLAlchemyError catch with savepoints
2 datetime.utcnow deprecated ✅ Fixed — now datetime.now(timezone.utc)
3 Missing from e on exception re-raise ✅ Fixed — raise HTTPException(...) from e
4 Accessing private _create_llm() ✅ Fixed — public create_llm() added to BaseChatProvider
5 Sequential get_document_structure calls ✅ Fixed — uses asyncio.gather() with return_exceptions=True
6 Dead else: pass branch ✅ Fixed — removed; exception results handled via isinstance check
7 attached_at column missing timezone=True ✅ Fixed — sa.DateTime(timezone=True) in migration

New Findings

# Priority File Issue
1 Medium routes.py:1175 Detach endpoint doesn't verify the caller owns the drive file — inconsistent with the attach endpoint's ownership check
2 Low routes.py:471 Router called with empty query_text when message has no text content — wastes an LLM call that can't make a meaningful decision
3 Low service.py:54 f-strings in logger calls should use %s formatting for deferred interpolation

What's Good

  • Clean separation of concernsRAGIntentRouter is a focused, testable class with well-defined error boundaries
  • Fail-closed design — Router failure surfaces as HTTP 502, never silently skips retrieval
  • Upsert-safe attachattach_drive_file handles race conditions via check-then-insert with IntegrityError fallback
  • Ownership verification on attach endpoint — validates both conversation and file ownership
  • asyncio.gather() now parallelizes file structure lookups with return_exceptions=True
  • Public create_llm() API gives a stable interface for external callers
  • Solid test suite — 28 tests: 9 unit (router), 7 service, 7 endpoint, 5 integration covering happy paths, error paths, idempotency, and 502 fail-closed behavior
  • SSE event designscanning_attachments and skipping_rag events give the frontend clear signals for UX feedback
  • Migration back-fill — dialect-aware SQL with savepoints isolating FK failures per row

Test Coverage Assessment

Coverage is strong. Minor gaps:

  • No test for router behavior with empty query string
  • Integration tests rely heavily on mocking — consider a thinner integration test that exercises the real router with a mocked LLM but real DB queries

Summary

All 7 issues from previous reviews are cleanly addressed. The three new findings are medium/low priority — none are blocking. The detach endpoint ownership check (medium) is worth a quick fix for consistency. The PR is in good shape and ready to merge after addressing the detach endpoint finding.

Comment thread app/core_plugins/chat/routes.py
Comment thread app/core_plugins/chat/routes.py Outdated
Comment thread app/core_plugins/chat/service.py Outdated
The detach endpoint only checked conversation ownership, not file
ownership. A caller could detach a file they don't own if it was
somehow attached to their conversation. Added the same ownership
check as the attach endpoint.

Also fix f-string logger in service.py -> % formatting.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 13, 2026

Claude finished @abdulrafey1's task in 8m 30s —— View job


PR Review: RAG Intent Router with Per-Conversation Attachment Management

  • Check for previously addressed review comments
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues (4 comments posted)
  • Post final summary

Overview

This PR adds an intelligent RAG intent router that inspects user messages to decide whether document retrieval should run, removing the need for teachers to manually re-attach files each turn. It introduces a ConversationAttachment join table, service methods for attach/detach, REST endpoints, and wires the router into the chat_completion flow with new SSE events (scanning_attachments / skipping_rag).

Previous Review Comments

All 10 comments from previous reviews have been addressed:

# Issue Status
1 Migration back-fill uses SQLite-only SQL ✅ Fixed — dialect-aware branches + SQLAlchemyError catch with savepoints
2 datetime.utcnow deprecated ✅ Fixed — datetime.now(timezone.utc)
3 Missing from e on exception re-raise ✅ Fixed
4 Accessing private _create_llm() ✅ Fixed — public create_llm() on BaseChatProvider
5 Sequential get_document_structure calls ✅ Fixed — asyncio.gather() with return_exceptions=True
6 Dead else: pass branch ✅ Removed
7 attached_at column missing timezone=True ✅ Fixed
8 Detach endpoint doesn't verify file ownership ✅ Fixed — ownership check added
9 query_text can be empty when router is called ✅ Fixed — if attached_files and query_text: guard
10 f-string in logger calls ⚠️ Partially — line 54 fixed, line 205 still has f-string

Critical (Blocking)

# File Issue
1 routes.py:492-523 Router-driven RAG doesn't merge user query with retrieved context — three cascading failures: (a) non-streaming drops the user's question from the LLM prompt (replaced by RAG context only), (b) streaming fetches RAG context but the assembly pass discards it because messages has no drive_file blocks to find and replace, (c) both paths search with an empty query because synthetic messages lack text blocks. The core feature doesn't work correctly in either streaming or non-streaming mode.

Low

# File Issue
2 service.py:205 f-string still present in logger.warning (partially addressed from prior review)
3 service.py:243-245 f-string in logger.error in new attach_drive_file code
4 intent_router.py:89 Redundant f"{query}" — should just be query

What's Good

  • Clean separationRAGIntentRouter is a focused, testable class with well-defined error boundaries
  • Fail-closed design — Router failure surfaces as HTTP 502, never silently skips retrieval
  • Upsert-safe attachattach_drive_file handles race conditions via check-then-insert with IntegrityError fallback
  • Full ownership verification — Both attach and detach endpoints now verify conversation and file ownership
  • asyncio.gather() parallelizes file structure lookups with return_exceptions=True
  • Public create_llm() API — stable interface for external callers
  • SSE event designscanning_attachments and skipping_rag events give the frontend clear signals
  • Migration back-fill — dialect-aware SQL with savepoints isolating FK failures per row
  • Solid test suite — 28 tests covering unit, service, endpoint, and integration layers

Test Coverage Assessment

Coverage is strong for the individual components. The critical gap: integration tests mock both _resolve_drive_file_blocks and provider.send_message / stream_chat_response, so they verify the router is called but don't verify the messages actually sent to the LLM contain both the user's query and the RAG context. This is how the critical bug above went undetected. Adding a thinner integration test that asserts on the messages argument passed to provider.send_message would catch this class of issue.

Summary

All previous review issues are addressed. One new critical/blocking finding: the router-driven RAG flow doesn't correctly merge the user's query text with the retrieved document context before sending to the LLM. This affects both streaming and non-streaming paths — the core feature needs a fix before merge. The low-priority items (f-strings in loggers, redundant f-string) are quick cleanups.

Comment thread app/core_plugins/chat/routes.py
Comment thread app/core_plugins/chat/service.py Outdated
Comment thread app/core_plugins/chat/service.py Outdated
Comment thread app/core_plugins/chat/intent_router.py Outdated
Abdul Rafey added 2 commits May 14, 2026 14:15
Router-driven RAG built synthetic messages with only drive_file blocks,
causing three failures: empty similarity-search query, user question
dropped from the non-streaming LLM prompt, and RAG context silently
discarded in the streaming assembly pass.

Fix by appending the query text block to synthetic messages and building
the streaming current from unresolved_messages so drive_file blocks are
visible to the in-stream replacement pass.
@claude
Copy link
Copy Markdown

claude Bot commented May 14, 2026

Claude finished @abdulrafey1's task in 4m 16s —— View job


PR Review: RAG Intent Router (Post-Sync)

  • Check previously addressed review comments
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues (1 new comment posted)
  • Post final summary

Previous Review Comments

All 14 comments from previous reviews have been addressed and resolved:

# Issue Status
1 Migration back-fill uses SQLite-only SQL ✅ Fixed — dialect-aware branches + SQLAlchemyError catch with savepoints
2 datetime.utcnow deprecated ✅ Fixed — datetime.now(timezone.utc)
3 Missing from e on exception re-raise ✅ Fixed — raise HTTPException(...) from e
4 Accessing private _create_llm() ✅ Fixed — public create_llm() on BaseChatProvider
5 Sequential get_document_structure calls ✅ Fixed — asyncio.gather() with return_exceptions=True
6 Dead else: pass branch ✅ Removed
7 attached_at column missing timezone=True ✅ Fixed
8 Detach endpoint doesn't verify file ownership ✅ Fixed — ownership check added
9 query_text can be empty when router is called ✅ Fixed — if attached_files and query_text: guard
10 f-string in logger.info/logger.warning (service.py) ✅ Fixed — both lines now use %s formatting
11 Critical: Router-driven RAG doesn't merge user query ✅ Fixed — synthetic messages now include text_block with query_text; streaming path passes synthetic messages as current
12 f-string still present in logger.warning ✅ Fixed
13 f-string in logger.error (attach service) ✅ Fixed
14 Redundant f"{query}" ✅ Fixed — now just query

New Findings

# Priority File Issue
1 Low test_intent_router_integration.py:130 _create_llm mock setup is dead after create_llm() refactor — 5 occurrences should reference create_llm instead

What's Good

  • Clean separation of concernsRAGIntentRouter is a focused, testable class with well-defined error boundaries
  • Fail-closed design — Router failure surfaces as HTTP 502, never silently skips retrieval
  • Upsert-safe attach_drive_file — handles race conditions via check-then-insert with IntegrityError fallback
  • Full ownership verification — Both attach and detach endpoints verify conversation and file ownership
  • Correct RAG merge — Synthetic messages now include both drive_file blocks and the user's query_text, fixing the critical issue from the prior review. Both streaming and non-streaming paths correctly deliver RAG context + user question to the LLM
  • asyncio.gather() — Parallelizes file structure lookups with return_exceptions=True
  • Public create_llm() API — Stable interface for external callers
  • SSE event designscanning_attachments and skipping_rag events give the frontend clear signals for UX feedback
  • Migration back-fill — Dialect-aware SQL with savepoints isolating FK failures per row
  • Solid test suite — 28 tests: 9 unit (router), 7 service, 7 endpoint, 5 integration covering happy paths, error paths, idempotency, and 502 fail-closed behavior

Test Coverage Assessment

Coverage is strong. Minor gaps:

  • Integration tests still mock at a high level (_resolve_drive_file_blocks, provider.send_message), so they verify the router is called and wired correctly but don't assert on the actual messages argument sent to the LLM. A thinner integration test asserting on message content would catch any future regressions in the merge logic.

Summary

All 14 issues from four previous review rounds are cleanly addressed. The critical RAG merge fix is correctly implemented — synthetic messages now carry both drive_file blocks and the user's query text, and both streaming and non-streaming paths deliver the combined context to the LLM. The one new finding (dead _create_llm mock setup in tests) is low-priority housekeeping. The PR looks ready to merge.

Comment thread tests/chat/test_intent_router_integration.py
Abdul Rafey and others added 2 commits May 14, 2026 15:51
… to join table

SQLModel inferred sa.DateTime() (no timezone) for attached_at despite the
default producing timezone-aware datetimes — causing asyncpg to reject the
insert. Added explicit sa_column=Column(DateTime(timezone=True)) to match
the DB column type.

Also sync inline drive_file blocks from existing conversations into the
chat_conversation_attachments join table so the intent router can find them.
Frontend still sends drive_file blocks inline; without this sync the router
saw no attachments and skipped RAG on every turn.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Routes now call provider.create_llm() (public method). Mock setup for
_create_llm had no effect — replaced with create_llm.return_value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown

claude Bot commented May 14, 2026

Claude finished @abdulrafey1's task in 4m 10s —— View job


PR Review: RAG Intent Router with Per-Conversation Attachment Management

  • Check for previously addressed review comments — all 15 resolved
  • Read all changed files and gather context
  • Security review
  • Correctness and logic review
  • Performance analysis
  • Maintainability and patterns review
  • Test coverage assessment
  • Post inline comments on specific issues — none needed
  • Post final summary

Previous Review Comments

All 15 comments from five previous review rounds have been addressed and resolved:

# Issue Status
1 Migration back-fill uses SQLite-only SQL ✅ Fixed — dialect-aware branches + SQLAlchemyError catch with savepoints
2 datetime.utcnow deprecated ✅ Fixed — datetime.now(timezone.utc)
3 Missing from e on exception re-raise ✅ Fixed — raise HTTPException(...) from e
4 Accessing private _create_llm() ✅ Fixed — public create_llm() on BaseChatProvider
5 Sequential get_document_structure calls ✅ Fixed — asyncio.gather() with return_exceptions=True
6 Dead else: pass branch ✅ Removed
7 attached_at column missing timezone=True ✅ Fixed
8 Detach endpoint doesn't verify file ownership ✅ Fixed — ownership check added
9 query_text can be empty when router is called ✅ Fixed — if attached_files and query_text: guard
10 f-string in logger.info/logger.warning (service.py) ✅ Fixed — all use %s formatting
11 Critical: Router-driven RAG doesn't merge user query ✅ Fixed — synthetic messages include text_block with query_text; streaming path passes synthetic messages as current
12 f-string still present in logger.warning ✅ Fixed
13 f-string in logger.error (attach service) ✅ Fixed
14 Redundant f"{query}" ✅ Fixed — now just query
15 _create_llm mock setup is dead in integration tests ✅ Fixed — references updated to create_llm

New Findings

No blocking, high, or medium issues found. The code is clean.


What's Good

  • Clean separation of concernsRAGIntentRouter is a focused, testable class with well-defined error boundaries. The RAGIntentRouterError exception is properly chained via from exc
  • Fail-closed design — Router failure surfaces as HTTP 502, never silently skips retrieval. This is the right call for a retrieval system
  • Correct RAG merge — Synthetic messages carry both drive_file blocks and the user's query_text, correctly delivering combined context to the LLM in both streaming and non-streaming paths
  • Safety striproutes.py:813-817 ensures drive_file blocks never reach the LLM API, even if resolution fails silently
  • Upsert-safe attach_drive_file — Handles race conditions via check-then-insert with IntegrityError fallback and rollback
  • Full ownership verification — Both attach and detach endpoints verify conversation and file ownership before acting
  • asyncio.gather() — Parallelizes file structure lookups with return_exceptions=True for graceful per-file degradation
  • Public create_llm() API — Stable interface for external callers, replacing direct access to _create_llm()
  • SSE event designscanning_attachments and skipping_rag events give the frontend clear, actionable signals for UX feedback
  • Migration back-fill — Dialect-aware SQL (SQLite vs PostgreSQL) with savepoints isolating FK failures per row. Malformed JSON is handled gracefully
  • db.py SQLite compatibility — Conditional pool kwargs is a clean approach for test compatibility without affecting production
  • conftest.py env vars — Setting env vars before app imports prevents settings cache issues at test collection time
  • Solid test suite — 28 tests: 9 unit (router), 7 service, 7 endpoint, 5 integration covering happy paths, error paths, idempotency, ownership checks, and 502 fail-closed behavior

Test Coverage Assessment

Coverage is strong and well-organized:

  • Unit tests cover router decision paths, exception chaining, and prompt construction
  • Service tests verify attach/detach semantics, idempotency, READY-only filtering, and silent no-op detach
  • Endpoint tests cover 201/204 responses, ownership validation, missing resources, and duplicate handling
  • Integration tests verify router wiring into the full chat_completion flow for both streaming and non-streaming paths

Summary

All 15 issues from five previous review rounds are cleanly addressed. No new issues found. The PR is well-structured, properly tested, and ready to merge.


logger = get_logger(__name__)

# Load system prompt at module import time
_SYSTEM_PROMPT = (Path(__file__).parent / "assets" / "rag_intent_router_system_prompt.txt").read_text()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the prompt at import time means a missing or unreadable asset will crash app startup rather than just failing at chat-time. Easy to miss in a deploy.

attachment_summary += f"\n- {file.name}:\n"
if isinstance(sections_or_exc, BaseException):
logger.warning(
"Failed to get document structure for file %d: %s",
Copy link
Copy Markdown
Contributor

@hamza-56 hamza-56 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file.id is typed int | None, so this %d format will raise TypeError if it's ever None, awkward when you're already inside an error path.

conversation_id: int,
drive_file_id: int,
) -> ConversationAttachment:
"""Attach a drive file to a conversation (upsert-safe)."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper calls session.commit() and session.rollback() directly, so it owns the outer request-scoped transaction. When chat_completion invokes it mid-handler, a rollback here will quietly drop any other unsaved state on the session.

session: AsyncSession,
conversation_id: int,
) -> list[DriveFile]:
"""List ready drive files attached to a conversation."""
Copy link
Copy Markdown
Contributor

@hamza-56 hamza-56 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No user_id filter on this query, it relies entirely on the caller to have validated conversation ownership. Fine today, but easy for a future call site to misuse.

drive_file_ids=active_drive_file_ids,
)
# Sync inline drive_file blocks into the join table so the router can find them.
for fid in active_drive_file_ids:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This same write-through loop already lives inside ChatService.create_conversation. Having it in two places means the next change has to remember to touch both spots.

)
else:
# Otherwise use request messages as-is (no RAG)
resolved_messages = request.messages
Copy link
Copy Markdown
Contributor

@hamza-56 hamza-56 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the router decides to skip RAG and the request is non-streaming, request.messages flows through as-is. If the user sent inline drive_file content blocks this turn, they'll reach the LLM unredacted, the safety strip further down in stream_chat_response only runs on the streaming path.

select(DriveFileModel).where(
DriveFileModel.id == drive_file_id,
DriveFileModel.user_id == current_user.id,
DriveFileModel.is_deleted == False, # noqa: E712
Copy link
Copy Markdown
Contributor

@hamza-56 hamza-56 May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filtering on is_deleted == False here means once the underlying drive file gets soft-deleted, the user can no longer detach the attachment, and the orphan row in the join table becomes unreachable from the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants