feat(chat): add RAG intent router with per-conversation attachment management#324
feat(chat): add RAG intent router with per-conversation attachment management#324abdulrafey1 wants to merge 12 commits into
Conversation
…nagement
Introduces an LLM-based intent router that inspects the user's message and
attached-file metadata to decide whether RAG retrieval should run for the
current turn. Previously RAG fired only when drive_file blocks were inlined
in the request; now it fires based on persisted per-conversation attachments
and a routing decision, reducing unnecessary MCP roundtrips on chit-chat.
- New chat_conversation_attachments join table (Alembic migration with
back-fill from active_drive_file_id/s columns)
- POST/DELETE /conversations/{uuid}/attachments endpoints
- RAGIntentRouter: fail-closed on LangChainException/ValidationError → 502
- SSE status events: scanning_attachments / skipping_rag
- 28 new tests (unit + attachment + integration), all passing
|
Claude finished @abdulrafey1's task in 3m 35s —— View job PR Review: RAG Intent Router with Per-Conversation Attachment Management
OverviewThis PR adds an intelligent RAG intent router that inspects user messages to decide whether document retrieval should run, removing the need for teachers to manually re-attach files each turn. It introduces a The overall design is sound — the router is cleanly separated, the fail-closed behavior (502 on router failure) is the right call, and the dual-write to both legacy columns and the new attachment table is a safe migration strategy. Previous Review CommentsNo prior review comments found. Critical (Blocking)
High
Medium
Low
What's Good
Test Coverage AssessmentCoverage is strong. Minor gaps:
SummaryThe architecture and intent are excellent. The one blocking issue is the SQLite-specific SQL in the migration back-fill — this must be fixed before merge or the migration will fail on any PostgreSQL environment. The high-priority items ( |
- Use dialect-aware SQL in back-fill migration (PostgreSQL vs SQLite) - Replace bare except with specific SQLAlchemyError + warning log - Fix deprecated datetime.utcnow → datetime.now(timezone.utc) - Chain HTTPException from RAGIntentRouterError to preserve traceback Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
I'll analyze this and get back to you. |
Accessing _create_llm() from outside the class broke encapsulation. Add a stable public delegate so callers in routes.py are insulated from internal provider implementation changes.
…ate_llm - Use asyncio.gather in RAGIntentRouter.decide() so document structure calls for multiple attached files run concurrently instead of serially - Add BaseChatProvider.create_llm() public delegate so callers are insulated from the private _create_llm() implementation detail Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Claude finished @abdulrafey1's task in 2m 24s —— View job PR Review: RAG Intent Router (Post-Sync)
Previous Review CommentsAll 6 comments from the previous review have been addressed and resolved:
New Findings
What's Good (Unchanged)
SummaryAll 6 issues from the prior review are addressed cleanly. One new medium-priority finding: the |
A single FK violation (drive_file_id referencing a deleted file) aborted the whole PostgreSQL transaction, causing all subsequent inserts and the Alembic version update to fail. Wrapping each per-file insert in a nested savepoint means only that row is rolled back on error. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Matches the model's datetime.now(timezone.utc) default and avoids silent timezone truncation on PostgreSQL (TIMESTAMPTZ vs TIMESTAMP). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Claude finished @abdulrafey1's task in 2m 15s —— View job PR Review: RAG Intent Router (Post-Sync)
Previous Review CommentsAll 7 comments from previous reviews have been addressed and resolved:
New Findings
What's Good
Test Coverage AssessmentCoverage is strong. Minor gaps:
SummaryAll 7 issues from previous reviews are cleanly addressed. The three new findings are medium/low priority — none are blocking. The detach endpoint ownership check (medium) is worth a quick fix for consistency. The PR is in good shape and ready to merge after addressing the detach endpoint finding. |
The detach endpoint only checked conversation ownership, not file ownership. A caller could detach a file they don't own if it was somehow attached to their conversation. Added the same ownership check as the attach endpoint. Also fix f-string logger in service.py -> % formatting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Claude finished @abdulrafey1's task in 8m 30s —— View job PR Review: RAG Intent Router with Per-Conversation Attachment Management
OverviewThis PR adds an intelligent RAG intent router that inspects user messages to decide whether document retrieval should run, removing the need for teachers to manually re-attach files each turn. It introduces a Previous Review CommentsAll 10 comments from previous reviews have been addressed:
Critical (Blocking)
Low
What's Good
Test Coverage AssessmentCoverage is strong for the individual components. The critical gap: integration tests mock both SummaryAll previous review issues are addressed. One new critical/blocking finding: the router-driven RAG flow doesn't correctly merge the user's query text with the retrieved document context before sending to the LLM. This affects both streaming and non-streaming paths — the core feature needs a fix before merge. The low-priority items (f-strings in loggers, redundant f-string) are quick cleanups. |
Router-driven RAG built synthetic messages with only drive_file blocks, causing three failures: empty similarity-search query, user question dropped from the non-streaming LLM prompt, and RAG context silently discarded in the streaming assembly pass. Fix by appending the query text block to synthetic messages and building the streaming current from unresolved_messages so drive_file blocks are visible to the in-stream replacement pass.
|
Claude finished @abdulrafey1's task in 4m 16s —— View job PR Review: RAG Intent Router (Post-Sync)
Previous Review CommentsAll 14 comments from previous reviews have been addressed and resolved:
New Findings
What's Good
Test Coverage AssessmentCoverage is strong. Minor gaps:
SummaryAll 14 issues from four previous review rounds are cleanly addressed. The critical RAG merge fix is correctly implemented — synthetic messages now carry both |
… to join table SQLModel inferred sa.DateTime() (no timezone) for attached_at despite the default producing timezone-aware datetimes — causing asyncpg to reject the insert. Added explicit sa_column=Column(DateTime(timezone=True)) to match the DB column type. Also sync inline drive_file blocks from existing conversations into the chat_conversation_attachments join table so the intent router can find them. Frontend still sends drive_file blocks inline; without this sync the router saw no attachments and skipped RAG on every turn. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Routes now call provider.create_llm() (public method). Mock setup for _create_llm had no effect — replaced with create_llm.return_value. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Claude finished @abdulrafey1's task in 4m 10s —— View job PR Review: RAG Intent Router with Per-Conversation Attachment Management
Previous Review CommentsAll 15 comments from five previous review rounds have been addressed and resolved:
New FindingsNo blocking, high, or medium issues found. The code is clean. What's Good
Test Coverage AssessmentCoverage is strong and well-organized:
SummaryAll 15 issues from five previous review rounds are cleanly addressed. No new issues found. The PR is well-structured, properly tested, and ready to merge. |
| logger = get_logger(__name__) | ||
|
|
||
| # Load system prompt at module import time | ||
| _SYSTEM_PROMPT = (Path(__file__).parent / "assets" / "rag_intent_router_system_prompt.txt").read_text() |
There was a problem hiding this comment.
Reading the prompt at import time means a missing or unreadable asset will crash app startup rather than just failing at chat-time. Easy to miss in a deploy.
| attachment_summary += f"\n- {file.name}:\n" | ||
| if isinstance(sections_or_exc, BaseException): | ||
| logger.warning( | ||
| "Failed to get document structure for file %d: %s", |
There was a problem hiding this comment.
file.id is typed int | None, so this %d format will raise TypeError if it's ever None, awkward when you're already inside an error path.
| conversation_id: int, | ||
| drive_file_id: int, | ||
| ) -> ConversationAttachment: | ||
| """Attach a drive file to a conversation (upsert-safe).""" |
There was a problem hiding this comment.
This helper calls session.commit() and session.rollback() directly, so it owns the outer request-scoped transaction. When chat_completion invokes it mid-handler, a rollback here will quietly drop any other unsaved state on the session.
| session: AsyncSession, | ||
| conversation_id: int, | ||
| ) -> list[DriveFile]: | ||
| """List ready drive files attached to a conversation.""" |
There was a problem hiding this comment.
No user_id filter on this query, it relies entirely on the caller to have validated conversation ownership. Fine today, but easy for a future call site to misuse.
| drive_file_ids=active_drive_file_ids, | ||
| ) | ||
| # Sync inline drive_file blocks into the join table so the router can find them. | ||
| for fid in active_drive_file_ids: |
There was a problem hiding this comment.
This same write-through loop already lives inside ChatService.create_conversation. Having it in two places means the next change has to remember to touch both spots.
| ) | ||
| else: | ||
| # Otherwise use request messages as-is (no RAG) | ||
| resolved_messages = request.messages |
There was a problem hiding this comment.
When the router decides to skip RAG and the request is non-streaming, request.messages flows through as-is. If the user sent inline drive_file content blocks this turn, they'll reach the LLM unredacted, the safety strip further down in stream_chat_response only runs on the streaming path.
| select(DriveFileModel).where( | ||
| DriveFileModel.id == drive_file_id, | ||
| DriveFileModel.user_id == current_user.id, | ||
| DriveFileModel.is_deleted == False, # noqa: E712 |
There was a problem hiding this comment.
Filtering on is_deleted == False here means once the underlying drive file gets soft-deleted, the user can no longer detach the attachment, and the orphan row in the join table becomes unreachable from the API.

What
Automates the RAG retrieval decision in the chat plugin so teachers are no longer forced to manually re-attach Drive files on each message turn. An intent router (a small, fast LLM step) inspects the user's latest message and lightweight section-title metadata from their conversation attachments, then decides whether RAG retrieval should run for that turn (binary: run / skip). RAG now activates automatically based on message intent rather than requiring per-message inline file selection.
Changes
feat(chat): addRAGIntentRouterclass (app/core_plugins/chat/intent_router.py) withRAGIntentRouterError; fails closed onLangChainException/ValidationErrorfeat(chat): addrag_intent_router_system_prompt.txtprompt assetfeat(chat): addConversationAttachmentSQLModel join table +attachmentsrelationship onConversation; legacy columns untouchedfeat(chat): addRAGRoutingDecision,ConversationAttachmentCreate,ConversationAttachmentResponseschemasfeat(chat): addattach_drive_file,detach_drive_file,list_conversation_attachmentsservice methodsfeat(chat): wire intent router intochat_completion; emitscanning_attachments/skipping_ragSSE events; addPOST/DELETE /conversations/{uuid}/attachmentsendpoints; 502 fail-closed handlerfeat(migrations): add Alembic migration forchat_conversation_attachmentsjoin table with back-fill from legacyactive_drive_file_id(s)columnsfix(core): add conditional pool kwargs inapp/core/db.pyfor SQLite test compatibilitytest(chat): 9 unit tests for intent router, 14 tests for attach/detach service + endpoints, 5 integration tests driving/api/v1/chat/completionsHow to Test
make migrationsmake dev.upmake test— expect 969 tests passingmake lint && make mypyPOST /api/v1/chat/conversations/{uuid}/attachmentsscanning_attachmentsSSE event followed by retrieved context in the completion)skipping_ragSSE event and no retrieval overhead)DELETE /api/v1/chat/conversations/{uuid}/attachments/{id}and confirm subsequent messages skip RAGNotes
Migration required: Run
make migrationsto apply the newchat_conversation_attachmentsjoin table. The migration includes a back-fill step from the legacyactive_drive_file_id/active_drive_file_idscolumns; those columns remain intact so no rollback risk.Breaking changes: None. Legacy columns are preserved.
No new environment variables required. The intent router reuses the same LLM configured for the main chat.