
Improve RAG context budgeting #7

Open
Treasure520520 wants to merge 3 commits into aietal:master from Treasure520520:bounty/isaac-45-context-budget


@Treasure520520 commented May 14, 2026

Part of the open Algora bounty for [ISAAC-497] Implement an enhanced RAG Pipeline for Scientific/Research Workflows.

/claim #45

Bounty reference: https://algora.io/isaac/bounties/clq18zr98000ejs0gt0nv7gwu

Summary

  • add a shared RAG context preparation helper that deduplicates repeated chunks and keeps uploaded-document context inside a configurable character budget
  • raise the default number of Chroma results from 4 to 8 candidates and cap caller-requested values at 20 results
  • return prepared context metadata from fetch-documents so rag-chat can use a stable context string instead of formatting raw Chroma arrays inline
  • add focused tests for source formatting, duplicate suppression, budget truncation, empty retrieval, and bounded integer parsing
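For reviewers who want the shape of the helper without opening the diff, here is a minimal sketch of deduplication plus a character budget. It is not the code in `utils/server/rag-context.ts`: the `RetrievedChunk` type, the `prepareRagContext` name, the source label format, and the choice to drop (rather than truncate) chunks that do not fit are all assumptions made for illustration. Only the `context`, `sourceCount`, and `omittedSourceCount` field names come from this PR.

```typescript
// Hypothetical input shape; the PR does not show its actual types here.
interface RetrievedChunk {
  text: string;
  source: string; // e.g. the file name of the uploaded document
}

interface PreparedContext {
  context: string;
  sourceCount: number;
  omittedSourceCount: number;
}

// Collapse whitespace and case so trivially different copies compare equal.
function normalizeChunk(text: string): string {
  return text.replace(/\s+/g, ' ').trim().toLowerCase();
}

function prepareRagContext(
  chunks: RetrievedChunk[],
  charLimit: number,
): PreparedContext {
  const seen = new Set<string>();
  const parts: string[] = [];
  let used = 0;
  let omitted = 0;

  for (const chunk of chunks) {
    const key = normalizeChunk(chunk.text);
    // Skip empty chunks and chunks already seen (after normalization).
    if (key.length === 0 || seen.has(key)) continue;
    seen.add(key);

    const entry = `[Source ${parts.length + 1}: ${chunk.source}]\n${chunk.text.trim()}`;
    const separator = parts.length > 0 ? 2 : 0; // length of "\n\n" between entries
    if (used + separator + entry.length > charLimit) {
      omitted += 1; // assumption: a unique chunk that does not fit is dropped whole
      continue;
    }
    parts.push(entry);
    used += separator + entry.length;
  }

  return {
    context: parts.join('\n\n'),
    sourceCount: parts.length,
    omittedSourceCount: omitted,
  };
}
```

In this sketch duplicates are not counted as omitted; only unique chunks that exceed the budget are. Whether the real helper counts them the same way is not stated above.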

Why this helps the scientific RAG bounty

Scientific PDFs often contain repeated abstracts, headers, references, or near-identical chunks. The current RAG path sends the first 4 raw Chroma matches directly into the prompt without deduplication or a context budget, which can waste context and make citations noisier.

This PR is a deliberately narrow reliability slice that complements the larger citation/reranking PRs already open. It keeps the current Chroma + LangChain architecture, adds no new dependencies, and improves the quality of the context that reaches the model.

Validation

  • npm install --no-audit --no-fund
  • npx vitest run __tests__/rag-context.test.ts
  • npx tsc --noEmit --pretty false
  • npx prettier --check pages/api/fetch-documents.ts pages/api/rag-chat.ts utils/server/rag-context.ts __tests__/rag-context.test.ts

@jing11223344

I would like to work on this bounty. /claim https://algora.io/isaac/bounties/clq18zr98000ejs0gt0nv7gwu

I will implement:

  • A shared RAG context preparation helper for deduplication and budget management
  • Configurable character budget for context
  • Increase Chroma nResults from 4 to 8 (capped at 20)
  • Return prepared context metadata from fetch-documents
  • Comprehensive tests for all new functionality

@jing11223344

🚀 PR #12 is ready for review! Implements all requirements:

  • Shared RAG context helper with dedup + budget
  • Configurable character budget (env override)
  • Increased Chroma nResults from 4→8 (capped at 20)
  • Prepared context metadata from fetch-documents
  • 25 comprehensive tests (all passing ✅)

#12

@Treasure520520
Author

Pushed a focused follow-up commit (f4c80df) to tighten the bounty-aligned implementation:

  • fetch-documents now applies the configurable RAG context character budget from contextCharLimit or RAG_CONTEXT_CHAR_LIMIT, capped at 50k.
  • rag-chat now calls fetch-documents through the current request origin instead of hard-coded localhost, which makes the RAG path safer for hosted/Docker/proxy deployments.
  • Document fetch failures now degrade to the existing empty-context message instead of failing the whole chat request.
  • Added regression coverage for context budget parsing.
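The budget resolution described above can be sketched roughly as follows. This is an illustration only: the function names, the default of 12000 characters, the lower bound of 1, and the rule that the request value wins over the environment value are assumptions. The 50k cap and the `contextCharLimit` / `RAG_CONTEXT_CHAR_LIMIT` names are the only parts taken from this comment.

```typescript
const DEFAULT_CONTEXT_CHAR_LIMIT = 12000; // assumed default, not stated in the PR
const MAX_CONTEXT_CHAR_LIMIT = 50000; // cap mentioned in this PR

// Parse an untrusted value into an integer clamped to [min, max].
// Anything that does not parse to a finite number yields the fallback.
function parseBoundedInt(
  value: unknown,
  fallback: number,
  min: number,
  max: number,
): number {
  const parsed =
    typeof value === 'number'
      ? value
      : Number.parseInt(String(value ?? ''), 10);
  if (!Number.isFinite(parsed)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(parsed)));
}

// Assumed precedence: request-level contextCharLimit, then the
// RAG_CONTEXT_CHAR_LIMIT environment value, then the default.
function resolveContextCharLimit(
  requestValue: unknown,
  envValue: string | undefined,
): number {
  const envLimit = parseBoundedInt(
    envValue,
    DEFAULT_CONTEXT_CHAR_LIMIT,
    1,
    MAX_CONTEXT_CHAR_LIMIT,
  );
  return parseBoundedInt(requestValue, envLimit, 1, MAX_CONTEXT_CHAR_LIMIT);
}
```

A caller would pass something like `resolveContextCharLimit(req.body.contextCharLimit, process.env.RAG_CONTEXT_CHAR_LIMIT)`. Clamping instead of rejecting out-of-range values keeps a misconfigured client from failing the whole request.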

Validation re-run:

  • npx vitest run __tests__/rag-context.test.ts
  • npx tsc --noEmit --pretty false
  • npx prettier --check pages/api/fetch-documents.ts pages/api/rag-chat.ts utils/server/rag-context.ts __tests__/rag-context.test.ts
  • git diff --check

@Treasure520520
Author

Pushed a small compatibility follow-up (65df325) after re-checking the bounty surface and the competing implementation shape.

fetch-documents now returns the prepared RAG context in both forms:

  • the existing flat fields (context, sourceCount, omittedSourceCount) for backward compatibility
  • a _prepared object for callers that want the prepared context and metadata grouped explicitly

rag-chat now prefers _prepared.context when available, then falls back to context, then falls back to preparing raw Chroma results. This keeps the current API stable while making the prepared context contract clearer for scientific RAG callers.
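The fallback order on the rag-chat side could look roughly like the sketch below. The `FetchDocumentsResponse` type, the `documents` field for raw Chroma results, and the `selectContext` and `prepareRaw` names are hypothetical; only `_prepared`, `context`, `sourceCount`, and `omittedSourceCount` are taken from this comment. The sketch also assumes that any string, including an empty one, counts as an available prepared context.

```typescript
// Hypothetical response shape covering both the flat and grouped forms.
interface FetchDocumentsResponse {
  context?: string;
  sourceCount?: number;
  omittedSourceCount?: number;
  _prepared?: {
    context?: string;
    sourceCount?: number;
    omittedSourceCount?: number;
  };
  documents?: string[][]; // assumed name for raw Chroma results, one array per query
}

// Pick the context string in the order described above:
// _prepared.context, then the flat context, then raw results prepared locally.
function selectContext(
  response: FetchDocumentsResponse,
  prepareRaw: (docs: string[]) => string,
): string {
  if (typeof response._prepared?.context === 'string') {
    return response._prepared.context;
  }
  if (typeof response.context === 'string') {
    return response.context;
  }
  // Flatten the per-query arrays into one list before preparing them.
  const raw = ([] as string[]).concat(...(response.documents ?? []));
  return prepareRaw(raw);
}
```

Passing the raw-preparation step in as a callback keeps the selection logic independent of how the shared helper formats and budgets chunks.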

Validation re-run:

  • npx vitest run __tests__/rag-context.test.ts
  • npx prettier --check pages/api/fetch-documents.ts pages/api/rag-chat.ts utils/server/rag-context.ts __tests__/rag-context.test.ts
  • npx tsc --noEmit --pretty false
  • git diff --check


2 participants