Improve scientific RAG citations and retrieval context #10

Open
sureshchouksey8 wants to merge 1 commit into aietal:master from sureshchouksey8:codex/scientific-rag-bounty

Conversation

@sureshchouksey8

/claim #45

Summary

  • Add scientific RAG helpers for section-aware chunking, stable citation keys, bounded retrieval result counts, and retrieved-document formatting with distance context.
  • Store richer PDF chunk metadata during ingestion: title, page, source, section, chunk index, per-page chunk index, and citation key.
  • Use the request origin instead of hardcoded localhost for RAG document lookup, pass a configurable result count, and instruct answers to cite exact retrieved citation keys.
  • Respect CHROMA_PATH in document fetches and validate empty/non-POST requests.
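For reviewers, the stable citation keys and bounded result counts from the summary could look roughly like the sketch below. It is not the code in this PR: `makeCitationKey`, `clampResultCount`, the key layout (`source:pPAGE:cCHUNK`), and the default and maximum of 4 and 20 are all assumptions chosen for illustration.

```typescript
// Hypothetical helpers; names, key format, and limits are illustrative assumptions.

// Normalize a file name into a lowercase, hyphen-separated slug.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/\.pdf$/, "")
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Derive the key only from values that do not change between ingestion runs
// (source, page, per-page chunk index), so re-ingesting the same PDF yields the same key.
function makeCitationKey(source: string, page: number, pageChunkIndex: number): string {
  const base = slugify(source) || "doc";
  return `${base}:p${page}:c${pageChunkIndex}`;
}

// Clamp a caller-supplied result count into [1, max]; fall back when it is missing or not numeric.
function clampResultCount(requested: unknown, fallback = 4, max = 20): number {
  if (requested === null || requested === undefined || requested === "") {
    return fallback;
  }
  const n = Number(requested);
  if (!Number.isFinite(n)) {
    return fallback;
  }
  return Math.min(max, Math.max(1, Math.floor(n)));
}
```

Deriving the key from position rather than from a random ID or a timestamp is what would keep citations traceable across re-ingestion, under the assumption that chunking itself is deterministic.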

Verification

  • npm test -- scientific-rag.test.ts --run
  • ./node_modules/.bin/tsc --noEmit
  • npm run lint passed with pre-existing React hook dependency warnings in unrelated files.

Notes

This focuses on grounding and citation traceability for scientific/research QA while staying within the existing Chroma/LangChain architecture.
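As a rough illustration of the grounding idea, retrieved chunks might be rendered into the prompt with their citation keys and distances as sketched below. The `RetrievedChunk` shape, `formatRetrievedDocs`, `buildPrompt`, and the instruction wording are hypothetical and not taken from this PR; the sketch also assumes that a smaller distance means a closer match.

```typescript
// Hypothetical shape of one retrieved chunk; field names are assumptions.
interface RetrievedChunk {
  text: string;
  citationKey: string;
  section: string;
  page: number;
  distance: number; // assumed: smaller means closer to the query
}

// Render each chunk with its citation key and distance so the model can see
// both what to cite and how closely the chunk matched.
function formatRetrievedDocs(chunks: RetrievedChunk[]): string {
  return chunks
    .map(
      (c) =>
        `[${c.citationKey}] (section: ${c.section}, page ${c.page}, distance ${c.distance.toFixed(3)})\n` +
        c.text.trim()
    )
    .join("\n\n");
}

// Assemble a prompt that asks for citations using only the keys shown in the context.
function buildPrompt(question: string, chunks: RetrievedChunk[]): string {
  return [
    "Answer using only the context below.",
    "Cite sources with the exact keys shown in square brackets; do not invent keys.",
    "",
    formatRetrievedDocs(chunks),
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Because the keys in the prompt are the same strings stored in the chunk metadata, a cited key in an answer can be looked up directly against the vector store.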
