A Corrective RAG pipeline that routes a question, retrieves supporting context from a local vector store, optionally falls back to web search, and runs graders (relevance + grounding + answer-quality) before returning an answer.
Principle: if an answer can’t be supported by evidence, the system should search / retry / refuse, rather than hallucinate.
Given an input question, the LangGraph state machine follows this pattern:
- Route the question (VectorStore/RAG vs Web Search) using an LLM router.
- Retrieve documents from a local Chroma vector store.
- Grade retrieved docs for relevance.
- If evidence is insufficient, run a Web Search (Tavily) and append the results as an evidence Document.
- Generate an answer from the current evidence set.
- Grade the generation:
  - Is it grounded in the provided documents?
  - Does it answer the question?
You can see the execution trace in the console logs (e.g. `---ROUTE QUESTION---`, `---RETRIEVE---`, `---WEB SEARCH---`, `---CHECK HALLUCINATIONS---`, etc.).
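The corrective loop above can be sketched in plain Python, with the LLM router and graders stubbed out as callables. The function and parameter names here are illustrative, not the actual node implementations in `graph/nodes/`:

```python
def corrective_rag(question, route, retrieve, grade_docs, web_search,
                   generate, grade_generation):
    """Sketch of the corrective control flow: route -> retrieve -> grade ->
    (web fallback) -> generate -> grade generation."""
    docs = []
    if route(question) == "vectorstore":
        # Keep only documents the relevance grader accepts.
        docs = [d for d in retrieve(question) if grade_docs(question, d)]
    if not docs:
        # Insufficient evidence: fall back to web search.
        docs = [web_search(question)]
    answer = generate(question, docs)
    grounded, answers = grade_generation(answer, docs, question)
    if grounded and answers:
        return answer
    return None  # caller may retry or refuse rather than hallucinate

# Toy wiring: everything is routed to web search, graders always pass.
answer = corrective_rag(
    "What is CRAG?",
    route=lambda q: "websearch",
    retrieve=lambda q: [],
    grade_docs=lambda q, d: True,
    web_search=lambda q: "CRAG augments RAG with retrieval grading.",
    generate=lambda q, docs: docs[0],
    grade_generation=lambda a, docs, q: (True, True),
)
print(answer)  # -> "CRAG augments RAG with retrieval grading."
```

In the real project this wiring lives in `graph/graph.py` as LangGraph nodes and conditional edges rather than direct function calls.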
Based on the current project layout shown in this repo listing:
- `main.py`: demo entrypoint (loads env + calls `app.invoke({"question": ...})`).
- `ingestion.py`: defines ingestion + vector store setup and exposes a `retriever` used by the graph. The local Chroma persistence directory is `./.chroma/` (gitignored).
- `graph/graph.py`: builds and compiles the LangGraph app. ⚠️ Note: the current graph builder also exports a Mermaid PNG via `app.get_graph().draw_mermaid_png(output_file_path="graph.png")`.
- `graph/state.py`: the graph state schema.
- `graph/node_constants.py`: node name constants (`RETRIEVE`, `GRADE_DOCUMENTS`, `WEBSEARCH`, `GENERATE`, …).
- `graph/nodes/`: node implementations (`retrieve`, `grade_documents`, `web_search`, `generate`).
- `graph/chains/`: prompt + LLM "chains" and graders (router, relevance grader, hallucination/grounding grader, answer grader).
- `graph.png`: exported graph diagram (generated by the graph builder).
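Each node under `graph/nodes/` follows the usual LangGraph convention: a function that takes the current state and returns a partial state update, which LangGraph merges back in. A hypothetical `retrieve` node might look like this (field names assumed, retrieval stubbed):

```python
from typing import Any

def retrieve(state: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical node shape: read from state, return a partial update.
    The real node would call retriever.invoke(question); stubbed here."""
    question = state["question"]
    documents = [f"doc about {question}"]
    return {"documents": documents, "question": question}

# LangGraph merges the returned dict into the graph state.
new_state = {**{"question": "agents"}, **retrieve({"question": "agents"})}
print(new_state["documents"])  # -> ['doc about agents']
```

The actual state fields are defined in `graph/state.py`; this sketch just shows the state-in, partial-update-out contract.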
- Python 3.11+
- Environment variables:
  - `OPENAI_API_KEY` (LLM + embeddings)
  - `TAVILY_API_KEY` (web search fallback)
  - `USER_AGENT` (optional but recommended; silences some HTTP warnings)
- Optional (observability / LangSmith): `LANGCHAIN_TRACING_V2`, `LANGCHAIN_PROJECT`, `LANGCHAIN_API_KEY`
```bash
python -m venv .venv
source .venv/bin/activate
```

If you have requirements.txt:

```bash
pip install -r requirements.txt
```

Otherwise (minimal set matching current imports):

```bash
pip install -U python-dotenv langgraph langchain-openai langchain-chroma langchain-tavily langchain-community langchain-text-splitters chromadb
```

Create `.env` in the repo root:

```
OPENAI_API_KEY=sk-xxxxxxxx
TAVILY_API_KEY=tvly-xxxxxxxx
USER_AGENT=ozgunes-corrective-rag/1.0
# Optional: LangSmith tracing
LANGCHAIN_TRACING_V2=true
LANGCHAIN_PROJECT=CorrectiveRAGProject
LANGCHAIN_API_KEY=lsv2_xxxxx
```

A template is included as `.env.example`.
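To fail fast on a missing key before the graph runs, a small check like the following can help. This is an optional convenience, not part of the project; it only uses the variable names listed above:

```python
import os

# Required keys from the environment table above.
REQUIRED = ("OPENAI_API_KEY", "TAVILY_API_KEY")

def missing_env(environ=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [k for k in REQUIRED if not environ.get(k)]

if missing_env():
    print("Missing env vars:", ", ".join(missing_env()))
```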
ingestion.py defines the ingestion pipeline (URLs/loaders → chunking → embeddings → Chroma persist to ./.chroma/).
Depending on your current implementation, the store may be created:
- by running `ingestion.py` directly, if it has a `__main__` entrypoint, or
- on import, when `retriever` is initialized.
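To make the chunking step concrete, here is a naive fixed-window splitter. It is only an illustration of what "chunking" means in the pipeline above; the real `ingestion.py` would use `RecursiveCharacterTextSplitter` from langchain-text-splitters:

```python
def split_text(text: str, chunk_size: int = 250, overlap: int = 50) -> list[str]:
    """Naive sliding-window splitter: fixed-size chunks with a fixed overlap,
    standing in for the real text splitter."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]

chunks = split_text("a" * 600)
print([len(c) for c in chunks])  # -> [250, 250, 200]
```

Overlap between consecutive chunks helps retrieval: a sentence cut at a chunk boundary still appears whole in the neighboring chunk.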
```bash
python main.py
```

This project has two evidence paths:

- Curated corpus (Chroma): documents indexed by your `ingestion.py`.
- Live web fallback (Tavily): triggered when graded evidence is insufficient. Tavily results are wrapped into a `langchain_core.documents.Document` and appended to the evidence list.
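The fallback's "wrap and append" step looks roughly like this. To stay runnable without LangChain installed, a dataclass stands in for `langchain_core.documents.Document`; the field names mirror the real class, but the helper itself is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def append_web_results(documents: list, tavily_results: list[dict]) -> list:
    """Join Tavily result snippets into one evidence Document and append it,
    mirroring the web_search node described above (result keys assumed)."""
    joined = "\n".join(r["content"] for r in tavily_results)
    documents.append(Document(page_content=joined,
                              metadata={"source": "websearch"}))
    return documents

docs = append_web_results([], [{"content": "snippet one"},
                               {"content": "snippet two"}])
print(docs[0].page_content)  # -> snippet one\nsnippet two
```

Tagging the metadata with the source lets the grounding grader (or a UI) distinguish curated evidence from live web evidence.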
Some models (notably gpt-3.5-turbo) don’t support OpenAI Structured Outputs via json_schema.
LangChain falls back to function_calling. This is safe; the fix is to explicitly set method="function_calling" when you build structured graders.
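A sketch of how such a grader might be built, assuming a Pydantic schema and a `ChatOpenAI` model (the LangChain-specific lines are shown commented so the schema itself stays runnable without an API key):

```python
from pydantic import BaseModel, Field

class GradeDocuments(BaseModel):
    """Schema the relevance grader must fill in."""
    binary_score: str = Field(description="Is the document relevant? 'yes' or 'no'")

# With langchain-openai installed, force the function-calling method so that
# gpt-3.5-turbo (which lacks json_schema structured outputs) still works:
#
#   from langchain_openai import ChatOpenAI
#   llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
#   grader = llm.with_structured_output(GradeDocuments, method="function_calling")

print(GradeDocuments(binary_score="yes").binary_score)  # -> yes
```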
Prefer split packages (already used in your latest code):

```python
from langchain_chroma import Chroma
from langchain_tavily import TavilySearch
```

Set `USER_AGENT` in `.env` to silence the warning.
If your graders return string labels like `"yes"` / `"no"` in `binary_score`, make sure you check explicitly:

```python
if score.binary_score == "yes":
    ...
```

In Python, any non-empty string (including `"no"`) is truthy.
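The pitfall is easy to reproduce; the helper names here are just for illustration:

```python
def is_relevant_naive(binary_score: str) -> bool:
    return bool(binary_score)  # bug: "no" is a non-empty string, so truthy

def is_relevant(binary_score: str) -> bool:
    return binary_score.strip().lower() == "yes"  # explicit comparison

print(is_relevant_naive("no"))  # -> True  (wrong)
print(is_relevant("no"))        # -> False (correct)
print(is_relevant("yes"))       # -> True
```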
- Add a small UI (`streamlit_app.py`) with:
  - chat
  - route badge (RAG vs Web)
  - evidence panel (sources + snippets)
  - grounded / not grounded indicators
- Standardize output schema: `answer`, `route`, `web_search_used`, `grounded`, `answers_question`, `sources[]`
- Add `eval/` with a tiny test set and a report: grounding pass rate + answer-quality pass rate
- Add a license
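The standardized output schema proposed above could take a shape like the following `TypedDict`. The field names come from the roadmap item; everything else (types, the example values) is a sketch, not existing code:

```python
from typing import TypedDict

class RAGResult(TypedDict):
    """One possible shape for the standardized pipeline result."""
    answer: str
    route: str               # "vectorstore" | "websearch" (values assumed)
    web_search_used: bool
    grounded: bool
    answers_question: bool
    sources: list[str]

# Example result a UI or eval harness could consume uniformly.
result: RAGResult = {
    "answer": "…",
    "route": "websearch",
    "web_search_used": True,
    "grounded": True,
    "answers_question": True,
    "sources": ["https://example.com"],
}
print(result["route"])  # -> websearch
```

A fixed schema like this makes the planned `eval/` report straightforward: grounding pass rate is just the mean of `grounded` over the test set.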
