- Each user has a unique ID and profile.
- Users can create, view, and manage multiple chat sessions.
- Each chat session stores metadata (`title`, `created_at`) and is linked to its creator.
- Chat messages (user + assistant) are stored in the database.
- Each message is converted into a vector embedding using OpenAI embeddings (`text-embedding-3-small`, 1536 dims).
- When a user sends a query, the system:
- Retrieves top similar past messages from that chat session via vector similarity search.
- Passes retrieved conversation snippets as context to the LLM.
- Generates an assistant response based on the context, maintaining chat continuity.
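A minimal sketch of this flow, assuming a `chat_messages` table with a pgvector `embedding` column and a psycopg connection. The table name, the `gpt-4o-mini` model, and the helper signature are illustrative, not the project's exact code:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_query(conn, chat_id: int, query: str) -> str:
    # 1. Embed the incoming user query.
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=query
    ).data[0].embedding

    # 2. Retrieve the most similar past messages from this chat session.
    #    pgvector's <=> operator is cosine distance (lower = closer).
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content FROM chat_messages
            WHERE chat_session_id = %s
            ORDER BY embedding <=> %s::vector
            LIMIT 5
            """,
            (chat_id, str(emb)),  # "[0.1, 0.2, ...]" casts cleanly to vector
        )
        context = "\n".join(row[0] for row in cur.fetchall())

    # 3. Pass the retrieved snippets to the LLM as conversational context.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[
            {"role": "system", "content": f"Relevant prior messages:\n{context}"},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content
```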
- User Router: Create users, fetch user info, list user chats.
- Chat Router: Create chats (user-linked or generic), send messages, fetch messages, delete chats, list all chats.
- Clear RESTful hierarchy:
  - `/users/{user_id}/chats` → user-specific chats
  - `/chats/{chat_id}` → chat-specific operations
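A sketch of how that hierarchy maps onto FastAPI routers; the paths come from this README, while the handler bodies are placeholders:

```python
from fastapi import APIRouter, FastAPI

user_router = APIRouter(prefix="/users", tags=["users"])
chat_router = APIRouter(prefix="/chats", tags=["chats"])

@user_router.get("/{user_id}/chats")
def list_user_chats(user_id: int) -> list[dict]:
    # User-specific chats live under /users/{user_id}/chats.
    return []  # placeholder: query this user's ChatSession rows

@chat_router.get("/{chat_id}/messages")
def get_messages(chat_id: int) -> list[dict]:
    # Chat-specific operations live under /chats/{chat_id}.
    return []  # placeholder: query this chat's ChatMessage rows

app = FastAPI()
app.include_router(user_router)
app.include_router(chat_router)
```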
- PostgreSQL used as the main DB.
- Vector embeddings stored using `pgvector`.
- `ChatSession` ↔ `User` relationship implemented.
- `ChatMessage` ↔ `ChatSession` relationship implemented.
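A minimal sketch of those relationships using SQLAlchemy 2.0 and the `pgvector` SQLAlchemy type; class, table, and column names are assumptions based on this README, not the project's exact schema:

```python
from pgvector.sqlalchemy import Vector
from sqlalchemy import ForeignKey, String
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "users"
    id: Mapped[int] = mapped_column(primary_key=True)
    chats: Mapped[list["ChatSession"]] = relationship(back_populates="user")

class ChatSession(Base):
    __tablename__ = "chat_sessions"
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str] = mapped_column(String(255))
    user_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
    user: Mapped[User] = relationship(back_populates="chats")
    messages: Mapped[list["ChatMessage"]] = relationship(back_populates="chat")

class ChatMessage(Base):
    __tablename__ = "chat_messages"
    id: Mapped[int] = mapped_column(primary_key=True)
    chat_session_id: Mapped[int] = mapped_column(ForeignKey("chat_sessions.id"))
    role: Mapped[str] = mapped_column(String(16))  # "user" or "assistant"
    content: Mapped[str]
    embedding = mapped_column(Vector(1536))  # text-embedding-3-small dims
    chat: Mapped[ChatSession] = relationship(back_populates="messages")
```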
NeuralFoundry is a hands-on, modular RAG playground that shows how real AI systems are built. The code is intentionally organized so you can study or swap components without rewriting the whole stack.
- **Chat Memory**
  We store each message (user + assistant) as text + vector embeddings. Retrieval pulls recent messages and semantically similar older messages to keep context.
- **Knowledge Bases (KBs)**
  Each KB has documents stored with metadata. We embed chunks and retrieve the most relevant ones at query time.
- **Chat Attachments**
  You can attach files directly to a chat. Those files are processed into chunks and stored as embeddings, then used as extra context just for that chat.
- **Chunking Strategy**
  - PDFs, DOCX, images, HTML, etc. are processed using Docling for structure-aware chunking.
  - `.txt` and `.md` files are split directly with overlap (simple, fast, reliable); a minimal splitter is sketched after this list.
- **Retrieval & Similarity Thresholds**
  Vector similarity search uses `pgvector`. Similarity thresholds (e.g., the KB chunk threshold) are configurable in settings, so you can tune relevance vs. recall without changing code (see the retrieval sketch below).
- **Modular pipelines** (chat pipeline, KB ingestion, attachment ingestion)
  Easy to plug in other retrieval strategies: BM25, hybrid search, reranking, etc.
- **Config‑driven behavior**
  Model selection, embedding dimensions, chunk sizes, and thresholds can be adjusted centrally.
- **Metadata‑rich storage**
  We store metadata for chats, KBs, and attachments, which makes analytics and dashboards possible later.
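Two sketches promised in the list above. First, a minimal version of the `.txt`/`.md` overlap splitter; the sizes here are illustrative defaults, and in the real app they would come from settings:

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split plain text into fixed-size chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]
```

Second, a hedged sketch of threshold-gated KB retrieval with `pgvector`; the `Settings` object, table names, and default values are assumptions standing in for the project's central config:

```python
from dataclasses import dataclass

@dataclass
class Settings:
    kb_chunk_threshold: float = 0.75  # min cosine similarity to keep a chunk
    top_k: int = 5

settings = Settings()

def retrieve_kb_chunks(cur, kb_id: int, query_embedding: list[float]) -> list[str]:
    # pgvector's <=> is cosine distance, so similarity = 1 - distance.
    cur.execute(
        """
        SELECT content, 1 - (embedding <=> %s::vector) AS similarity
        FROM kb_chunks
        WHERE kb_document_id IN (
            SELECT id FROM kb_documents WHERE knowledge_base_id = %s
        )
        ORDER BY embedding <=> %s::vector
        LIMIT %s
        """,
        (str(query_embedding), kb_id, str(query_embedding), settings.top_k),
    )
    # Drop anything below the configured similarity threshold.
    return [content for content, sim in cur.fetchall() if sim >= settings.kb_chunk_threshold]
```

Raising `kb_chunk_threshold` trades recall for relevance; lowering it does the opposite. That is exactly the knob the settings expose.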
- FastAPI for backend APIs
- OpenAI Python SDK for chat + embeddings
- PostgreSQL + pgvector for vector search
- Docker Compose for one‑command local dev
- React + Vite for the frontend
- Bash/Docker tooling for repeatable setup
- **More Retrieval Methods**
  Add BM25, hybrid search, reranking, or query expansion.
- **Multi‑model Responses**
  Generate Response A / Response B from different models (or different prompts) and compare.
- **User Feedback Loop**
  Collect thumbs‑up/down and feed it into evaluation or reranking.
- **Analytics & Dashboards**
  Use the stored metadata to show most used KBs, attachment usage, query patterns, etc.
- **Agents / MCPs**
  Add tools, structured workflows, or multi‑step reasoning with agent frameworks.
This is the recommended way to run everything together: Postgres + pgvector, backend, frontend, and pgAdmin.
Create or edit /Users/thomaskuttyreji/Documents/GitHub/NeuralFoundry/.env:

```env
POSTGRES_USER=neuralfoundry
POSTGRES_PASSWORD=neuralfoundry_pw
POSTGRES_DB=neuralfoundry
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
```

Set your OpenAI key in the macOS environment (not in .env):

```bash
export OPENAI_API_KEY="your_key_here"
```

Then start the full stack:

```bash
docker compose up --build
```

To start clean (this removes all Postgres data):

```bash
docker compose down -v
docker compose up --build
```

The backend prints a compact retrieval summary so you can quickly understand what context the model is using:
```text
============================================================
🐛 RETRIEVAL RESULTS:
- Recent messages: 5
- Older messages: 0
- KB chunks: 0
- Attachment chunks: 0
⚠️ NO KB RESULTS FOUND!
⚠️ No KBs are attached to this chat!
ℹ️ No attachments found in this chat
============================================================
```
Here’s a minimal view of how data is connected:
```text
User
└── ChatSession
    ├── ChatMessage (vector embedding)
    ├── ChatAttachment
    │   └── ChatAttachmentChunk (vector embedding)
    └── ChatSessionKB (links chat ↔ KB)

KnowledgeBase
└── KBDocument
    └── KBChunk (vector embedding)
```
- Backend: http://localhost:8000
- Frontend: http://localhost:5173
- pgAdmin: http://localhost:8080
pgAdmin will auto-register a server named `neuralfoundry`.
If prompted for a password, use:

- Password: `neuralfoundry_pw`
If you want to run backend/frontend locally:
```bash
docker run --name nf-postgres \
  -e POSTGRES_USER=neuralfoundry \
  -e POSTGRES_PASSWORD=neuralfoundry_pw \
  -e POSTGRES_DB=neuralfoundry \
  -p 5432:5432 \
  -d pgvector/pgvector:pg16

docker run --name nf-pgadmin \
  -e PGADMIN_DEFAULT_EMAIL=thomaskuttyreji.1396@gmail.com \
  -e PGADMIN_DEFAULT_PASSWORD=admin_pw \
  -p 8080:80 \
  -d dpage/pgadmin4
```

Then register the server in pgAdmin with:

- Hostname: `host.docker.internal`
- Port: `5432`
- Username: `neuralfoundry`
- Password: `neuralfoundry_pw`
- Database: `neuralfoundry`
Backend:

```bash
cd /Users/thomaskuttyreji/Documents/GitHub/NeuralFoundry
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload
```

Frontend:

```bash
cd /Users/thomaskuttyreji/Documents/GitHub/NeuralFoundry/frontend
npm install
npm run dev
```