A basic simulation of a RAG system using MongoDB, deployed with FastAPI and NextJS
A basic RAG has 5 key components:
- Document Loader: PDF Loader
- Document Splitter: Text Splitter
- Vector Embeddings: Hugging Face embedding model (all-MiniLM-L6-v2-Q5_K_M.gguf)
- Vector Store: MongoDB Atlas (Vector Stores)
- Retrieval and Generation: Hugging Face chat model (Phi-3-mini-4k-instruct-q4.gguf)
The system is deployed with:
- Backend: FastAPI
- Frontend: NextJS
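To make the flow between the components above concrete, here is a minimal sketch in plain Python. It is not the project's code: the bag-of-words `embed` function is a toy stand-in for the all-MiniLM embedding model, the in-memory `store` list stands in for MongoDB Atlas, and all function names (`split_text`, `embed`, `cosine`, `retrieve`, `build_prompt`) are hypothetical. It only shows the order of the steps: split, embed, store, retrieve, then build a prompt for the chat model.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last chunk already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, store: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the chat model."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# In-memory list used here in place of a real vector store (assumption).
document = (
    "MongoDB Atlas stores the chunk vectors. "
    "FastAPI serves the backend endpoints. "
    "NextJS renders the chat interface."
)
store = [(chunk, embed(chunk)) for chunk in split_text(document, chunk_size=40, overlap=0)]
print(build_prompt("Which part stores the vectors?", retrieve("Which part stores the vectors?", store, k=1)))
```

In the real system each stand-in would be replaced by its component from the list: the PDF loader supplies the text, the embedding model produces dense vectors, and the similarity search runs inside the vector store rather than in Python.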
!Note: The system needs to be run as 2 separate components (backend and frontend):
- Run Backend in the first terminal window:
    cd backend
    python main.py
!Important: Check installed libs in 'requirements.txt' and environment variables in '.env.example'
- Run Frontend in the second terminal window:
    cd frontend
    npm run dev
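The environment variable check mentioned above can be automated before starting the backend. The sketch below is an illustration only, not part of the project: the helper names (`required_keys`, `missing_keys`) are hypothetical, and it assumes '.env.example' uses the usual `NAME=value` lines and that the variables are exported into the process environment (if the backend loads a '.env' file itself, compare against that file instead).

```python
import os
from pathlib import Path


def required_keys(example_text: str) -> list[str]:
    """Return the variable names declared in .env.example-style text."""
    keys = []
    for line in example_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not NAME=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name:
            keys.append(name)
    return keys


def missing_keys(example_text: str, environ=os.environ) -> list[str]:
    """Return names from the example text that are unset or empty in `environ`."""
    return [key for key in required_keys(example_text) if not environ.get(key)]


if __name__ == "__main__":
    example = Path(".env.example")  # run from the folder that contains the file
    if example.exists():
        missing = missing_keys(example.read_text(encoding="utf-8"))
        if missing:
            print("Missing environment variables:", ", ".join(missing))
        else:
            print("All variables from .env.example are set.")
    else:
        print("No .env.example found in the current folder.")
```

Running it from the backend folder before `python main.py` gives an early, readable error instead of a failure at the first database or model call.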
- The system allows only 1 document to be uploaded
- The system uses GGUF model files so that the models can run on CPU
- The system is separated into 2 individual parts (backend and frontend) -> Docker can solve this!
- Advanced RAG: RAG with external API

