HoangHPham/LLM-RAG-basic-deploy-FastNext

LLM Retrieval Augmented Generation (RAG) - Basic application with FastAPI, NextJS and MongoDB

A basic simulation of a RAG system that uses MongoDB as the vector store, with a FastAPI backend and a NextJS frontend

Key concept:

Retrieval Augmented Generation (RAG)

There are 5 key components in a basic RAG:

  1. Document Loader:
     • PDF Loader
  2. Document Splitter:
     • Text Splitter
  3. Vector Embeddings:
     • Embedding model: HuggingFace Embedding Model - all-MiniLM-L6-v2-Q5_K_M.gguf
  4. Vector Store:
     • MongoDB Atlas - Vector Stores
  5. Retrieval and Generation:
     • Chat model: Hugging Face Chat Model - Phi-3-mini-4k-instruct-q4.gguf
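To make the flow between these components concrete, here is a minimal sketch of the split, embed, store and retrieve steps in plain Python. It is not the code used in this repository: the `embed` function is a toy bag-of-words stand-in for the all-MiniLM embedding model, the in-memory `store` list stands in for MongoDB Atlas, and all function names (`split_text`, `retrieve`, `build_prompt`) are hypothetical. The final prompt is what would be passed to the chat model.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (a simple text splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last window is not fully contained in the previous one.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, store, k=2):
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Stand-in for the text extracted by the PDF loader.
document = (
    "FastAPI serves the backend. "
    "NextJS renders the frontend. "
    "MongoDB Atlas stores the chunk vectors."
)
question = "Where are the vectors stored?"

chunks = split_text(document, chunk_size=40, overlap=10)
store = [(chunk, embed(chunk)) for chunk in chunks]  # in-memory "vector store"
top = retrieve(question, store, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real deployment the similarity search would run inside the vector store rather than in Python, and the embedding would be a dense vector produced by the model, but the order of the steps is the same.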

Deployment

  1. Backend: FastAPI
  2. Frontend: NextJS

How to run?

!Note: The system has 2 components (backend and frontend) that must be run separately, each in its own terminal window:

  • Run the backend in the first terminal window:

        cd backend
        python main.py

    !Important: Make sure the libraries listed in 'requirements.txt' are installed and the environment variables listed in '.env.example' are set.

  • Run the frontend in the second terminal window:

        cd frontend
        npm run dev
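The environment check mentioned in the "!Important" note can be automated before starting the backend. The sketch below is an assumption, not part of this repository: it reads the variable names from a file in the usual `KEY=value` format and reports which ones are unset. The helper names `required_keys` and `missing_keys` are hypothetical.

```python
import os
from pathlib import Path


def required_keys(example_path):
    """Collect the variable names declared in a KEY=value style env file."""
    keys = []
    for line in Path(example_path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        keys.append(line.split("=", 1)[0].strip())
    return keys


def missing_keys(example_path, environ=None):
    """Return the declared variables that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [key for key in required_keys(example_path) if not env.get(key)]
```

Calling `missing_keys` with the path to '.env.example' returns an empty list when every declared variable is set; anything else is the list of names still to configure.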

Shortcomings

  • The system only allows 1 document to be uploaded
  • The system uses GGUF model files so that the models can run on the CPU
  • The system is separated into 2 individual parts (backend and frontend) -> Docker can solve this!

What's next?

  • Advanced RAG: RAG with external API
