This project is a full-stack web application that allows users to upload text-based content (from .txt, .md, or .pdf files, or direct text input) and ask natural language questions about it. The system uses a Generative AI model (configurable to be either Gemini or OpenAI) to generate answers that are strictly grounded in the uploaded content, thereby reducing the likelihood of hallucinations. This is achieved through Retrieval-Augmented Generation (RAG).
The application is built with a classic client-server architecture.
- Frontend: A single-page application (SPA) built with React. It provides the user interface for uploading documents and asking questions.
- Backend: A RESTful API built with Python and FastAPI. It handles the core logic, including document processing, embedding generation, and interaction with the AI models.
- Document Upload: The user uploads a document or pastes text via the React frontend. The content is sent to the `/api/upload` endpoint on the FastAPI backend.
- Processing & Embedding: The backend receives the content, splits it into manageable chunks, and uses an embedding model (from either Gemini or OpenAI) to convert each chunk into a vector embedding.
- Vector Storage: These embeddings are stored in-memory in a FAISS (Facebook AI Similarity Search) vector store for efficient similarity searching.
- Question Answering: The user submits a question through the frontend, which is sent to the `/api/ask` endpoint.
- Context Retrieval: The backend embeds the user's question and uses the FAISS vector store to find the most relevant text chunks from the original document (similarity search).
- Prompt Construction: The retrieved text chunks (context) and the user's question are combined into a single prompt for the language model. This prompt engineering guides the model to answer based on the provided text.
- AI Model Call: The backend sends the constructed prompt to the configured AI model (Gemini or OpenAI).
- Response Generation: The AI model generates an answer based on the context and question, which is then sent back to the frontend to be displayed to the user.
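The pipeline above can be sketched in plain Python. This is a toy illustration only: the hash-based `embed` function stands in for a real Gemini/OpenAI embedding model, and the cosine-similarity loop stands in for FAISS, so the function names and parameters here are illustrative, not the project's actual code.

```python
import hashlib
import math

def chunk_text(text, chunk_size=200):
    # Split the document into fixed-size character chunks (the real app
    # uses LangChain's text splitters, which are smarter about boundaries).
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(text, dim=64):
    # Toy stand-in for an embedding model: hash each word into a
    # fixed-size bag-of-words vector.
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    # Embed the question and rank chunks by similarity; FAISS does this
    # efficiently at scale, but plain cosine similarity shows the idea.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    # Prompt construction: instruct the model to answer only from context.
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

In the real application, `build_prompt`'s output is what gets sent to the configured model, which is how the answers stay grounded in the uploaded document.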
- Frontend:
- React (v18)
- Vite
- Axios
- CSS3
- Backend:
- Python 3.10+
- FastAPI
- Pydantic
- Uvicorn
- LangChain
- FAISS (for vector storage)
- OpenAI & Google Gemini APIs
- Language Models:
- Configurable to use either `gpt-3.5-turbo` (OpenAI) or `gemini-pro` (Google).
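Provider switching could be wired up with a small helper that reads `API_PROVIDER` from the environment. This is a sketch of one plausible approach; the mapping table and function name are assumptions, not the project's actual configuration code.

```python
import os

# Hypothetical mapping from provider name to model name and key variable;
# the real backend loads these values from the .env file described below.
MODELS = {
    "GEMINI": {"chat": "gemini-pro", "key_var": "GOOGLE_API_KEY"},
    "OPENAI": {"chat": "gpt-3.5-turbo", "key_var": "OPENAI_API_KEY"},
}

def resolve_provider():
    # Default to GEMINI, matching the sample .env configuration.
    provider = os.environ.get("API_PROVIDER", "GEMINI").upper()
    if provider not in MODELS:
        raise ValueError(f"Unsupported API_PROVIDER: {provider}")
    cfg = MODELS[provider]
    api_key = os.environ.get(cfg["key_var"], "")
    return provider, cfg["chat"], api_key
```

Failing fast on an unknown provider keeps misconfiguration errors close to startup rather than surfacing mid-request.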
- Node.js and npm
- Python 3.10+ and pip
- An API key from either Google (for Gemini) or OpenAI.
Create a `.env` file in the root of the project by copying `.env.example`:

```bash
cp .env.example .env
```

Edit the `.env` file with your credentials and configuration:
```env
# API Configuration
# Set the desired API provider: "GEMINI" or "OPENAI"
API_PROVIDER="GEMINI"

# API Keys
# Add your Google Gemini API key here
GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"

# Add your OpenAI API key here
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
```

- Navigate to the `backend` directory:

  ```bash
  cd backend
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

- Start the backend server:

  ```bash
  uvicorn app.main:app --reload
  ```

  The API will be available at `http://localhost:8000`.
- Navigate to the `frontend` directory:

  ```bash
  cd frontend
  ```

- Install the required packages:

  ```bash
  npm install
  ```

- Start the frontend development server:

  ```bash
  npm run dev
  ```

  The application will be available at `http://localhost:5173`.
Endpoint: `POST /api/upload`

Form-data with a file:

- `file`: (binary content of a `.txt`, `.md`, or `.pdf` file)

Form-data with text:

- `text`: "This is the content of the document."
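For illustration, both endpoints can be exercised with a small standard-library client. This is a sketch assuming the backend is running at its default local address (`http://localhost:8000`); the helper names are hypothetical, but the field names and paths follow the API spec in this section.

```python
import json
import urllib.request
import uuid

BASE = "http://localhost:8000"  # assumed local dev address from the setup steps

def build_upload_request(filename, content: bytes):
    # Build a multipart/form-data POST for /api/upload with a "file" field.
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + content + f"\r\n--{boundary}--\r\n".encode()
    req = urllib.request.Request(f"{BASE}/api/upload", data=body, method="POST")
    req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return req

def build_ask_request(question):
    # Build the JSON POST for /api/ask.
    payload = json.dumps({"question": question}).encode()
    req = urllib.request.Request(f"{BASE}/api/ask", data=payload, method="POST")
    req.add_header("Content-Type", "application/json")
    return req

# With the backend running, send either request via
# urllib.request.urlopen(req) and read the JSON response.
```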
Endpoint: `POST /api/ask`

Request body (`application/json`):

```json
{
  "question": "What is the main topic of the document?"
}
```

Response body:
```json
{
  "answer": "The main topic of the document is..."
}
```

- Persistent Storage: Integrate a persistent vector database such as Chroma or Milvus in place of the in-memory FAISS store.
- User Authentication: Add user accounts to manage and isolate documents.
- Chat History: Implement a feature to save and view the history of questions and answers for a given document.
- Dockerization: Containerize the frontend and backend applications for easier deployment.
- More File Types: Add support for more document formats such as `.docx`, `.pptx`, etc.
- Priyank
- GitHub: 13priyaank