This project is a full-stack web application that allows users to upload text-based content (from .txt, .md, or .pdf files, or direct text input) and ask natural language questions about it. The system uses a Generative AI model (configurable to be either Gemini or OpenAI) to generate answers that are strictly grounded in the uploaded content, thereby reducing the likelihood of hallucinations. This is achieved through Retrieval-Augmented Generation (RAG).
The application is built with a classic client-server architecture.
- Frontend: A single-page application (SPA) built with React. It provides the user interface for uploading documents and asking questions.
- Backend: A RESTful API built with Python and FastAPI. It handles the core logic, including document processing, embedding generation, and interaction with the AI models.
- Document Upload: The user uploads a document or pastes text via the React frontend. The content is sent to the `/api/upload` endpoint on the FastAPI backend.
- Processing & Embedding: The backend receives the content, splits it into manageable chunks, and uses an embedding model (from either Gemini or OpenAI) to convert each chunk into a vector embedding.
- Vector Storage: These embeddings are stored in-memory in a FAISS (Facebook AI Similarity Search) vector store for efficient similarity searching.
- Question Answering: The user submits a question through the frontend, which is sent to the `/api/ask` endpoint.
- Context Retrieval: The backend embeds the user's question and uses the FAISS vector store to find the most relevant text chunks from the original document (similarity search).
- Prompt Construction: The retrieved text chunks (context) and the user's question are combined into a single prompt for the language model. This prompt engineering guides the model to answer based on the provided text.
- AI Model Call: The backend sends the constructed prompt to the configured AI model (Gemini or OpenAI).
- Response Generation: The AI model generates an answer based on the context and question, which is then sent back to the frontend to be displayed to the user.
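The pipeline above can be sketched in plain Python. This is a toy illustration only: the hash-based `embed` function stands in for a real Gemini/OpenAI embedding model, and the cosine-similarity loop stands in for FAISS, so the function names and parameters here are illustrative, not the project's actual code.

```python
import hashlib
import math

def chunk_text(text, chunk_size=200):
    # Split the document into fixed-size character chunks (the real app
    # uses LangChain's text splitters, which are smarter about boundaries).
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def embed(text, dim=64):
    # Toy stand-in for an embedding model: hash each word into a
    # fixed-size bag-of-words vector.
    vec = [0.0] * dim
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, k=2):
    # Embed the question and rank chunks by similarity; FAISS does this
    # efficiently at scale, but plain cosine similarity shows the idea.
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    # Prompt construction: instruct the model to answer only from context.
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using ONLY the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

In the real application, `build_prompt`'s output is what gets sent to the configured model, which is how the answers stay grounded in the uploaded document.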
- Frontend:
- React (v18)
- Vite
- Axios
- CSS3
- Backend:
- Python 3.10+
- FastAPI
- Pydantic
- Uvicorn
- LangChain
- FAISS (for vector storage)
- OpenAI & Google Gemini APIs
- Language Models:
- Configurable to use either `gpt-3.5-turbo` (OpenAI) or `gemini-pro` (Google).
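Provider switching could be wired up with a small helper that reads `API_PROVIDER` from the environment. This is a sketch of one plausible approach; the mapping table and function name are assumptions, not the project's actual configuration code.

```python
import os

# Hypothetical mapping from provider name to model name and key variable;
# the real backend loads these values from the .env file described below.
MODELS = {
    "GEMINI": {"chat": "gemini-pro", "key_var": "GOOGLE_API_KEY"},
    "OPENAI": {"chat": "gpt-3.5-turbo", "key_var": "OPENAI_API_KEY"},
}

def resolve_provider():
    # Default to GEMINI, matching the sample .env configuration.
    provider = os.environ.get("API_PROVIDER", "GEMINI").upper()
    if provider not in MODELS:
        raise ValueError(f"Unsupported API_PROVIDER: {provider}")
    cfg = MODELS[provider]
    api_key = os.environ.get(cfg["key_var"], "")
    return provider, cfg["chat"], api_key
```

Failing fast on an unknown provider keeps misconfiguration errors close to startup rather than surfacing mid-request.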
- Node.js and npm
- Python 3.10+ and pip
- An API key from either Google (for Gemini) or OpenAI.
Create a `.env` file in the root of the project by copying `.env.example`:

```bash
cp .env.example .env
```

Edit the `.env` file with your credentials and configuration:
```env
# API Configuration
# Set the desired API provider: "GEMINI" or "OPENAI"
API_PROVIDER="GEMINI"

# API Keys
# Add your Google Gemini API key here
GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"

# Add your OpenAI API key here
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
```

- Navigate to the `backend` directory:

  ```bash
  cd backend
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

- Start the backend server:

  ```bash
  uvicorn app.main:app --reload
  ```

  The API will be available at `http://localhost:8000`.
- Navigate to the `frontend` directory:

  ```bash
  cd frontend
  ```

- Install the required packages:

  ```bash
  npm install
  ```

- Start the frontend development server:

  ```bash
  npm run dev
  ```

  The application will be available at `http://localhost:5173`.
Endpoint: `POST /api/upload`

Form-data with a file:

- `file`: (binary content of a `.txt`, `.md`, or `.pdf` file)

Form-data with text:

- `text`: "This is the content of the document."
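For illustration, both endpoints can be exercised with a small standard-library client. This is a sketch assuming the backend is running at its default local address (`http://localhost:8000`); the helper names are hypothetical, but the field names and paths follow the API spec in this section.

```python
import json
import urllib.request
import uuid

BASE = "http://localhost:8000"  # assumed local dev address from the setup steps

def build_upload_request(filename, content: bytes):
    # Build a multipart/form-data POST for /api/upload with a "file" field.
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + content + f"\r\n--{boundary}--\r\n".encode()
    req = urllib.request.Request(f"{BASE}/api/upload", data=body, method="POST")
    req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return req

def build_ask_request(question):
    # Build the JSON POST for /api/ask.
    payload = json.dumps({"question": question}).encode()
    req = urllib.request.Request(f"{BASE}/api/ask", data=payload, method="POST")
    req.add_header("Content-Type", "application/json")
    return req

# With the backend running, send either request via
# urllib.request.urlopen(req) and read the JSON response.
```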
Endpoint: `POST /api/ask`

Request body (`application/json`):

```json
{
  "question": "What is the main topic of the document?"
}
```

Response body:
```json
{
  "answer": "The main topic of the document is..."
}
```

- Persistent Storage: Integrate a persistent vector database such as Chroma or Milvus in place of the in-memory FAISS store.
- User Authentication: Add user accounts to manage and isolate documents.
- Chat History: Implement a feature to save and view the history of questions and answers for a given document.
- Dockerization: Containerize the frontend and backend applications for easier deployment.
- More File Types: Add support for more document formats such as `.docx`, `.pptx`, etc.
- Priyank
- GitHub: 13priyaank