13priyaank/param-tech-task

AI-Powered Document Question Answering System

Project Overview

This project is a full-stack web application that lets users upload text-based content (.txt, .md, or .pdf files, or pasted text) and ask natural language questions about it. The system uses a generative AI model (configurable as either Gemini or OpenAI) to produce answers that are strictly grounded in the uploaded content, reducing the likelihood of hallucinations. This is achieved through Retrieval-Augmented Generation (RAG).

Architecture

The application is built with a classic client-server architecture.

  • Frontend: A single-page application (SPA) built with React. It provides the user interface for uploading documents and asking questions.
  • Backend: A RESTful API built with Python and FastAPI. It handles the core logic, including document processing, embedding generation, and interaction with the AI models.

Data Flow

  1. Document Upload: The user uploads a document or pastes text via the React frontend. The content is sent to the /api/upload endpoint on the FastAPI backend.
  2. Processing & Embedding: The backend receives the content, splits it into manageable chunks, and uses an embedding model (from either Gemini or OpenAI) to convert each chunk into a vector embedding.
  3. Vector Storage: These embeddings are stored in-memory in a FAISS (Facebook AI Similarity Search) vector store for efficient similarity searching.
  4. Question Answering: The user submits a question through the frontend, which is sent to the /api/ask endpoint.
  5. Context Retrieval: The backend embeds the user's question and uses the FAISS vector store to find the most relevant text chunks from the original document (similarity search).
  6. Prompt Construction: The retrieved text chunks (context) and the user's question are combined into a single prompt for the language model. This prompt engineering guides the model to answer based on the provided text.
  7. AI Model Call: The backend sends the constructed prompt to the configured AI model (Gemini or OpenAI).
  8. Response Generation: The AI model generates an answer based on the context and question, which is then sent back to the frontend to be displayed to the user.
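The retrieval steps above (chunking, similarity search, prompt construction) can be sketched in plain Python. The real backend uses LangChain with FAISS and a provider embedding model; the toy cosine similarity and the function names here (chunk_text, retrieve, build_prompt) are illustrative stand-ins, not the project's actual API.

```python
import math

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split a document into overlapping character chunks (step 2)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question_vec, chunk_vecs, chunks, k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question vector (step 5)."""
    scored = sorted(zip(chunks, chunk_vecs),
                    key=lambda cv: cosine(question_vec, cv[1]),
                    reverse=True)
    return [chunk for chunk, _ in scored[:k]]

def build_prompt(context_chunks: list[str], question: str) -> str:
    """Combine retrieved context and the question into one prompt (step 6)."""
    context = "\n\n".join(context_chunks)
    return ("Answer the question using ONLY the context below. "
            "If the answer is not in the context, say so.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

In the actual pipeline the vectors come from the configured embedding model and the nearest-neighbor search is delegated to FAISS, but the grounding idea is the same: only the retrieved chunks reach the language model.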

Tech Stack

  • Frontend:
    • React (v18)
    • Vite
    • Axios
    • CSS3
  • Backend:
    • Python 3.10+
    • FastAPI
    • Pydantic
    • Uvicorn
    • LangChain
    • FAISS (for vector storage)
    • OpenAI & Google Gemini APIs
  • Language Models:
    • Configurable to use either gpt-3.5-turbo (OpenAI) or gemini-pro (Google).

Setup and Installation

Prerequisites

  • Node.js and npm
  • Python 3.10+ and pip
  • An API key from either Google (for Gemini) or OpenAI.

Environment Variables

Create a .env file in the root of the project by copying the .env.example:

cp .env.example .env

Edit the .env file with your credentials and configuration:

# API Configuration
# Set the desired API provider: "GEMINI" or "OPENAI"
API_PROVIDER="GEMINI"

# API Keys
# Add your Google Gemini API key here
GOOGLE_API_KEY="YOUR_GOOGLE_API_KEY"

# Add your OpenAI API key here
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
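A minimal sketch of how the backend might honor these variables at startup; the project's actual configuration code is not shown in this README, so resolve_provider and the error messages here are hypothetical.

```python
import os

# Maps each supported provider to the env var holding its API key.
SUPPORTED = {"GEMINI": "GOOGLE_API_KEY", "OPENAI": "OPENAI_API_KEY"}

def resolve_provider() -> tuple[str, str]:
    """Return (provider, api_key) based on the .env configuration."""
    provider = os.environ.get("API_PROVIDER", "GEMINI").upper()
    if provider not in SUPPORTED:
        raise ValueError(f"API_PROVIDER must be one of {sorted(SUPPORTED)}, got {provider!r}")
    key = os.environ.get(SUPPORTED[provider], "")
    if not key:
        raise RuntimeError(f"{SUPPORTED[provider]} is not set")
    return provider, key
```

Failing fast on a missing key at startup is preferable to discovering it on the first /api/upload call.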

Backend Setup

  1. Navigate to the backend directory:
    cd backend
  2. Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  3. Install the required packages:
    pip install -r requirements.txt
  4. Start the backend server:
    uvicorn app.main:app --reload
    The API will be available at http://localhost:8000.

Frontend Setup

  1. Navigate to the frontend directory:
    cd frontend
  2. Install the required packages:
    npm install
  3. Start the frontend development server:
    npm run dev
    The application will be available at http://localhost:5173.

Sample API Requests

Upload a Document

Endpoint: POST /api/upload

Form-Data with a file:

file: (binary content of a .txt, .md, or .pdf file)

Form-Data with text:

text: "This is the content of the document."

Ask a Question

Endpoint: POST /api/ask

Request Body (application/json):

{
  "question": "What is the main topic of the document?"
}

Response Body:

{
  "answer": "The main topic of the document is..."
}
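The two requests above can be exercised from a small Python client. The endpoints and payloads match the documentation above; sending pasted text as form data exactly as shown is an assumption about the backend, and the helper names here are illustrative.

```python
import json

BASE_URL = "http://localhost:8000"  # backend from the setup steps above

def upload_request(text: str) -> tuple[str, dict]:
    """Build the URL and form data for POST /api/upload with pasted text."""
    return f"{BASE_URL}/api/upload", {"text": text}

def ask_request(question: str) -> tuple[str, str]:
    """Build the URL and JSON body for POST /api/ask."""
    return f"{BASE_URL}/api/ask", json.dumps({"question": question})

if __name__ == "__main__":
    # Requires the backend running locally and `pip install requests`.
    import requests
    url, data = upload_request("This is the content of the document.")
    print(requests.post(url, data=data).json())
    url, body = ask_request("What is the main topic of the document?")
    print(requests.post(url, data=body,
                        headers={"Content-Type": "application/json"}).json())
```

For file uploads, the same endpoint would take the file as multipart form data instead of the text field.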

Future Enhancements

  • Persistent Storage: Integrate a persistent vector database like Chroma or Milvus instead of the in-memory FAISS store.
  • User Authentication: Add user accounts to manage and isolate documents.
  • Chat History: Implement a feature to save and view the history of questions and answers for a given document.
  • Dockerization: Containerize the frontend and backend applications for easier deployment.
  • More File Types: Add support for more document formats like .docx, .pptx, etc.
