🤖 Chat with your documents using Large Language Models (LLMs), FAISS, and Hugging Face Transformers, all running locally with Streamlit.

ramarav/DocuMind-LLM

🧠 DocuMind LLM: Intelligent Document Q&A Assistant


DocuMind LLM is a Generative AI-powered document assistant built with Hugging Face Transformers, FAISS, and LangChain.
It allows users to upload PDF files, intelligently index their contents, and ask natural language questions about the document.


🚀 Features

  • 📄 PDF Upload & Parsing: Extracts text and chunks it for semantic understanding.
  • 🤖 LLM-powered Q&A: Uses a Transformer model (e.g., mistralai/Mixtral or google/flan-t5) to answer questions.
  • ⚡ FAISS-based Vector Search: Enables fast and accurate document retrieval.
  • 💬 Conversational Memory: Keeps track of your recent queries for context-aware responses.
  • 🧩 Modular Architecture: Easy to extend with other models, vector stores, or APIs.
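
The chunking step mentioned in the first feature can be sketched as follows. This is a minimal illustration, not the code in utils/pdf_loader.py: the function name `chunk_text`, the character-based windows, and the default `chunk_size` and `overlap` values are all assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip whitespace-only windows
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval. A real pipeline might split on sentences or tokens instead of raw characters.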

🧰 Tech Stack

| Component    | Technology                          |
|--------------|-------------------------------------|
| Embeddings   | Hugging Face Sentence Transformers  |
| Vector Store | FAISS                               |
| LLM          | Hugging Face Transformers           |
| Interface    | Streamlit / Flask                   |
| Backend      | Python 3.10+                        |
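
To show how the embedding and vector-store components fit together, here is a dependency-free sketch of the retrieval step. It is a stand-in, not the project's implementation: the toy `embed` function (hashed bag-of-words) replaces a Sentence Transformers model, and the hypothetical `FlatIndex` class does the kind of exact, brute-force inner-product search that a flat FAISS index such as `IndexFlatIP` performs.

```python
import hashlib
import math
import re

DIM = 256  # assumed toy dimensionality; real sentence embeddings are usually larger


def embed(text: str, dim: int = DIM) -> list[float]:
    """Toy embedding: hashed bag-of-words, L2-normalised (stand-in for a real model)."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class FlatIndex:
    """Brute-force inner-product index (hypothetical stand-in for a flat FAISS index)."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.chunks: list[str] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self.vectors.append(embed(chunk))
            self.chunks.append(chunk)

    def search(self, query: str, k: int = 3) -> list[tuple[float, str]]:
        """Return up to k (score, chunk) pairs, highest score first."""
        q = embed(query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk)
            for vec, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = FlatIndex()
    index.add([
        "FAISS stores vectors for similarity search.",
        "Streamlit renders the web interface.",
        "The license is MIT.",
    ])
    for score, chunk in index.search("similarity search with vectors", k=2):
        print(f"{score:.2f}  {chunk}")
```

Because the vectors are L2-normalised, the inner product equals cosine similarity. Swapping in real embeddings and a FAISS index would change the two components but not the overall flow of add, then search.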

βš™οΈ Installation

git clone https://github.com/ramarav/DocuMind-LLM.git
cd DocuMind-LLM
pip install -r requirements.txt

🧠 Usage

1️⃣ Start the App

python app.py

2️⃣ Upload a PDF file

Choose any .pdf document you want to query.

3️⃣ Ask Questions

Type natural language questions like:

“What are the main topics covered in this document?”
“Summarize section 3.”
“What are the key takeaways?”


📚 Example Use Cases

  • Research paper summarization
  • Legal contract question answering
  • Technical documentation assistant
  • Corporate report analysis
  • AI-based knowledge discovery

πŸ§‘β€πŸ’» Folder Structure

DocuMind-LLM/
│
├── app.py                # Main entry point
├── utils/                # Helper scripts
│   ├── pdf_loader.py
│   ├── embedder.py
│   ├── vector_store.py
│   └── qa_engine.py
├── sample.pdf            # Example document
├── requirements.txt
└── README.md

🪄 Future Enhancements

  • Add chat history memory using LangChain.
  • Integrate OpenAI API for comparison.
  • Enable multi-file document search.
  • Add semantic summarization features.

πŸ† Credits

Developed by Mekala Ramarao
πŸ’Ό AMD India | AI/ML Engineer
πŸ”— LinkedIn β€’ GitHub

📜 License

This project is licensed under the MIT License.
