Local AI Research Engine

Local AI Research Engine is a modular RAG system that runs 100% locally on your machine. It combines vector search, keyword (BM25) search, and knowledge-graph signals with Reciprocal Rank Fusion (RRF), built on Ollama for local inference, with LLM-based semantic chunking, citation mapping, and an asynchronous ingestion pipeline. By pairing semantic search with a knowledge graph, it helps researchers analyze complex document libraries without compromising privacy, and it supports automated paper comparisons, contradiction detection, and publication-ready literature reviews.
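
The fusion step is standard Reciprocal Rank Fusion: each retriever returns a ranked list, and a document's fused score is the sum of 1/(k + rank) over the lists it appears in, where k is a smoothing constant (60 by convention). A minimal sketch of the idea in Python; the function and variable names below are illustrative, not this project's actual API:

# Reciprocal Rank Fusion: merge ranked result lists from several retrievers.
# Illustrative sketch only; names do not reflect this repository's real code.
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Each ranked list is a sequence of document IDs, best first.
    Returns document IDs sorted by fused RRF score, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # earlier rank -> larger share
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse vector, BM25, and graph-traversal rankings.
fused = reciprocal_rank_fusion([
    ["doc3", "doc1", "doc7"],   # vector search
    ["doc1", "doc3", "doc4"],   # BM25 keyword search
    ["doc7", "doc1"],           # knowledge-graph neighbors
])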

Features

  • Document Intelligence: Ingest PDFs, markdown, code, and text files
  • Knowledge Graph: Automatically extract entities and relationships (see the sketch after this list)
  • Hybrid Retrieval: Vector search + keyword search + graph traversal
  • Cited Answers: Every answer includes source citations
  • Multi-Document Reasoning: Synthesize information across multiple sources
  • Advanced Analysis: Paper comparisons, literature reviews, and contradiction detection
  • 100% Local: Runs entirely on Ollama, with no external API calls and complete privacy
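
The graph construction behind the Knowledge Graph feature can be approximated with a single structured-output call to the local model. A minimal sketch using the ollama Python client; the prompt, JSON shape, and parsing are illustrative assumptions, not this repository's actual implementation:

# Illustrative entity/relationship extraction via a local LLM.
# The prompt and parsing are assumptions, not this repository's real code.
import json
import ollama

PROMPT = (
    "Extract entities and relationships from the text below. Respond only "
    'with JSON of the form {"triples": [["subject", "relation", "object"]]}.'
)

def extract_triples(text, model="mistral:latest"):
    # format="json" nudges the model toward valid JSON; the fallback
    # still guards against malformed replies.
    result = ollama.generate(
        model=model, prompt=PROMPT + "\n\nText:\n" + text, format="json"
    )
    try:
        return json.loads(result["response"])["triples"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return []  # unparseable reply: contribute nothing to the graph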

Prerequisites

  1. Python 3.9+
  2. Ollama installed and running
  3. Required Models:
    ollama pull mistral:latest
    ollama pull nomic-embed-text
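
You can confirm that both models are available with:

    ollama list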

Installation

  1. Clone or download this repository
  2. Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
    pip install -r requirements.txt
  4. Copy .env.example to .env and configure if needed

Quick Start

  1. Add documents to data/documents/
  2. Run the Streamlit UI:
    streamlit run ui/streamlit_app.py
  3. Or use the CLI:
    python main.py

Project Structure

local-research-engine/
├── ingest/          # Document loading and chunking
├── index/           # Vector store, keyword index, knowledge graph
├── retrieval/       # Hybrid search and reranking
├── llm/             # Ollama client and prompts
├── ui/              # Streamlit interface
├── data/            # Documents and indexes
└── main.py          # CLI interface

Usage

Upload Documents

Place your documents in data/documents/ or use the Streamlit upload interface.

Ask Questions

The system will:

  1. Retrieve relevant chunks using hybrid search
  2. Expand context using the knowledge graph
  3. Rerank the evidence with the LLM (see the sketch after this list)
  4. Generate an answer with inline citations
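
The reranking step, for instance, can be as simple as asking the local model to score each candidate chunk for relevance. A hedged sketch; the prompt, scoring scale, and function name are illustrative, not the project's actual rerank logic:

# Illustrative LLM reranking of retrieved chunks; not this repository's
# actual implementation.
import ollama

def llm_rerank(question, chunks, keep=8, model="mistral:latest"):
    """Score each chunk 0-10 for relevance and keep the highest scorers."""
    scored = []
    for chunk in chunks:
        prompt = (
            f"Question: {question}\n\nPassage: {chunk}\n\n"
            "On a scale of 0 to 10, how relevant is the passage to the "
            "question? Reply with a single integer."
        )
        reply = ollama.generate(model=model, prompt=prompt)["response"]
        digits = "".join(ch for ch in reply if ch.isdigit())
        score = int(digits[:2]) if digits else 0  # tolerate chatty replies
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:keep]]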

Example Query

Q: "What is the EM algorithm used for?"

A: The EM algorithm is used to estimate HMM parameters by iteratively maximizing the expected log-likelihood [Rabiner1989.pdf §4]. Unlike gradient-based methods, EM guarantees non-decreasing likelihood [Bishop.pdf §9.2].

Configuration

Edit config.yaml to customize:

  • Chunk sizes
  • Retrieval parameters
  • Model selection
  • Storage paths
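
An illustrative config.yaml might look like the following; every key name here is an assumption about the schema, so check the shipped file for the real options:

chunking:
  chunk_size: 512        # characters or tokens per chunk (assumed key)
  chunk_overlap: 64
retrieval:
  vector_top_k: 20       # candidates per retriever before RRF fusion
  keyword_top_k: 20
  rrf_k: 60              # RRF smoothing constant
models:
  llm: mistral:latest
  embedding: nomic-embed-text
storage:
  documents_dir: data/documents
  index_dir: data/indexes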

License

MIT License. See the LICENSE file for details.
