saiprasadchary/ModularKA

Modular Knowledge Assistant

Modular Knowledge Assistant is a Streamlit application for exploring research papers with retrieval-augmented generation. It can parse a PDF or open-access paper link, build a local vector index, answer questions grounded in the paper, summarize the content, and generate code-oriented explanations from relevant sections.

Features

  • Upload a PDF or provide an arXiv/open-access URL or DOI.
  • Parse and chunk paper text for retrieval.
  • Build a local Chroma vector store with sentence-transformer embeddings.
  • Ask paper-specific questions through a RAG assistant.
  • Choose between response styles: empathetic or strictly objective.
  • Generate summaries tailored to the selected user background.
  • Request code snippets or implementation guidance based on paper context.
  • Switch between Groq-hosted models and local Ollama models.
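The "parse and chunk" step above can be pictured with a minimal sketch. It assumes fixed-size character windows with a small overlap; the function name `chunk_text` and the default sizes are hypothetical and not taken from the project's `core/` code, which may split on sentences or tokens instead.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows.

    chunk_size and overlap are illustrative defaults, not the app's real values.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Collapse runs of whitespace so PDF line breaks do not waste chunk space.
    normalized = " ".join(text.split())
    step = chunk_size - overlap

    chunks: list[str] = []
    for start in range(0, len(normalized), step):
        chunks.append(normalized[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(normalized):
            break
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval find it.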

Project Structure

.
|-- app.py                  # Main Streamlit application
|-- core/                   # Parsing, retrieval, LLM setup, and agent chains
|-- ui/                     # Sidebar components, views, and branding
|-- scripts/                # Utility scripts
|-- requirements.txt        # Python dependencies
|-- .env.example            # Safe configuration template
`-- .gitignore              # Local secrets/cache exclusions

Requirements

  • Python 3.11 or newer is recommended.
  • A Groq API key for hosted model usage.
  • Optional: Ollama installed locally if you want to run local models.

Setup

  1. Create and activate a virtual environment:
     python3 -m venv .venv
     source .venv/bin/activate
  2. Install dependencies:
     pip install -r requirements.txt
  3. Create your local environment file:
     cp .env.example .env
  4. Fill in .env with your own values:
     GROQ_API_KEY=your_real_key_here
     UNPAYWALL_EMAIL=your_email@example.com
     USE_OLLAMA=0

Do not commit .env. It is intentionally ignored because it can contain API keys and local machine settings.

Running the App

streamlit run app.py

Then open the Streamlit URL shown in your terminal, upload a paper or enter a supported source, and click Analyze Paper.

Configuration

The app reads configuration from .env using python-dotenv.

Variable              Purpose
GROQ_API_KEY          API key used for Groq-backed LLM calls.
UNPAYWALL_EMAIL       Email used when resolving DOI/open-access metadata.
GROQ_GENERAL_MODEL    General-purpose Groq model for answers and summaries.
GROQ_CODE_MODEL       Groq model used for code generation.
PERSIST_DIRECTORY     Local directory for vector-store persistence.
USE_OLLAMA            Set to 1 to prefer local Ollama models, or 0 for Groq.
OLLAMA_BASE_URL       Local Ollama server URL.
OLLAMA_GENERAL_MODEL  Ollama model for general responses.
OLLAMA_CODER_MODEL    Ollama model for code-oriented responses.
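As a rough illustration of how these variables might fit together, the sketch below reads them from a mapping and picks the Groq or Ollama model names based on USE_OLLAMA. The names `Settings` and `load_settings` are hypothetical, and so are the fallback values: `http://localhost:11434` is Ollama's usual local address, and `.chroma` is a placeholder directory. The rule that a Groq key is required unless USE_OLLAMA=1 is an assumption drawn from the Requirements section. The app's actual logic may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    use_ollama: bool
    groq_api_key: Optional[str]
    general_model: Optional[str]
    code_model: Optional[str]
    ollama_base_url: str
    persist_directory: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment-style variables.

    In the app, python-dotenv would typically populate os.environ from .env
    before a function like this runs.
    """
    use_ollama = env.get("USE_OLLAMA", "0").strip() == "1"

    if use_ollama:
        general = env.get("OLLAMA_GENERAL_MODEL")
        code = env.get("OLLAMA_CODER_MODEL")
    else:
        general = env.get("GROQ_GENERAL_MODEL")
        code = env.get("GROQ_CODE_MODEL")

    api_key = env.get("GROQ_API_KEY")
    # Assumed rule: hosted mode cannot work without a Groq key.
    if not use_ollama and not api_key:
        raise RuntimeError("GROQ_API_KEY is required unless USE_OLLAMA=1")

    return Settings(
        use_ollama=use_ollama,
        groq_api_key=api_key,
        general_model=general,
        code_model=code,
        # Assumed fallbacks, not confirmed defaults of the app.
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        persist_directory=env.get("PERSIST_DIRECTORY", ".chroma"),
    )
```

Passing a plain dict instead of `os.environ` makes it easy to try different configurations without touching .env.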

Security Notes

  • Keep real secrets only in .env or your deployment platform's secret manager.
  • .env, virtual environments, generated bytecode, and local vector storage are ignored by git.
  • .env.example should contain placeholders only.
  • If an API key was ever committed, treat it as compromised: revoke it in the provider dashboard and use a newly issued key instead.
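One way to enforce the "placeholders only" rule is a small check run before committing. The sketch below is a heuristic, not part of the project: the function name `find_suspicious_values`, the secret-name suffixes, and the placeholder hints are all assumptions chosen for illustration.

```python
# Substrings that suggest a value is a placeholder rather than a real secret.
PLACEHOLDER_HINTS = ("your_", "example", "changeme")


def find_suspicious_values(
    env_text: str,
    secret_suffixes: tuple[str, ...] = ("_KEY", "_TOKEN", "_SECRET"),
) -> list[str]:
    """Return names of secret-like variables whose values do not look like placeholders."""
    suspicious: list[str] = []
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines that are not NAME=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        name = name.strip()
        value = value.strip().strip("'\"")
        if not name.upper().endswith(secret_suffixes):
            continue
        # An empty value is safe; a non-empty one must contain a placeholder hint.
        if value and not any(hint in value.lower() for hint in PLACEHOLDER_HINTS):
            suspicious.append(name)
    return suspicious
```

For example, calling it on the contents of .env.example and failing the commit when the returned list is non-empty would catch a real key pasted into the template by mistake.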

Development

Before publishing changes, run a quick syntax check:

python3 -m py_compile app.py core/*.py ui/*.py
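The command above relies on the shell expanding `core/*.py` and `ui/*.py`. Where that expansion is unavailable, a rough equivalent can be written with the standard-library `py_compile` module. The function name `check_syntax` is hypothetical, and the file layout is taken from the Project Structure section.

```python
import py_compile
from pathlib import Path


def check_syntax(root: Path = Path(".")) -> list[str]:
    """Byte-compile app.py, core/*.py, and ui/*.py; return one message per failure."""
    targets = [
        root / "app.py",
        *sorted((root / "core").glob("*.py")),
        *sorted((root / "ui").glob("*.py")),
    ]
    errors: list[str] = []
    for path in targets:
        if not path.exists():
            continue
        try:
            # doraise=True turns syntax errors into PyCompileError exceptions.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            errors.append(f"{path}: {exc.msg}")
    return errors


if __name__ == "__main__":
    problems = check_syntax()
    for message in problems:
        print(message)
    raise SystemExit(1 if problems else 0)
```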

For local experiments, prefer changing .env instead of hard-coding credentials or model choices into source files.
