Awesome Private AI

Curated list of tools, frameworks, and resources for running, building, and deploying AI privately — on-prem, air-gapped, or self-hosted.

Private AI enables you to keep your data, models, and infrastructure under your control, avoiding unnecessary exposure to third parties. This list covers inference runtimes, model management, privacy tools, and more.

Inference Runtimes & Backends

Engines and frameworks to run LLMs, vision, and multimodal models locally.

vLLM - High-throughput, low-latency inference engine for LLMs.
mlx-lm - Fast, Apple Silicon-optimized LLM inference engine for running models locally and privately.
Jan - Privacy-first, offline AI assistant and LLM runtime for local, secure inference.
LM Studio - Cross-platform desktop app for running local LLMs with an easy-to-use interface.
llm-d - Kubernetes-native distributed LLM inference stack for scalable, self-hosted deployments.
Ollama - Local LLM runner with model packaging.
llama.cpp - Portable C/C++ inference for LLaMA and other GGUF-format models on CPU and GPU.
text-generation-inference - Optimized serving stack from Hugging Face.
GPT4All - Local desktop model runner.

Model Management & Serving

Tools for hosting, scaling, and versioning AI models privately.

Ray Serve - Scalable Python model serving.
Seldon Core - Kubernetes-native model deployment.
KServe - Serverless model inference on Kubernetes.
BentoML - Model packaging & serving framework.
vLLM Production Stack - End-to-end stack for deploying vLLM in production, including orchestration, monitoring, autoscaling, and best practices for private LLM serving.
OME (Open Model Engine) - Unified, open-source engine for serving, managing, and scaling LLMs and multimodal models privately. Supports SGLang, vLLM, and more.

Fine-Tuning & Adapters

Private workflows for adapting models to your needs.

LoRA - Low-rank adaptation technique.
PEFT - Parameter-efficient fine-tuning.
QLoRA - Memory-efficient LoRA on quantized models.
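
For readers new to the technique, the low-rank idea behind LoRA can be sketched in a few lines. This is a minimal illustration in plain Python, not the PEFT API: the helper names (`matmul`, `lora_merge`, `trainable_params`) and the toy matrices are assumptions made for the example. The frozen weight W stays untouched while two small matrices B and A are trained, and the effective weight is W + (alpha / r) * B A, where r is the adapter rank.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_merge(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(A)            # A has shape (r, d_in)
    scale = alpha / r
    delta = matmul(B, A)  # shape (d_out, d_in), same as W
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Parameter counts as (full fine-tuning, LoRA adapter)."""
    return d_out * d_in, r * (d_out + d_in)


# Toy example (hypothetical values): a frozen 2x3 weight and a rank-1 adapter.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # shape (1, 3)
B = [[0.5], [0.0]]      # shape (2, 1)
merged = lora_merge(W, A, B, alpha=2.0)
```

For a 4096 x 4096 layer with r = 8, `trainable_params` reports 16,777,216 weights for full fine-tuning against 65,536 for the adapter, which is why these methods fit on modest private hardware.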

Vector Databases & Embeddings

Private semantic search & retrieval-augmented generation.

Milvus - Scalable vector database.
Weaviate - Open-source vector database with semantic search.
Chroma - Local-first vector database.
FAISS - Facebook AI Similarity Search, a library for efficient similarity search and clustering of dense vectors.
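
To illustrate what these systems do at their core, here is a minimal brute-force cosine-similarity search in plain Python. It is a sketch under stated assumptions: the function names and the three-dimensional toy embeddings are hypothetical, and it uses none of the libraries above, which rely on specialized indexes to scale well beyond what an exhaustive scan can handle.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query, corpus, k=2):
    """Return the ids of the k corpus vectors most similar to the query."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; in practice these come from an embedding model.
corpus = {
    "contract": [0.9, 0.1, 0.0],
    "invoice": [0.8, 0.3, 0.1],
    "recipe": [0.0, 0.2, 0.9],
}
hits = top_k([1.0, 0.0, 0.0], corpus, k=2)
```

Because both the vectors and the search stay in process, nothing leaves the machine, which is the property the tools in this section preserve at larger scale.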

Agents & Orchestration

Frameworks for chaining private AI tools & agents.

LangChain - Agent and LLM orchestration framework.
Haystack - End-to-end RAG pipelines.
Flowise - No-code LangChain UI.
LlamaIndex - Data framework for LLM apps.
Trae Agent - Open-source LLM-based agent for software engineering tasks, operated from the command line.
Qwen-Agent - Open-source framework for building agents and LLM applications on Qwen models, with tool use, planning, and memory.
Crush - Privacy-first, open-source agentic coding and automation platform for local AI workflows.
OpenCode AI - Open-source agentic coding platform for private, local, and secure AI-powered development workflows.

VS Code Plugins & Extensions

Privacy-first, open-source agentic coding plugins and extensions for VS Code and other editors.

Roo Code - Privacy-first, open-source agentic coding platform for secure, local AI development (VS Code extension).
cline - Privacy-first, open-source agentic coding platform for local AI workflows and automation (VS Code extension).

Privacy, Security & Governance

Keep AI deployments secure and compliant.

BlindAI - Confidential AI inference using TEEs.
OpenFL - Federated learning framework.
Flower - Federated learning at scale.
Concrete - Fully homomorphic encryption for AI.

Models for Private Deployment

Open-weight models and model libraries you can self-host.

LLaMA - Meta’s open-weight language models.
Mistral - Open-weight models from Mistral AI.
Phi - Small, high-quality models from Microsoft.
Mixtral - Sparse mixture-of-experts model from Mistral AI.
Falcon - Open-weight models from the Technology Innovation Institute (TII).
MLX Community - Community-driven Hugging Face page for open MLX models, optimized for Apple Silicon and private deployment.

UI & Interaction Layers

Self-hosted chat & AI frontends.

Chatbot UI - Open-source ChatGPT clone.
LibreChat - Enhanced web UI for LLMs.
AnythingLLM - Full-stack private LLM workspace.

Datasets & Data Prep

Create and manage private training corpora.

OpenWebText - Open-source reproduction of OpenAI's WebText dataset, which was used to train GPT-2.
RedPajama - Open LLM training dataset.
Datamixers - Privacy-focused data preprocessing tools.

Learning Resources & Research

Guides, papers, and tutorials on private AI.

#TODO

AI Routers & API Aggregators

Centralized routers and proxy layers for aggregating, governing, and securing your private AI stack. These tools simplify connections to multiple model servers, optimize LLM routing, and provide observability, security, and compliance.

Nexus - Open-source AI router to aggregate Model Context Protocol (MCP) servers, intelligently route requests to the best LLMs, and provide security, governance, observability, and simplified architecture for private AI deployments. Blog

Contributing

Contributions are welcome! Open a pull request to suggest a new tool or section.

License

This list is licensed under CC BY-SA 4.0. The terms of the license are summarized here.
