# QuantumRAG: Open-Source RAG Engine That Actually Works

> March 31, 2026 | v0.4.4 | Apache 2.0

## TL;DR

Put documents in. Ask questions. Get cited answers. **No configuration needed.**

```python
from quantumrag import Engine

engine = Engine()
engine.ingest("./docs")
result = engine.query("What are the key findings?")
print(result.answer)  # ... [1][2] + Confidence: STRONGLY_SUPPORTED
```

## The Problem

Most RAG systems embed documents into vectors and hope the right chunks come back. When questions are phrased differently, when answers span multiple documents, or when you need exact entity matches, they fail silently.

QuantumRAG takes a different approach: **understand documents deeply at indexing time, so queries are fast and accurate by default.**

## How It Works

### Triple Index Fusion

Three retrieval methods are combined via Score-Weighted Reciprocal Rank Fusion, so each catches what the others miss:

| Index | What It Finds | Why It Matters |
|-------|--------------|----------------|
| **Original Embedding** | Semantically similar content | Handles paraphrasing |
| **HyPE Embedding** | Content that answers similar questions | Bridges question↔document gap |
| **Contextual BM25** | Exact keyword and entity matches | Precise for names, numbers, IDs |

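To make the fusion step concrete, here is a minimal sketch of a weighted Reciprocal Rank Fusion over three ranked lists. This is an illustration, not QuantumRAG's implementation: the exact score-weighting formula is not given here, so the function name `weighted_rrf`, the index names, the equal weights, and the constant `k=60` are all assumptions.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    rankings: dict mapping an index name to a list of doc IDs, best first.
    weights:  dict mapping an index name to a non-negative weight (assumed values).
    k:        smoothing constant; 60 is a commonly used default for RRF.
    """
    fused = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each index contributes more for documents it ranks higher.
            fused[doc_id] += weight / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from the three indexes.
rankings = {
    "original_embedding": ["doc_a", "doc_b", "doc_c"],
    "hype_embedding": ["doc_b", "doc_a", "doc_d"],
    "contextual_bm25": ["doc_d", "doc_b", "doc_e"],
}
weights = {"original_embedding": 1.0, "hype_embedding": 1.0, "contextual_bm25": 1.0}

fused = weighted_rrf(rankings, weights)
print([doc_id for doc_id, _ in fused])
```

In this toy input, `doc_b` comes out on top because all three lists return it, even though only one list ranks it first. That is the "each catches what the others miss" effect in miniature.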
### Hallucination Prevention

Two-layer defense:
1. **Fact Verifier** (Hard Gate): Cross-checks answers against facts extracted at ingest time. Zero LLM cost.
2. **System Prompt** (Soft Gate): 11 generation rules enforce source-only answers with citations.

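As a rough illustration of how a hard gate can run without any LLM call, the sketch below flags numeric claims in an answer that do not appear in any fact collected at ingest time. It is a simplified stand-in, not the actual Fact Verifier: the function names, the regex, and the restriction to numbers are all assumptions.

```python
import re

# Matches integers, decimals, thousands-separated numbers, and percentages.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")
# Matches citation markers such as [1] so they are not mistaken for claims.
CITATION = re.compile(r"\[\d+\]")


def extract_numbers(text):
    """Return the set of numeric tokens found in text."""
    return set(NUMBER.findall(text))


def unsupported_numbers(answer, ingested_facts):
    """Return numeric tokens in the answer that no ingested fact contains."""
    known = set()
    for fact in ingested_facts:
        known |= extract_numbers(fact)
    cleaned = CITATION.sub("", answer)
    return sorted(extract_numbers(cleaned) - known)


# Hypothetical facts extracted at ingest time.
facts = ["Revenue grew 12.5% in 2025.", "The team has 40 engineers."]
answer = "Revenue grew 12.5% in 2025 with 45 engineers [1]."

print(unsupported_numbers(answer, facts))
```

A non-empty result would block or flag the answer. The check is plain string matching, which is why this kind of gate adds no LLM cost.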
### Adaptive Post-Correction

After generation, an automatic correction pipeline runs within a 20-second time budget:
- Retrieval Retry → Self-Correction → Fact Verification → Completeness Check
- Simple queries skip correction entirely (zero overhead on happy path)

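The budgeted pipeline described above can be sketched as a loop that stops starting new stages once the deadline has passed. This is an assumed structure for illustration only: `run_with_budget`, the `simple` flag, and the placeholder stages are hypothetical, and the sketch checks the budget between stages rather than interrupting a stage that is already running.

```python
import time


def run_with_budget(stages, state, budget_seconds=20.0, clock=time.monotonic):
    """Run correction stages in order until the time budget is used up.

    stages: list of (name, callable) pairs; each callable takes and returns state.
    state:  dict carrying the draft answer; a truthy "simple" key skips everything.
    """
    if state.get("simple"):
        # Happy path: simple queries bypass correction entirely.
        return state, []

    deadline = clock() + budget_seconds
    executed = []
    for name, stage in stages:
        if clock() >= deadline:
            break  # Budget exhausted: keep the best answer so far.
        state = stage(state)
        executed.append(name)
    return state, executed


# Placeholder stages that only mark themselves as done.
stages = [
    ("retrieval_retry", lambda s: {**s, "retried": True}),
    ("self_correction", lambda s: {**s, "corrected": True}),
    ("fact_verification", lambda s: {**s, "verified": True}),
    ("completeness_check", lambda s: {**s, "complete": True}),
]

final_state, executed = run_with_budget(stages, {"answer": "draft", "simple": False})
print(executed)
```

Passing the clock as a parameter keeps the deadline logic testable without real waiting.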
## Korean as a First Language

Korean support is designed in from the ground up, not bolted on through translation:

| Feature | Description |
|---------|-------------|
| HWP/HWPX Parsing | Native Korean government document support |
| Kiwi Morphology | Accurate Korean tokenization for BM25 |
| EUC-KR Detection | Automatic legacy encoding conversion |
| Mixed Script | Optimal tokenizer for Korean-English mixed text |
| Auto Language | Detects query language, responds in the same language |

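For the EUC-KR row, one simple way to handle legacy files is to try UTF-8 first and fall back to EUC-KR when decoding fails. The sketch below uses only Python's built-in codecs. The function name and the fallback order are assumptions, and QuantumRAG's real detection may be more elaborate.

```python
def decode_korean_bytes(raw: bytes) -> tuple[str, str]:
    """Decode raw bytes, trying UTF-8 first and then EUC-KR.

    Returns the decoded text and the name of the encoding that worked.
    """
    for encoding in ("utf-8", "euc-kr"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: keep what can be read and mark the rest as replacement chars.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


legacy_bytes = "한국어 문서".encode("euc-kr")  # simulate a legacy-encoded file
text, detected = decode_korean_bytes(legacy_bytes)
print(detected, text)
```

Trying UTF-8 first is a deliberate choice: Korean text encoded as EUC-KR is almost never valid UTF-8, so the fallback is triggered reliably for legacy files.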
## Measured Performance

105 QA questions across 73 source documents (including 50 noise documents):

- **Combined QA: 75% pass rate** (up from 29% after six measurement-driven iterations)
- **Zero timeouts**
- 176 scenario tests, 831 unit tests, 0 mypy errors

## Zero Cost to Start

- **Embedding**: Microsoft Harrier 270M (local, free, MTEB 66.5, 94 languages)
- **LLM**: Gemini free tier (gemini-3.1-flash-lite-preview)
- **Reranker**: FlashRank (CPU, free)
- **No GPU required**

## Try It in 30 Seconds

```bash
pip install quantumrag
quantumrag demo
# Open http://localhost:8000
```

Or with Docker:

```bash
docker run -e GOOGLE_API_KEY=AIza... -p 8000:8000 quantumrag
```

## Three Ways to Use It

| Approach | What You Do | What Happens |
|---|-------------|-------------|
| **Just use it** | `engine.ingest("./docs")` → `engine.query("...")` | Parser, chunker, index, and routing are all auto-selected |
| **Tune it** | Adjust fusion weights, pick models, set domain | Better results for your specific use case |
| **Own it** | Custom parsers, chunkers, retrievers, generators | Every layer is replaceable via plugins |

## Supported Formats

PDF, DOCX, PPTX, XLSX, HWP/HWPX, HTML, Markdown, CSV, TXT

## Web Playground

Built-in interactive UI at `http://localhost:8000`:
- Upload documents (drag & drop) or paste text
- Ask questions with streaming or detailed mode
- Inspect pipeline trace with latency breakdown
- View source citations with relevance scores

## Comparison

| Feature | QuantumRAG | LangChain | LlamaIndex | OpenAI file_search |
|---------|:----------:|:---------:|:----------:|:------------------:|
| Triple Index Fusion | Yes | No | No | No |
| Hallucination Prevention | Yes | No | No | No |
| Korean Native (HWP, Kiwi) | Yes | Plugin | Plugin | No |
| Zero Config to First Answer | Yes | No | No | Partial |
| Zero GPU Required | Yes | Depends | Depends | N/A |
| Open Source | Apache 2.0 | MIT | MIT | No |

## Links

- **GitHub**: https://github.com/quantumaikr/quantumrag
- **PyPI**: `pip install quantumrag`
- **Docs**: [English](../../docs/en/index.md) | [한국어](../../docs/ko/index.md)
- **License**: Apache 2.0

---

*QuantumAI Inc. | hi@quantumai.kr*