Skip to content

Commit b2d6a0d

Browse files
docs: Complete documentation overhaul for v1.0.0
- **Gemini Recommended**: Updated all guides to recommend Gemini (free tier, faster) over OpenAI. - **Reference Docs Updated**: Rewrote embeddings.md, api.md, databases.md, and architecture.md for 100% consistency. - **Environment Variables**: Aligned all documentation with settings.py (removed obsolete AstraDB keys). - **Benchmarks**: Added performance data to recommendation boxes.
1 parent 5acf0e0 commit b2d6a0d

File tree

10 files changed

+309
-582
lines changed

10 files changed

+309
-582
lines changed

README.md

Lines changed: 61 additions & 38 deletions
Original file line numberDiff line numberDiff line change
@@ -93,23 +93,27 @@ pip install crossvector
9393
### With Specific Backends
9494

9595
```bash
96-
# AstraDB + OpenAI
97-
pip install crossvector[astradb,openai]
96+
# Recommended: PgVector + Gemini (free tier)
97+
pip install crossvector[pgvector,gemini]
98+
99+
# Alternative: ChromaDB + Gemini (cloud or local)
100+
pip install crossvector[chromadb,gemini]
98101

99-
# ChromaDB + OpenAI
102+
# With OpenAI (requires paid API key)
103+
pip install crossvector[pgvector,openai]
100104
pip install crossvector[chromadb,openai]
101105

102106
# Milvus + Gemini
103107
pip install crossvector[milvus,gemini]
104108

105-
# PgVector + OpenAI
106-
pip install crossvector[pgvector,openai]
109+
# AstraDB + OpenAI
110+
pip install crossvector[astradb,openai]
107111
```
108112

109-
### All Backends and Providers
113+
### All Backends
110114

111115
```bash
112-
# Everything
116+
# Install everything
113117
pip install crossvector[all]
114118

115119
# All databases only
@@ -123,21 +127,27 @@ pip install crossvector[astradb,all-embeddings]
123127

124128
## Quick Start
125129

130+
> 💡 **Recommended**: Use `GeminiEmbeddingAdapter` for most use cases - free tier, faster search (1.5x), smaller vectors (768 vs 1536 dims). See [benchmarks](benchmark.md) for details.
131+
126132
### Basic Usage
127133

128134
```python
129135
from crossvector import VectorEngine
130-
from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
136+
from crossvector.embeddings.gemini import GeminiEmbeddingAdapter
131137
from crossvector.dbs.pgvector import PgVectorAdapter
132138

133-
# Initialize engine (uses default models if not specified)
139+
# Initialize engine with Gemini (recommended: free tier, fast performance)
134140
engine = VectorEngine(
135-
embedding=OpenAIEmbeddingAdapter(), # Uses text-embedding-3-small by default
141+
embedding=GeminiEmbeddingAdapter(), # Free tier, 1536-dim vectors
136142
db=PgVectorAdapter(),
137143
collection_name="my_documents",
138144
store_text=True
139145
)
140146

147+
# Alternative: OpenAI (requires paid API key, 1536-dim vectors)
148+
# from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
149+
# embedding = OpenAIEmbeddingAdapter()
150+
141151
# Create documents (flexible input formats)
142152
doc1 = engine.create(text="Python is a programming language")
143153
doc2 = engine.create({"text": "Artificial intelligence", "metadata": {"category": "tech"}})
@@ -452,8 +462,8 @@ Different backends have varying feature support:
452462
| Feature | AstraDB | ChromaDB | Milvus | PgVector |
453463
|---------|---------|----------|--------|----------|
454464
| Vector Search |||||
455-
| Metadata-Only Search ||| ||
456-
| Nested Metadata ||* | ||
465+
| Metadata-Only Search ||| ||
466+
| Nested Metadata ||| ||
457467
| Numeric Comparisons |||||
458468
| Text Storage |||||
459469

@@ -537,50 +547,63 @@ engine = VectorEngine(embedding=embedding, db=db)
537547

538548
## Embedding Providers
539549

540-
### OpenAI
550+
> 💡 **Recommended**: Start with **Gemini** for free tier and faster performance. See [benchmark comparison](benchmark.md).
551+
552+
### Gemini (Recommended)
541553

542554
```python
543-
from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
555+
from crossvector.embeddings.gemini import GeminiEmbeddingAdapter
544556

545-
# Default model (text-embedding-3-small, 1536 dims)
546-
embedding = OpenAIEmbeddingAdapter()
557+
# Default model (gemini-embedding-001, 1536 dims)
558+
embedding = GeminiEmbeddingAdapter()
547559

548-
# Or use VECTOR_EMBEDDING_MODEL from .env
549-
# VECTOR_EMBEDDING_MODEL=text-embedding-3-large
550-
embedding = OpenAIEmbeddingAdapter() # Uses env var
560+
# Explicit model specification
561+
embedding = GeminiEmbeddingAdapter(model_name="models/text-embedding-004", dim=768)
562+
```
551563

552-
# Explicit model override
553-
embedding = OpenAIEmbeddingAdapter(model_name="text-embedding-3-large")
564+
**Why Choose Gemini:**
565+
-**Free tier**: 1,500 requests/min (vs OpenAI paid only)
566+
-**Faster search**: 234ms avg (1.5x faster than OpenAI)
567+
-**Efficient**: 768 dims = 50% less storage than OpenAI
568+
-**Quality**: Comparable accuracy to OpenAI
569+
570+
**Configuration:**
571+
```bash
572+
GEMINI_API_KEY=AI... # Get free key at https://makersuite.google.com/app/apikey
554573
```
555574

556575
**Supported Models:**
557-
- `text-embedding-3-small` (1536 dims, default)
558-
- `text-embedding-3-large` (3072 dims)
559-
- `text-embedding-ada-002` (1536 dims, legacy)
576+
- `gemini-embedding-001` (1536 dims, **recommended**)
577+
- `models/text-embedding-004` (768 dims)
560578

561-
### Gemini
579+
### OpenAI (Alternative)
562580

563581
```python
564-
from crossvector.embeddings.gemini import GeminiEmbeddingAdapter
582+
from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
565583

566-
# Default model (gemini-embedding-001, 1536 dims)
567-
embedding = GeminiEmbeddingAdapter()
584+
# Default model (text-embedding-3-small, 1536 dims)
585+
embedding = OpenAIEmbeddingAdapter()
568586

569-
# Or use VECTOR_EMBEDDING_MODEL from .env
570-
# VECTOR_EMBEDDING_MODEL=gemini-embedding-001
571-
embedding = GeminiEmbeddingAdapter() # Uses env var
587+
# Explicit model specification
588+
embedding = OpenAIEmbeddingAdapter(model_name="text-embedding-3-large")
589+
```
572590

573-
# With custom dimensions (768, 1536, 3072)
574-
embedding = GeminiEmbeddingAdapter(dim=768)
591+
**When to Use OpenAI:**
592+
- ✅ Need 1536 or 3072 dimensions
593+
- ✅ Already have OpenAI API budget
594+
- ✅ Prefer OpenAI ecosystem integration
575595

576-
# With task type
577-
embedding = GeminiEmbeddingAdapter(
578-
task_type="retrieval_document" # or "retrieval_query", "semantic_similarity"
579-
)
596+
**Configuration:**
597+
```bash
598+
OPENAI_API_KEY=sk-... # Paid API key from https://platform.openai.com
580599
```
581600

582601
**Supported Models:**
583-
- `gemini-embedding-001` (768-3072 dims, default, recommended)
602+
- `text-embedding-3-small` (1536 dims, default)
603+
- `text-embedding-3-large` (3072 dims)
604+
- `text-embedding-ada-002` (1536 dims, legacy)
605+
606+
- `gemini-embedding-001` (1536 dims, default)
584607
- `text-embedding-005` (768 dims)
585608
- `text-embedding-004` (768 dims, legacy)
586609

docs/adapters/databases.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ pip install crossvector[astradb]
4040
```bash
4141
ASTRA_DB_APPLICATION_TOKEN="AstraCS:xxx"
4242
ASTRA_DB_API_ENDPOINT="https://xxx.apps.astra.datastax.com"
43-
ASTRA_DB_KEYSPACE="default_keyspace" # Optional
43+
# Note: Collection name uses VECTOR_COLLECTION_NAME (shared setting)
4444
```
4545

4646
**Programmatic:**
@@ -727,11 +727,11 @@ Same code works across all backends:
727727

728728
```python
729729
from crossvector import VectorEngine
730-
from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
730+
from crossvector.embeddings.gemini import GeminiEmbeddingAdapter
731731
from crossvector.querydsl.q import Q
732732

733733
# Create embedding adapter (same for all)
734-
embedding = OpenAIEmbeddingAdapter()
734+
embedding = GeminiEmbeddingAdapter()
735735

736736
# Choose backend (interchangeable)
737737
if backend == "astradb":

0 commit comments

Comments
 (0)