Text-to-SQL Agent Pipeline

Build LLM agents that convert natural-language questions into SQL queries, using retrieval-augmented generation (RAG) for schema context and safety validation before execution.

Features

🤖 LLM-powered text-to-SQL generation
🔒 SQL safety validation
🔍 RAG-based context retrieval
✅ Type-safe with Pydantic
🧪 Comprehensive test suite

Setup

# 1. Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# 2. Configure and setup
cp .env.example .env
# Edit .env to set OLLAMA_MODEL=lfm2 (or your preferred model)

bash scripts/setup.sh  # Installs deps, pulls models, creates DB

# 3. Start Ollama (terminal 1)
uv run scripts/serve_ollama.py

# 4. Run tutorials (terminal 2)
python tutorials/01_data_collection.py

Tutorials

Seven progressive tutorials teach LLM agent development:

1. Data Collection - Create the biomedical database
2. Schema Discovery - LLM-powered schema analysis
3. Pydantic Validation - Type-safe structured outputs
4. Safety Validation - SQL query security
5. RAG Indexing - Vector store for context retrieval
6. Text-to-SQL - Generate queries with RAG
7. Agentic Pipeline - LangGraph orchestration

python tutorials/01_data_collection.py
python tutorials/02_schema_discovery.py
# ... continue through 07

See tutorials/README.md for details.
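
To give a feel for the retrieval idea behind the RAG tutorials, here is a minimal sketch that ranks schema descriptions against a question and builds a prompt from the best matches. It is an assumption-heavy toy: it uses bag-of-words cosine similarity instead of a real embedding model or vector store, and the table descriptions and the names `SCHEMA_DOCS`, `retrieve`, and `build_prompt` are hypothetical, not taken from this repository.

```python
import math
import re
from collections import Counter

# Hypothetical schema snippets; the tutorials build their own index.
SCHEMA_DOCS = [
    "Table patients: patient_id, name, birth_date, diagnosis",
    "Table visits: visit_id, patient_id, visit_date, clinic",
    "Table medications: medication_id, patient_id, drug_name, dose",
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words counts (underscores split identifiers)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    """Prepend the retrieved schema context to the user's question."""
    context = "\n".join(retrieve(question, docs))
    return f"Schema context:\n{context}\n\nQuestion: {question}\nSQL:"


if __name__ == "__main__":
    print(build_prompt("Which drug and dose was given?", SCHEMA_DOCS))
```

A real index would swap `tokenize` and `cosine` for embeddings and a vector store; the prompt-assembly step stays roughly the same.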

Usage Example

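from src.utils.llm import LLMClient
from src.utils.validation import SQLValidator
from src.models.schemas import SQLQuery

# Ask the LLM for a structured SQLQuery instead of free-form text
llm = LLMClient()
query = llm.generate_structured(
    prompt="Count patients with diabetes",
    response_model=SQLQuery,
)

# Only use the generated SQL if it passes the safety checks
validator = SQLValidator()
result = validator.validate(query.query)
if result.is_valid:
    print(f"Safe: {query.query}")
else:
    print(f"Rejected: {query.query}")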
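
For readers curious what a safety check of this kind might involve, the following is a rough, standalone sketch of a read-only gate. It is not the repository's `SQLValidator`; the names `check_read_only`, `ValidationResult`, and `FORBIDDEN`, as well as the specific rules, are assumptions chosen for illustration, and a keyword filter like this is deliberately conservative rather than a complete defense.

```python
import re
from dataclasses import dataclass, field

# Keywords that modify data or schema (hypothetical list, not exhaustive).
FORBIDDEN = {
    "insert", "update", "delete", "drop", "alter", "create",
    "truncate", "attach", "detach", "pragma", "vacuum",
}


@dataclass
class ValidationResult:
    is_valid: bool
    errors: list[str] = field(default_factory=list)


def check_read_only(sql: str) -> ValidationResult:
    """Accept only a single SELECT (or WITH ... SELECT) statement."""
    errors: list[str] = []

    # Blank out string literals so quoted keywords or semicolons are ignored.
    stripped = re.sub(r"'(?:[^']|'')*'", "''", sql).strip()
    body = stripped.rstrip(";").strip()

    if ";" in body:
        errors.append("multiple statements are not allowed")
    if "--" in body or "/*" in body:
        errors.append("SQL comments are not allowed")

    first = body.split(None, 1)[0].lower() if body else ""
    if first not in {"select", "with"}:
        errors.append("only SELECT queries are allowed")

    # Whole-word match, so a column such as update_date is not flagged.
    words = set(re.findall(r"[a-z_]+", body.lower()))
    hits = sorted(words & FORBIDDEN)
    if hits:
        errors.append("forbidden keywords: " + ", ".join(hits))

    return ValidationResult(is_valid=not errors, errors=errors)


if __name__ == "__main__":
    print(check_read_only("SELECT COUNT(*) FROM patients"))
    print(check_read_only("DROP TABLE patients"))
```

Opening the database in read-only mode is a sensible second layer, since text-level checks can miss cases a parser would catch.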

Testing

pytest                    # Run all tests
pytest --cov=src          # With coverage

Troubleshooting

Ollama connection error:

uv run scripts/check_ollama.py
uv run scripts/serve_ollama.py

Model not found:

grep OLLAMA .env
uv run scripts/check_ollama.py
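
If you want to check the server and installed models by hand, a small probe along these lines may help. It is a sketch, not a copy of `scripts/check_ollama.py`: it assumes Ollama is listening on its default address `http://localhost:11434` and uses the `/api/tags` endpoint, which lists locally available models. The function name `list_ollama_models` is hypothetical.

```python
import json
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 2.0):
    """Return local model names, or None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts; ValueError covers bad JSON.
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama is not reachable; is the server running?")
    else:
        print("Available models:", models or "none pulled yet")
```

A `None` result points to a connection problem, while a list that lacks the model named in `.env` suggests the model has not been pulled.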

Database not found:

python tutorials/01_data_collection.py

Architecture

User Question → LangGraph Pipeline
                ↓
            RAG + LLM → Generate SQL
                ↓
            Validate → Safety checks
                ↓
            Execute → SQLite
                ↓
            Results
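
The flow above can be sketched in plain Python as generate, validate, then execute. This is a simplified illustration, not the project's LangGraph pipeline: the LLM step is replaced by a stub that returns a fixed query, the safety gate is a one-line check, and the table, data, and function names (`fake_generate_sql`, `is_safe`, `run_pipeline`, `make_demo_db`) are all hypothetical.

```python
import sqlite3
from typing import Callable


def fake_generate_sql(question: str) -> str:
    """Stand-in for the RAG + LLM step; a real pipeline would call the model."""
    return "SELECT COUNT(*) FROM patients WHERE diagnosis = 'diabetes'"


def is_safe(sql: str) -> bool:
    """Minimal gate: one statement, and it must start with SELECT."""
    body = sql.strip().rstrip(";")
    return body.lower().startswith("select") and ";" not in body


def run_pipeline(question: str,
                 conn: sqlite3.Connection,
                 generate: Callable[[str], str] = fake_generate_sql) -> list:
    """Generate SQL, refuse it if unsafe, otherwise execute and return rows."""
    sql = generate(question)
    if not is_safe(sql):
        raise ValueError(f"Rejected unsafe SQL: {sql}")
    return conn.execute(sql).fetchall()


def make_demo_db() -> sqlite3.Connection:
    """In-memory SQLite database with a tiny made-up patients table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, diagnosis TEXT)"
    )
    conn.executemany(
        "INSERT INTO patients (diagnosis) VALUES (?)",
        [("diabetes",), ("asthma",), ("diabetes",)],
    )
    return conn


if __name__ == "__main__":
    print(run_pipeline("Count patients with diabetes", make_demo_db()))
```

Keeping validation as a separate step between generation and execution means an unsafe query is rejected before it ever reaches the database.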

License

MIT
