
🪁 Kite

From idea to running AI agent in one command.

PyPI · Python 3.8+ · License: MIT


pip install kite-agent
kite generate "customer support agent that tracks orders"

A running, multi-agent Python script. No boilerplate. No config files.


Why Kite?

LangChain gives you 500+ abstractions. AutoGen needs 100 lines of config.
Kite gives you one command — and a different philosophy.

|                            | LangChain | AutoGen | Kite    |
|----------------------------|-----------|---------|---------|
| Time to first agent        | ~30 min   | ~20 min | < 1 min |
| LLM as untrusted component | ❌        | ❌      | ✅      |
| Built-in circuit breaker   | ❌        | ❌      | ✅      |
| Kill switch                | ❌        | ❌      | ✅      |
| Prompt A/B testing         | ❌        | ❌      | ✅      |
| CLI code generation        | ❌        | ❌      | ✅      |
| Startup time               | ~2s       | ~1s     | ~50ms   |

The core idea: LLMs don't execute. They propose.

Most frameworks let the LLM call tools directly. Kite doesn't.

User request
    │
    ▼
LLM (untrusted) ── proposes ──▶  Kernel (you control)
                                       │
                           ┌───────────┴───────────┐
                           │  tool whitelisted?    │
                           │  budget exceeded?     │
                           │  policy violated?     │
                           └───────────┬───────────┘
                                  approved?
                               YES ↙     ↘ NO
                           Execute      Reject + log
# ❌ Other frameworks: LLM decides what runs
agent.run("delete all test users")  # LLM calls delete_user() directly

# ✅ Kite: LLM proposes, Kernel validates
shell = ShellTool(allowed_commands=["ls", "git", "df"])
# agent.run("rm -rf /") → blocked at kernel, never executes
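The propose/validate split can be sketched in a few lines of plain Python. This is a toy illustration of the pattern, not Kite's internal API: `Kernel`, `Proposal`, and the checks shown here are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Proposal:
    """What the LLM is allowed to produce: a request, not an action."""
    tool: str
    args: tuple = ()

@dataclass
class Kernel:
    allowed_tools: set
    budget: int = 10                      # max tool executions this run
    spent: int = 0
    log: list = field(default_factory=list)

    def review(self, proposal, registry):
        # The LLM only produced `proposal`; this code path decides what runs.
        if proposal.tool not in self.allowed_tools:
            self.log.append(("rejected", proposal.tool, "not whitelisted"))
            return None
        if self.spent >= self.budget:
            self.log.append(("rejected", proposal.tool, "budget exceeded"))
            return None
        self.spent += 1
        self.log.append(("approved", proposal.tool))
        return registry[proposal.tool](*proposal.args)

registry = {"ls": lambda: ["main.py", "README.md"], "rm": lambda p: f"deleted {p}"}
kernel = Kernel(allowed_tools={"ls", "git"})

kernel.review(Proposal("ls"), registry)          # approved, executes
kernel.review(Proposal("rm", ("/",)), registry)  # rejected: not whitelisted
```

The key property: the execution path never trusts the LLM's output directly, so a rejected proposal leaves an audit trail instead of a side effect.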

Read the full architecture →


30-second quickstart

pip install kite-agent
export GROQ_API_KEY=your_key    # free at console.groq.com
kite generate "research assistant that searches and summarizes" --out agent.py
python agent.py

Or scaffold a full project:


kite init --type=agent --name=my_bot
cd my_bot && cp .env.example .env
python main.py

Production safety — built in, not bolted on


from kite import Kite

ai = Kite()

# Circuit breaker — auto-stops cascading failures
ai.circuit_breaker.config.failure_threshold = 3
ai.circuit_breaker.config.timeout_seconds = 60

# Idempotency — no duplicate charges, no double-sends
result = ai.idempotency.execute(
    operation_id="order_123_refund",   # same id = cached result
    func=process_refund,
    args=(order_id,)
)

# Kill switch — emergency stop, per-agent or global
ai.kill_switch.activate("Budget limit reached")
agent.kill_switch.activate("This agent only")
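The idempotency guarantee comes down to caching results by operation id. A minimal sketch of the idea, assuming a hypothetical `IdempotencyStore` (this is not Kite's implementation):

```python
class IdempotencyStore:
    """Cache results by operation id so retried calls never re-run side effects."""

    def __init__(self):
        self._results = {}

    def execute(self, operation_id, func, args=()):
        if operation_id not in self._results:
            self._results[operation_id] = func(*args)   # runs at most once per id
        return self._results[operation_id]

calls = []

def process_refund(order_id):
    calls.append(order_id)              # the side effect we must not repeat
    return f"refunded {order_id}"

store = IdempotencyStore()
first = store.execute("order_123_refund", process_refund, ("order_123",))
again = store.execute("order_123_refund", process_refund, ("order_123",))
# first == again, and process_refund ran exactly once
```

A production store would persist the cache (e.g. Redis) so retries survive process restarts, but the contract is the same: same id, same result, one execution.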

5 reasoning patterns

agent = ai.create_agent(name="Bot", agent_type="react", ...)        # think→act→observe loop
agent = ai.create_agent(name="Bot", agent_type="rewoo", ...)        # plan upfront, run parallel (~2× faster)
agent = ai.create_agent(name="Bot", agent_type="tot", ...)          # explore multiple paths
agent = ai.create_agent(name="Bot", agent_type="plan_execute", ...) # decompose, replan on failure
agent = ai.create_agent(name="Bot", agent_type="reflective", ...)   # generate → critique → improve

Advanced RAG — production retrieval, not toy examples

# Load any document type
ai.load_document("docs/policy.pdf")   # PDF, DOCX, CSV, HTML, TXT
ai.load_document("data/")             # entire directory

# HyDE — generate hypothetical answer first, then search (↑ accuracy)
results = ai.advanced_rag.search("return policy", method="hyde")

# Hybrid search — BM25 keyword + vector semantic combined
results = ai.advanced_rag.hybrid_search("cancellation steps", alpha=0.5)

# MMR — remove redundant results, maximize diversity
results = ai.advanced_rag.mmr("pricing tiers", results, lambda_param=0.7)

# Reranking — Cohere or Cross-encoder for final precision
results = ai.advanced_rag.rerank_cohere("refund eligibility", results)

# Knowledge graph — multi-hop relationship queries
ai.graph_rag.add_relationship("Order", "belongs_to", "Customer")
answer = ai.graph_rag.query("Which orders belong to premium customers?")
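MMR (maximal marginal relevance) scores each candidate by `lambda * relevance - (1 - lambda) * redundancy`, where redundancy is the candidate's similarity to results already selected. A toy greedy implementation over cosine similarity, for illustration only (not the internals of `advanced_rag.mmr`):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def mmr(query_vec, candidates, k=2, lambda_param=0.7):
    """candidates: list of (doc_id, vector). Greedy MMR selection of k results."""
    selected, pool = [], list(candidates)
    while pool and len(selected) < k:
        def score(item):
            relevance = cosine(query_vec, item[1])
            redundancy = max((cosine(item[1], s[1]) for s in selected), default=0.0)
            return lambda_param * relevance - (1 - lambda_param) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return [doc_id for doc_id, _ in selected]

docs = [
    ("a",     [1.0, 0.8]),   # most relevant
    ("a_dup", [1.0, 0.78]),  # near-duplicate of "a"
    ("b",     [0.8, 1.0]),   # relevant, but a different angle
]
print(mmr([1.0, 1.0], docs))  # → ['a', 'b']: the near-duplicate is passed over
```

With `lambda_param=1.0` this degenerates to plain relevance ranking; lowering it trades relevance for diversity.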

Prompt A/B testing

Test prompts and models on real traffic. Built-in prompt A/B testing is rare among Python agent frameworks.

from kite.ab_testing import ABTestManager

ab = ABTestManager()
ab.create_experiment(
    name="support_tone",
    variants=[
        {"name": "formal", "weight": 0.5, "config": {"system_prompt": "You are professional..."}},
        {"name": "casual", "weight": 0.5, "config": {"system_prompt": "Hey! Happy to help..."}},
    ]
)

variant = ab.get_variant("support_tone", user_id="user_123")  # consistent per user
ab.record_conversion("support_tone", variant.name)

results = ab.get_results("support_tone")
# → {"winner": "casual", "confidence": 0.94, "conversions": {...}}
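The "consistent per user" behaviour is typically achieved by hashing the (experiment, user) pair into a deterministic bucket, so the same user always sees the same variant without any stored state. A sketch of the idea (not `ABTestManager`'s internals):

```python
import hashlib

def assign_variant(experiment, user_id, variants):
    """variants: list of (name, weight) with weights summing to 1.0."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    point = int(digest[:8], 16) / 0xFFFFFFFF   # deterministic value in [0, 1]
    cumulative = 0.0
    for name, weight in variants:
        cumulative += weight
        if point <= cumulative:
            return name
    return variants[-1][0]                     # guard against float rounding

variants = [("formal", 0.5), ("casual", 0.5)]
# The same (experiment, user) pair always lands in the same bucket:
assert assign_variant("support_tone", "user_123", variants) == \
       assign_variant("support_tone", "user_123", variants)
```

Because the experiment name is part of the hash input, the same user can land in different buckets across different experiments, which keeps experiments independent.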

Multi-agent conversation

researcher = ai.create_agent("Researcher", "You gather facts...",         agent_type="react")
critic     = ai.create_agent("Critic",     "You challenge assumptions...")
writer     = ai.create_agent("Writer",     "You synthesize into prose...")

conversation = ai.create_conversation(
    agents=[researcher, critic, writer],
    max_turns=9,
    termination_condition="consensus"
)

result = await conversation.run("Best pricing strategy for B2B SaaS?")

Smart model routing — cut costs 60–80%

# .env
FAST_LLM_MODEL=groq/llama-3.1-8b-instant   # routing, simple tasks
SMART_LLM_MODEL=openai/gpt-4o              # complex reasoning

from kite.optimization.resource_router import ResourceAwareRouter

router   = ResourceAwareRouter(ai.config)
router_a = ai.create_agent("Router",  model=router.fast_model,  ...)
analyst  = ai.create_agent("Analyst", model=router.smart_model, ...)
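The cost savings come from sending most traffic to the cheap model. The routing decision itself can be as simple as a heuristic like the sketch below; this is illustrative only and is not `ResourceAwareRouter`'s actual logic.

```python
FAST_MODEL = "groq/llama-3.1-8b-instant"   # cheap: routing, simple tasks
SMART_MODEL = "openai/gpt-4o"              # expensive: complex reasoning

def pick_model(prompt):
    # Toy heuristic: long prompts or analytical keywords go to the smart model.
    complex_markers = ("analyze", "compare", "plan", "explain why", "design")
    is_complex = (len(prompt.split()) > 50
                  or any(m in prompt.lower() for m in complex_markers))
    return SMART_MODEL if is_complex else FAST_MODEL

pick_model("What's your return policy?")            # routed to the fast model
pick_model("Analyze Q3 churn and plan next steps")  # routed to the smart model
```

Real routers can also weigh current latency, budget remaining, or a small classifier model, but the shape of the decision is the same.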

Human-in-the-loop workflows

pipeline = ai.pipeline.create("approval_flow")
pipeline.add_step("draft",  draft_email)
pipeline.add_checkpoint("draft")    # ← pauses for human review
pipeline.add_step("send",   send_email)

state = await pipeline.execute_async({"to": "customer@example.com"})
final = await pipeline.resume_async(state.task_id, approved=True)

Observability

ai.enable_tracing("run_trace.json")              # every event → JSON file
ai.enable_state_tracking("session.json")         # state changes across session
ai.event_bus.subscribe("agent:*", my_callback)   # subscribe to any event
ai.add_event_relay("http://localhost:8000/events") # forward to dashboard
print(ai.get_metrics())   # circuit breaker, cache hits, token usage

Works with any LLM

LLM_PROVIDER=groq       LLM_MODEL=llama-3.3-70b-versatile   # fastest, free tier
LLM_PROVIDER=openai     LLM_MODEL=gpt-4o                    # most capable
LLM_PROVIDER=anthropic  LLM_MODEL=claude-3-5-sonnet-...     # best reasoning
LLM_PROVIDER=ollama     LLM_MODEL=qwen2.5:1.5b              # local, free

Switch by changing 2 env vars. Zero code changes.


MCP integrations

from kite.tools.mcp.slack_mcp_server    import SlackMCPServer
from kite.tools.mcp.gmail_mcp_server    import GmailMCPServer
from kite.tools.mcp.gdrive_mcp_server   import GDriveMCPServer
from kite.tools.mcp.postgres_mcp_server import PostgresMCPServer
from kite.tools.mcp.stripe_mcp_server   import StripeMCPServer  # idempotency keys built-in

CLI reference

| Command | What it does |
|---------|--------------|
| `kite generate "idea" --out app.py` | Generate a multi-agent app from natural language |
| `kite compile skill.md --out app.py` | Compile a Markdown skill spec into Python |
| `kite init --type=agent --name=bot` | Scaffold a new agent project |
| `kite init --type=workflow --name=w` | Scaffold a multi-agent pipeline |
| `kite init --type=tool --name=t` | Scaffold a standalone tool module |

Examples

| Example | What it builds | Difficulty |
|---------|----------------|------------|
| Case 1 | E-commerce support bot | 🟢 Beginner |
| Case 2 | Data analyst with SQL + charts | 🟡 Intermediate |
| Case 3 | Deep research + web scraping | 🟡 Intermediate |
| Case 4 | Multi-agent collaboration + HITL | 🔴 Advanced |
| Case 5 | DevOps automation with safe shell | 🟡 Intermediate |
| Case 6 | ReAct vs ReWOO vs ToT benchmark | 🔴 Advanced |

Architecture

kite/
├── agents/      # ReAct, ReWOO, ToT, Plan-Execute, Reflective
├── memory/      # Vector RAG, Advanced RAG (HyDE/hybrid/MMR), Graph RAG, Session, Semantic Cache
├── safety/      # Circuit breaker, Kill switch, Idempotency, Guardrails
├── routing/     # LLM router, Semantic router, Aggregator, Resource-aware
├── tools/       # Web search, Calculator, Shell (whitelisted), MCP servers
├── pipeline/    # Deterministic workflows with HITL checkpoints
├── ab_testing/  # Prompt & model A/B experiments
├── monitoring/  # Metrics, tracing, event bus, FastAPI dashboard
└── utils/       # Batch processor, Cluster (Redis), Document loader

Lazy-loaded. Kite() starts in ~50ms.


Roadmap

  • kite generate — natural language → runnable agent
  • kite init — project scaffolding
  • 5 reasoning patterns (ReAct, ReWOO, ToT, Plan-Execute, Reflective)
  • Circuit breaker + kill switch + idempotency
  • Advanced RAG (HyDE, hybrid BM25+vector, MMR, Cohere rerank)
  • Prompt A/B testing with statistical confidence
  • MCP: Slack, Stripe, Gmail, Google Drive, PostgreSQL
  • Multi-agent conversation manager
  • Streaming responses
  • kite deploy — one command to production
  • Web dashboard (monitoring API ready, UI in progress)

Contributing

git clone https://github.com/thienzz/Kite
cd Kite && pip install -e ".[dev]"
pytest tests/

See CONTRIBUTING.md for guidelines.


License

MIT — use however you want. Commercial use welcome.


⭐ Star this repo if Kite saves you time.

Built by @thienzz · Issues · Discussions
