AI-Powered Voice & Text Agent Platform for Enterprises
Build, deploy, and manage conversational AI agents in minutes, not months. Provide your SOPs and knowledge base, and Oratio automatically generates production-ready agents.
| Topic | Description |
|---|---|
| AgentCreator Pipeline | How the meta-agent generates custom agents |
| Voice Agents | How voice agents work with AWS Nova Sonic |
| Deployment | Infrastructure setup and CI/CD pipeline |
| Architecture | Detailed system design |
Oratio is a multi-tenant SaaS platform that enables enterprises to create AI agents without writing code. The platform consists of three main components:
- Frontend Dashboard - Next.js application for agent management
- Backend API - FastAPI service handling authentication and orchestration
- Agent Infrastructure - AWS-based agent creation and runtime system
Traditional AI agent deployment requires:
- Manual coding for each agent
- Separate infrastructure per agent
- Complex deployment pipelines
- Weeks of development time
Oratio automates the entire process:
- Upload SOP + knowledge base documents
- AgentCreator meta-agent generates custom code
- Chameleon runtime loads agents dynamically
- Deploy unlimited agents with one infrastructure
graph TB
subgraph "User Interface"
A[Enterprise Dashboard]
end
subgraph "Backend Services"
B[FastAPI Backend]
C[AWS Cognito Auth]
end
subgraph "Agent Creation Pipeline"
D[Step Functions Workflow]
E[AgentCreator Meta-Agent]
F[Knowledge Base Provisioner]
end
subgraph "Storage Layer"
G[DynamoDB Tables]
H[S3 Buckets]
I[Bedrock Knowledge Bases]
end
subgraph "Runtime Layer"
J[Chameleon Generic Loader]
K[AgentCore Memory]
end
subgraph "End Users"
L[Text Chat Customers]
M[Voice Call Customers]
end
A -->|Create Agent| B
B -->|Authenticate| C
B -->|Trigger| D
D -->|Invoke| F
F -->|Pass New KB ID| E
E -->|Store New Code| H
E -->|Store Metadata| G
F -->|Create KB| I
L -->|Chat Request| B
M -->|Voice Request| B
B -->|Invoke Bedrock AgentCore| J
J -->|Load Code| H
J -->|Load Memory| K
J -->|Query KB| I
B -->|Backend Databases| G
Traditional approach: One AgentCore deployment per agent (slow, expensive)
Oratio approach: One generic loader for unlimited agents
graph LR
subgraph "Traditional Approach"
A1[Agent 1 Code] --> B1[AgentCore Runtime 1]
A2[Agent 2 Code] --> B2[AgentCore Runtime 2]
A3[Agent 3 Code] --> B3[AgentCore Runtime 3]
end
subgraph "Oratio Chameleon Approach"
C1[Agent 1 Code]
C2[Agent 2 Code]
C3[Agent 3 Code]
C1 --> D[S3 Storage]
C2 --> D
C3 --> D
D --> E[Chameleon Loader]
E -->|Loads Dynamically| F[Single AgentCore Runtime]
end
Benefits:
- ✅ Sub-second agent creation (no deployment wait)
- ✅ Cost-effective scaling (one runtime for all agents)
- ✅ Per-agent memory isolation
- ✅ Instant updates (just update S3 code)
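The single-loader idea can be illustrated with a minimal sketch. It assumes each agent's generated source exposes a `handle(message)` function and uses an in-memory dictionary (`CODE_STORE`) in place of S3; both names are hypothetical and are not taken from the Oratio codebase.

```python
import types
from typing import Callable, Dict

# Stand-in for the S3 bucket holding generated agent code (hypothetical layout).
CODE_STORE: Dict[str, str] = {
    "agent-a": "def handle(message):\n    return 'A says: ' + message\n",
    "agent-b": "def handle(message):\n    return 'B says: ' + message.upper()\n",
}


def load_agent(agent_id: str) -> Callable[[str], str]:
    """Fetch an agent's source and turn it into a callable at request time."""
    source = CODE_STORE[agent_id]  # a real loader would read this from S3
    module = types.ModuleType("agent_" + agent_id.replace("-", "_"))
    # Executing stored code is only acceptable if that code is trusted or sandboxed.
    exec(compile(source, f"<{agent_id}>", "exec"), module.__dict__)
    return module.handle


def invoke(agent_id: str, message: str) -> str:
    """Single entry point shared by every agent."""
    return load_agent(agent_id)(message)
```

Because the source is read on every call in this sketch, replacing an entry in `CODE_STORE` changes the agent's behavior on the next request without any redeployment. A real loader would probably cache compiled modules and invalidate them when the stored code changes.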
sequenceDiagram
participant User as Enterprise User
participant UI as Dashboard
participant API as Backend API
participant SF as Step Functions
participant ACL as AgentCreator
participant KB as KB Provisioner
participant S3 as S3 Storage
participant DB as DynamoDB
participant AC as Bedrock AgentCore
User->>UI: Upload SOP + Documents
UI->>API: POST /agents
API->>DB: Store Agent Metadata
API->>S3: Upload Documents
API->>SF: Start Workflow
SF->>KB: Provision Knowledge Base
KB->>S3: Index Documents
KB-->>SF: KB ARN
SF->>ACL: Generate Agent Code
ACL->>AC: Invoke Bedrock AgentCore
AC->>AC: Parse SOP
AC->>AC: Design Architecture
AC->>AC: Generate Strands Code
AC->>AC: Review & Validate
AC-->>ACL: Return agent_file.py
ACL->>S3: Store agent_file.py
ACL-->>SF: Success
SF->>DB: Update Agent Status
SF-->>API: Workflow Complete
API-->>UI: Agent Ready
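The ordering in the sequence above can be summarized as a small orchestration sketch. The status values, the `AgentRecord` fields, and the two placeholder step functions are assumptions made for illustration; the real workflow runs in Step Functions and calls Lambda functions and the AgentCreator.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentRecord:
    agent_id: str
    status: str = "CREATING"  # hypothetical status values
    kb_id: str = ""
    code_key: str = ""
    history: List[str] = field(default_factory=list)


def provision_knowledge_base(agent_id: str, documents: List[str]) -> str:
    # Placeholder: the real step indexes the documents and returns a KB identifier.
    return f"kb-{agent_id}-{len(documents)}"


def generate_agent_code(sop: str, kb_id: str) -> str:
    # Placeholder: the real step invokes the AgentCreator meta-agent.
    return f"# agent generated from SOP ({len(sop)} chars), KB {kb_id}\n"


def run_creation_workflow(
    agent_id: str,
    sop: str,
    documents: List[str],
    code_store: Dict[str, str],
    db: Dict[str, AgentRecord],
) -> AgentRecord:
    """Run the steps in order; mark the record FAILED if any step raises."""
    record = AgentRecord(agent_id)
    db[agent_id] = record  # store agent metadata first
    try:
        record.history.append("provision_kb")
        record.kb_id = provision_knowledge_base(agent_id, documents)
        record.history.append("generate_code")
        code = generate_agent_code(sop, record.kb_id)
        record.history.append("store_code")
        record.code_key = f"{agent_id}/agent_file.py"
        code_store[record.code_key] = code
        record.status = "ACTIVE"
    except Exception:
        record.status = "FAILED"
        raise
    return record
```

The knowledge base is provisioned before code generation so that its identifier can be passed to the generator, matching the order shown in the diagram.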
The meta-agent that generates custom agents:
graph LR
A[SOP Input] --> B[Parse Requirements]
B --> C[Draft Architecture Plan]
C --> D[Review Plan]
D -->|Needs Revision| C
D -->|Approved| E[Generate Code]
E --> F[Review Code]
F -->|Needs Revision| E
F -->|Approved| G[Generate Prompts]
G --> H[Deploy to S3]
Pipeline Stages:
- SOP Parser - Extracts business rules and requirements
- Plan Drafter - Designs single or multi-agent architecture
- Plan Reviewer - Validates architecture (up to 3 iterations)
- Code Generator - Writes production-ready Strands agent code
- Code Reviewer - Validates syntax and best practices
- Prompt Generator - Creates optimized system prompts
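The generate-and-review loop in this pipeline might look roughly like the sketch below. It uses Python's standard `ast` module for the syntax check and assumes the same three-iteration cap that the list gives for plan review also applies to code review; the function names are illustrative, not the AgentCreator's actual interface.

```python
import ast
from typing import Callable, Optional, Tuple


def review_syntax(code: str) -> Optional[str]:
    """Return None if the code parses, otherwise a short error description."""
    try:
        ast.parse(code)
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"


def generate_with_review(
    generate: Callable[[Optional[str]], str],
    review: Callable[[str], Optional[str]] = review_syntax,
    max_iterations: int = 3,  # assumed cap, mirroring the plan-review limit
) -> Tuple[str, int]:
    """Call generate(feedback) until review passes or the budget runs out."""
    feedback: Optional[str] = None
    for attempt in range(1, max_iterations + 1):
        candidate = generate(feedback)
        feedback = review(candidate)
        if feedback is None:
            return candidate, attempt
    raise RuntimeError(f"not approved after {max_iterations} iterations: {feedback}")
```

Here `generate` stands in for an LLM call that receives the reviewer's feedback from the previous round. A syntax check alone cannot confirm that the code follows best practices, so a real reviewer would add further checks.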
sequenceDiagram
participant Customer as End Customer
participant API as Backend API
participant Cham as Chameleon
participant S3 as S3 Storage
participant Agent as Strands Agent
participant KB as Knowledge Base
participant Mem as Memory
Customer->>API: POST /chat/{agent_id}/{session_id}
API->>Cham: Invoke Agent
Cham->>S3: Retrieve agent_file.py
Cham->>Mem: Load Conversation History
Cham->>Agent: Execute with Context
Agent->>KB: Retrieve Information
KB-->>Agent: Relevant Documents
Agent->>Agent: Process with Tools
Agent->>Mem: Save Turn
Agent-->>Cham: Response
Cham-->>API: Result
API-->>Customer: Agent Response
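A simplified version of this request path is sketched below. An in-memory dictionary stands in for AgentCore Memory, a toy function stands in for the generated Strands agent, and the ten-turn window follows the "last 10 turns" figure from the memory feature list; all names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Turn = Tuple[str, str]  # (user message, agent reply)
# Conversation history keyed by (agent_id, session_id); stand-in for AgentCore Memory.
MEMORY: Dict[Tuple[str, str], List[Turn]] = defaultdict(list)
MAX_CONTEXT_TURNS = 10


def echo_agent(message: str, history: List[Turn]) -> str:
    """Toy agent standing in for generated Strands code."""
    return f"[{len(history)} prior turns] you said: {message}"


AGENTS: Dict[str, Callable[[str, List[Turn]], str]] = {"demo": echo_agent}


def handle_chat(agent_id: str, session_id: str, message: str) -> str:
    agent = AGENTS[agent_id]                    # stands in for retrieving agent_file.py
    key = (agent_id, session_id)
    history = MEMORY[key][-MAX_CONTEXT_TURNS:]  # load recent conversation history
    reply = agent(message, history)             # execute with context
    MEMORY[key].append((message, reply))        # save the turn
    return reply
```

Keying memory by both agent and session keeps one customer's conversation from leaking into another's, even though every request passes through the same handler.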
sequenceDiagram
participant Customer as End Customer
participant Voice as Voice Service
participant Nova as Nova Sonic
participant Cham as Chameleon
participant Agent as Strands Agent (tool for {agent_id})
Customer->>Voice: ws://{agent_id}/{session_id}
Voice->>Nova: Start Session
loop Conversation
Customer->>Voice: Audio Stream
Voice->>Nova: Audio Input
alt Needs Business Logic
Nova->>Cham: Invoke Agent Tool
Cham->>Agent: Execute
Agent-->>Cham: Response
Cham-->>Nova: Result
end
Nova->>Nova: Generate Speech
Nova->>Voice: Audio Output
Voice->>Customer: Audio Stream
end
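The branching inside the conversation loop can be sketched synchronously as follows. The event dictionaries (`"audio"`, `"tool_use"`, `"tool_result"`) are invented shapes for illustration and do not reflect the actual Nova Sonic event format; a real service would run this over an asynchronous WebSocket stream.

```python
from typing import Callable, Dict, Iterable

Event = Dict[str, str]


def run_voice_turns(
    model_events: Iterable[Event],
    invoke_agent: Callable[[str, str], str],
    agent_id: str,
    send_to_model: Callable[[Event], None],
    send_to_customer: Callable[[str], None],
) -> None:
    """Relay audio to the caller and route tool requests to the agent."""
    for event in model_events:
        kind = event.get("type")
        if kind == "tool_use":
            # The speech model needs business logic: call the agent as a tool
            # and hand the result back to the model.
            result = invoke_agent(agent_id, event["input"])
            send_to_model({"type": "tool_result", "output": result})
        elif kind == "audio":
            # Synthesized speech goes straight back to the customer.
            send_to_customer(event["data"])
        # Other event types are ignored in this sketch.
```

The agent is only invoked when the speech model asks for it, so turns that need no business logic avoid the extra round trip.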
- Upload SOP and knowledge base documents
- AgentCreator automatically designs optimal architecture
- Generates production-ready code in seconds
- No manual coding or deployment required
- DSPy Framework - Optimized LLM prompting
- LangGraph Orchestration - Multi-stage pipeline with quality gates
- MCP Tools - Accesses Strands and AgentCore documentation
- Iterative Refinement - Reviews and improves generated code
- Multi-Agent Support - Generates single or multi-agent architectures
- Generic Loader - One deployment for unlimited agents
- S3-Based Loading - Loads agent code on-demand
- Memory Injection - Automatic conversation history
- Session Isolation - Per-agent, per-customer separation
- Text Chat - REST API for web/mobile applications
- Voice Calls - WebSocket + AWS Nova Sonic for phone calls
- Unified Backend - Same agent code for both modes
- AgentCore Memory API - Persistent conversation history
- Automatic Context Loading - Last 10 turns loaded on init
- Multi-Session Support - Multiple conversations per customer
- 30-Day Retention - Configurable retention policies
- User Isolation - Strict tenant separation in DynamoDB
- API Key Management - Scoped keys per agent
- Cognito Authentication - Secure user management
- Role-Based Access - CHAT, VOICE, ADMIN permissions
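Per-agent scoped keys with CHAT, VOICE, and ADMIN permissions could be modeled along these lines. The record layout, the hashing choice, and the rule that ADMIN does not implicitly grant the other permissions are all assumptions of this sketch rather than documented Oratio behavior.

```python
import hashlib
import secrets
from dataclasses import dataclass
from enum import Flag, auto
from typing import Dict


class Permission(Flag):
    CHAT = auto()
    VOICE = auto()
    ADMIN = auto()


@dataclass(frozen=True)
class ApiKeyRecord:
    agent_id: str
    permissions: Permission


# Only key hashes are stored, never the raw key (stand-in for the API keys table).
KEYS: Dict[str, ApiKeyRecord] = {}


def _hash(raw_key: str) -> str:
    return hashlib.sha256(raw_key.encode()).hexdigest()


def issue_key(agent_id: str, permissions: Permission) -> str:
    """Create a key scoped to one agent and return the raw value once."""
    raw = secrets.token_urlsafe(32)
    KEYS[_hash(raw)] = ApiKeyRecord(agent_id, permissions)
    return raw


def authorize(raw_key: str, agent_id: str, needed: Permission) -> bool:
    """Allow the call only if the key exists, matches the agent, and has the permission."""
    record = KEYS.get(_hash(raw_key))
    if record is None or record.agent_id != agent_id:
        return False
    return needed in record.permissions
```

For example, a key issued with `Permission.CHAT | Permission.VOICE` for one agent is rejected both for ADMIN operations and for any other agent.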
graph TB
subgraph "Frontend Layer"
A[Next.js 15]
B[TypeScript]
C[Tailwind CSS]
D[shadcn/ui]
end
subgraph "Backend Layer"
E[FastAPI]
F[Python 3.11]
G[Pydantic]
H[boto3]
end
subgraph "Agent Creation"
I[DSPy]
J[LangGraph]
K[Bedrock Nova Pro]
L[MCP Tools]
end
subgraph "Generated Agents"
M[Strands SDK]
N[strands-tools]
O[AgentCore Memory]
end
subgraph "AWS Infrastructure"
P[CDK Python]
Q[Cognito]
R[DynamoDB]
S[S3]
T[Lambda]
U[Step Functions]
V[Bedrock]
end
| Layer | Technologies |
|---|---|
| Frontend | Next.js 15, TypeScript, Tailwind CSS, shadcn/ui |
| Backend | FastAPI, Python 3.11, Pydantic, boto3 |
| Agent Creation | DSPy, LangGraph, Bedrock Nova Pro, MCP Tools |
| Generated Agents | Strands SDK, strands-tools, AgentCore Memory |
| Infrastructure | AWS CDK, Cognito, DynamoDB, S3, Lambda, Step Functions |
| LLM Models | Nova Pro (text), Nova Sonic (voice), Claude (fallback) |
| CI/CD | GitHub Actions, Docker, ECR |
- User authentication and registration
- Agent creation wizard
- Knowledge base management
- API key generation
- Session monitoring (future)
- RESTful endpoints for agent management
- Cognito integration for authentication
- JWT token validation
- WebSocket support for voice (future)
- Chat endpoint with API key validation
- DSPy-powered code generation
- LangGraph workflow orchestration
- MCP documentation tools
- Syntax validation
- Multi-agent pattern support
- Dynamic agent code loading from S3
- Memory hook injection
- Session state management
- Tool execution environment
- DynamoDB tables (users, agents, knowledge bases, API keys)
- S3 buckets (documents, generated code)
- Lambda functions (KB provisioner, AgentCreator invoker)
- Step Functions (agent creation workflow)
- Cognito User Pool (authentication)
- IAM roles and policies
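For the DynamoDB tables listed above, one common way to get tenant separation is to make the owner part of every item's key so that all reads are scoped to a single owner. The sketch below mimics that with a dictionary keyed by `(owner_id, agent_id)`; the key layout and field names are assumptions, not the project's actual schema.

```python
from typing import Dict, List, Optional, Tuple

# (owner_id, agent_id) -> item; mirrors a partition key + sort key layout (assumed).
AGENTS_TABLE: Dict[Tuple[str, str], dict] = {}


def put_agent(owner_id: str, agent_id: str, name: str) -> None:
    AGENTS_TABLE[(owner_id, agent_id)] = {
        "owner_id": owner_id,
        "agent_id": agent_id,
        "name": name,
    }


def get_agent(owner_id: str, agent_id: str) -> Optional[dict]:
    """A lookup needs the owner, so another tenant's agent cannot be read by ID alone."""
    return AGENTS_TABLE.get((owner_id, agent_id))


def list_agents(owner_id: str) -> List[dict]:
    """Return only the items that belong to the given owner."""
    return [item for (owner, _), item in AGENTS_TABLE.items() if owner == owner_id]
```

With the owner taken from the authenticated user's token rather than from request parameters, a caller cannot reach another tenant's items by guessing agent IDs.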
graph TB
subgraph "GitHub"
A[Code Repository]
end
subgraph "CI/CD Pipeline"
B[GitHub Actions]
C[Docker Build]
D[ECR Push]
end
subgraph "AWS Infrastructure"
E[CDK Deployment]
F[CloudFormation]
end
subgraph "Deployed Services"
G[Frontend Vercel]
H[Backend ECS/Lambda]
I[AgentCore Runtimes]
end
A -->|Push to main| B
B --> C
C --> D
B --> E
E --> F
F --> G
F --> H
F --> I
Deployment Process:
- Code pushed to GitHub main branch
- GitHub Actions triggers CI/CD pipeline
- Docker images built for backend and agent services
- Images pushed to AWS ECR
- CDK deploys infrastructure (DynamoDB, S3, Lambda, etc.)
- AgentCore runtimes deployed (Chameleon, AgentCreator)
- Frontend deployed to Vercel/Amplify
- Authentication - AWS Cognito with JWT tokens
- Authorization - API keys with scoped permissions
- Data Isolation - Multi-tenant DynamoDB design
- Encryption - S3 encryption at rest, TLS in transit
- Secrets Management - AWS Secrets Manager
- Audit Logging - CloudWatch Logs and X-Ray tracing
- CORS - Configured for production origins only
- Horizontal Scaling - Chameleon handles concurrent requests
- Cost Optimization - Pay-per-invocation model
- Memory Efficiency - Shared runtime for all agents
- Storage - S3 for unlimited agent code storage
- Database - DynamoDB on-demand scaling
Phase: MVP Development
Completed:
- ✅ Frontend dashboard with authentication
- ✅ Backend API with Cognito integration
- ✅ AgentCreator meta-agent pipeline
- ✅ Chameleon generic loader
- ✅ AWS infrastructure (CDK)
- ✅ CI/CD pipeline (GitHub Actions)
- ✅ Text chat functionality
- ✅ Conversation memory system
- ✅ Voice agent integration (Nova Sonic)
In Progress:
- 🚧 Real-time call transcriptions for clients
- 🚧 Analytics dashboard
- AgentCreator Pipeline - Meta-agent workflow and LangGraph orchestration
- Voice Agents - Detailed voice agent architecture
