vaishnavi33/agentic-ai-operating-system

Agentic AI Operating System

A systems-design-focused AI orchestration framework that simulates an operating system for AI agents. Users submit tasks; a kernel orchestrates execution; specialized agents perform subtasks; memory is managed centrally; outputs are evaluated and refined.

This is not a chatbot — it is an explainable, traceable multi-agent execution platform with MCP-style tool routing.


Project Goals

  • Demonstrate AI OS concepts: kernel, scheduling, processes, memory, IPC
  • Provide a modular, extensible agent architecture
  • Enable full execution traces for observability
  • Support self-reflective reasoning via critic + refinement loops
  • Ship an MVP that runs locally with no external APIs

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         User Task                                │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                      MCP Kernel                                  │
│  Context Manager │ Process Registry │ Tool Router │ Traces      │
└────────────────────────────┬────────────────────────────────────┘
                             ▼
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
┌───────────────┐   ┌─────────────────┐   ┌──────────────────┐
│  Scheduler    │   │ Process Manager │   │ Memory Manager   │
│  Agent        │   │ Agent           │   │ Agent            │
└───────┬───────┘   └────────┬────────┘   └────────┬─────────┘
        │                    │                     │
        └────────────────────┼─────────────────────┘
                             ▼
              ┌──────────────────────────────┐
              │  Worker Agents               │
              │  Research │ Analysis │ Summary│
              └──────────────┬───────────────┘
                             ▼
              ┌──────────────────────────────┐
              │  Critic Agent                │
              │  (quality scoring)           │
              └──────────────┬───────────────┘
                             ▼
              ┌──────────────────────────────┐
              │  Refinement Loop (optional)  │
              └──────────────┬───────────────┘
                             ▼
              ┌──────────────────────────────┐
              │  Validator Agent             │
              └──────────────┬───────────────┘
                             ▼
              ┌──────────────────────────────┐
              │  Final Validated Output      │
              └──────────────────────────────┘
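
One way the Scheduler and Process Manager boxes could turn phase dependencies into an execution order is a topological sort. The sketch below is only an illustration using the standard library's graphlib; the PHASE_DEPENDENCIES mapping and the execution_order helper are hypothetical names, not taken from the repository.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each phase lists the phases that must finish before it starts.
PHASE_DEPENDENCIES = {
    "research": set(),
    "analyze": {"research"},
    "summarize": {"analyze"},
    "critique": {"summarize"},
    "validate": {"critique"},
}


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the phases in an order that respects every dependency.

    TopologicalSorter raises graphlib.CycleError if the phases
    depend on each other in a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(execution_order(PHASE_DEPENDENCIES))
```

For a strictly linear chain like this one, the result is the same phase order that appears in the sample execution trace. A real scheduler would presumably also weigh priorities, which this sketch ignores.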

Agent Flow

| Stage | Component | Responsibility |
|-------|-----------|----------------|
| 1 | Task Parser | Extract intent, entities, phases |
| 2 | MCP Kernel | Register task, route tools, log traces |
| 3 | Scheduler | Prioritize and order execution phases |
| 4 | Process Manager | Split into sub-processes with dependencies |
| 5 | Memory Manager | Load context, persist outputs |
| 6 | Workers | Execute research, analysis, summarization |
| 7 | Critic | Score completeness, reasoning, structure, evidence |
| 8 | Refinement | Re-run weak phases (max 2 rounds) |
| 9 | Validator | Final structure and consistency checks |
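
Stages 7 and 8 form a bounded feedback loop: the critic scores a draft, and weak output is re-run at most twice. A minimal sketch of that control flow follows. The 0.8 threshold, the Critique fields, and the toy critic and refine functions are assumptions made for illustration; only the two-round cap comes from the table above.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_REFINEMENT_ROUNDS = 2  # stated in the Agent Flow table
SCORE_THRESHOLD = 0.8      # assumed value; the README does not give one


@dataclass
class Critique:
    score: float
    feedback: list[str] = field(default_factory=list)


def refine_until_acceptable(
    draft: str,
    critic: Callable[[str], Critique],
    refine: Callable[[str, Critique], str],
    threshold: float = SCORE_THRESHOLD,
    max_rounds: int = MAX_REFINEMENT_ROUNDS,
) -> tuple[str, Critique, int]:
    """Score the draft and refine it until it passes or rounds run out."""
    critique = critic(draft)
    rounds = 0
    while critique.score < threshold and rounds < max_rounds:
        draft = refine(draft, critique)
        critique = critic(draft)
        rounds += 1
    return draft, critique, rounds


# Toy stand-ins (hypothetical): each mention of "evidence" adds 0.5 to the score.
def toy_critic(draft: str) -> Critique:
    score = min(1.0, 0.5 * draft.count("evidence"))
    return Critique(score, [] if score >= SCORE_THRESHOLD else ["add evidence"])


def toy_refine(draft: str, critique: Critique) -> str:
    return draft + " evidence"


if __name__ == "__main__":
    print(refine_until_acceptable("claim", toy_critic, toy_refine))
```

Capping the rounds keeps the loop from running forever when the critic never passes a draft; in that case the last draft is returned together with its failing critique.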

MCP Kernel

The kernel acts as the AI operating system core:

  • Context registration — per-task execution state
  • Process registry — track sub-process lifecycle
  • Execution router — MCP-style tool dispatch (parse_task, search, summarize)
  • Shared memory — short-term (session) and long-term (knowledge)
  • Event logging — ExecutionTrace for every kernel event
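
The responsibilities listed above could fit together roughly as in the sketch below. It is not the repository's kernel: MiniKernel, its method names, and the fields on ExecutionTrace are all hypothetical, chosen only to show context registration, name-based tool dispatch, and one trace record per event.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class ExecutionTrace:
    # Field names are assumptions; the README only names the class.
    task_id: str
    event: str
    detail: dict[str, Any] = field(default_factory=dict)


class MiniKernel:
    """Hypothetical, stripped-down kernel for illustration only."""

    def __init__(self) -> None:
        self.contexts: dict[str, dict[str, Any]] = {}
        self.tools: dict[str, Callable[..., Any]] = {}
        self.traces: list[ExecutionTrace] = []

    def register_tool(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def register_context(self, task_id: str, request: str) -> None:
        # Per-task execution state lives under the task id.
        self.contexts[task_id] = {"request": request}
        self._log(task_id, "context_registered", request=request)

    def route(self, task_id: str, tool: str, **kwargs: Any) -> Any:
        # Dispatch by tool name and record a trace before and after the call.
        if tool not in self.tools:
            raise KeyError(f"unknown tool: {tool}")
        self._log(task_id, "tool_called", tool=tool)
        result = self.tools[tool](**kwargs)
        self._log(task_id, "tool_returned", tool=tool)
        return result

    def _log(self, task_id: str, event: str, **detail: Any) -> None:
        self.traces.append(ExecutionTrace(task_id, event, detail))


if __name__ == "__main__":
    kernel = MiniKernel()
    kernel.register_tool("summarize", lambda text: text[:5])
    kernel.register_context("task-1", "demo request")
    print(kernel.route("task-1", "summarize", text="hello world"))
    print([trace.event for trace in kernel.traces])
```

Routing every tool call through one method is what makes the trace complete: nothing reaches a tool without leaving a record.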

Memory System

| Type | Module | Purpose |
|------|--------|---------|
| Short-term | memory/short_term_memory.py | Active context, worker outputs |
| Long-term | memory/long_term_memory.py | Completed tasks, reusable summaries |

API (via MemoryManagerAgent):

  • save_memory(key, value, scope=...)
  • retrieve_memory(key, scope=...)
  • clear_memory(scope=...)
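
To make the three calls concrete, here is a minimal in-memory sketch with the same method names. The README elides the scope values, so the "short_term" and "long_term" strings, the default scope, and the MemorySketch class itself are assumptions rather than the actual MemoryManagerAgent.

```python
from typing import Any


class MemorySketch:
    """Hypothetical stand-in for the memory API, backed by plain dicts."""

    SCOPES = ("short_term", "long_term")  # assumed scope names

    def __init__(self) -> None:
        self._stores: dict[str, dict[str, Any]] = {s: {} for s in self.SCOPES}

    def _store(self, scope: str) -> dict[str, Any]:
        if scope not in self._stores:
            raise ValueError(f"unknown scope: {scope}")
        return self._stores[scope]

    def save_memory(self, key: str, value: Any, scope: str = "short_term") -> None:
        self._store(scope)[key] = value

    def retrieve_memory(self, key: str, scope: str = "short_term") -> Any:
        # Returns None when the key is absent from the chosen scope.
        return self._store(scope).get(key)

    def clear_memory(self, scope: str = "short_term") -> None:
        self._store(scope).clear()


if __name__ == "__main__":
    memory = MemorySketch()
    memory.save_memory("research_output", {"notes": "draft"})
    memory.save_memory("task_summary", "reusable summary", scope="long_term")
    memory.clear_memory()  # clears only the short-term store
    print(memory.retrieve_memory("research_output"))
    print(memory.retrieve_memory("task_summary", scope="long_term"))
```

Keeping the scopes in separate stores means clearing session state after a task cannot touch long-term knowledge.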

Project Structure

agentic_ai_operating_system/
├── agents/           # Scheduler, workers, critic, validator, orchestrator
├── kernel/           # MCP kernel, context, router
├── memory/           # Short/long-term stores
├── tools/            # Task parser, search, summarizer
├── tasks/            # Sample user tasks
├── logs/             # Runtime logs
├── models.py         # Pydantic domain models
├── utils.py          # Logging and formatting
└── main.py           # CLI entry point

Installation

Requires uv and Python 3.11+.

cd agentic_ai_operating_system
uv sync
uv run python main.py

Run a custom task:

uv run python main.py "Research NVIDIA earnings and summarize key AI market risks."

Sample Execution Trace

==================================================
AI OPERATING SYSTEM TRACE
==================================================

[TASK]
User Request: Research NVIDIA earnings and summarize key AI market risks.

[SCHEDULER]
Execution Order: ['research', 'analyze', 'summarize', 'critique', 'validate']

[PROCESS MANAGER]
Created Processes: [ Research NVIDIA, Analyze findings, Summarize results, ... ]

[MEMORY]
Loaded Context: { prior_knowledge_count, short_term_keys, ... }

[WORKER OUTPUT]
[ research_worker, analysis_worker, summary_worker outputs ... ]

[CRITIC]
Score: 0.82
Feedback: [ ... ]

[VALIDATOR]
Validation Passed

==================================================
FINAL OUTPUT
==================================================
{
  "summary": "...",
  "confidence": 0.92,
  "execution_steps": [...],
  "memory_used": [...],
  "critic_score": 0.90
}
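
The final output above is a JSON object with five keys, which suggests the kind of structural check the Validator stage might perform. The sketch below is an assumption about what such a check could look like: validate_output is a hypothetical helper, and the rule that both scores lie between 0 and 1 is inferred from the sample values, not stated by the project.

```python
from typing import Any

# Keys taken from the sample final output.
REQUIRED_KEYS = {
    "summary",
    "confidence",
    "execution_steps",
    "memory_used",
    "critic_score",
}


def validate_output(output: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    problems: list[str] = []

    missing = REQUIRED_KEYS - output.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    # Assumed rule: scores are numbers in the range [0, 1].
    for name in ("confidence", "critic_score"):
        if name in output:
            value = output[name]
            if not isinstance(value, (int, float)) or not 0.0 <= value <= 1.0:
                problems.append(f"{name} must be a number in [0, 1]")

    if "summary" in output and not str(output["summary"]).strip():
        problems.append("summary is empty")

    return problems


if __name__ == "__main__":
    sample = {
        "summary": "placeholder summary",
        "confidence": 0.92,
        "execution_steps": [],
        "memory_used": [],
        "critic_score": 0.90,
    }
    print(validate_output(sample))
```

Returning a list of problems instead of raising on the first one lets a trace report every structural issue in a single pass.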

Design Principles

  • Modularity — agents and tools are independently replaceable
  • Orchestration — single orchestrator coordinates pipeline stages
  • Explainability — structured traces at every step
  • Traceability — kernel ExecutionTrace + file logs
  • Extensibility — add workers, tools, or persistence without rewriting kernel

Future Improvements

  • Real MCP server transport (stdio/HTTP) for tool routing
  • LLM-backed workers with structured output schemas
  • Persistent long-term memory (vector DB / SQLite)
  • Parallel worker execution via async process pool
  • Priority queues and preemption in scheduler
  • Distributed agent runtime and health monitoring
  • Web dashboard for live trace visualization
  • Pluggable critic models and human-in-the-loop validation

License

MIT — use freely for learning and extension.
