A lightweight, self-contained Python CLI chat system with local AI agents and file operations. It runs Hyena3-4B-Instruct (a custom Qwen fork) with llama-cpp-python for inference and uses Rich for a polished console interface. Features include AI agents with workspace-based file operations and dynamic terminal sizing. This has been a really fun side project and a nice addition to the tiny-local-llm ecosystem.
- Auto-Memory System: Conversations auto-save, AI extracts insights, context injects into prompts
- Agentic Tool Loop: AI plans and executes multi-step operations with tool calls
- Local LLM: Uses Hyena3-4B with llama-cpp-python for complete privacy
- Rich Terminal UI: Modern terminal interface with streaming responses and tool panels
- Permission System: Y/N/Always/Never approval for dangerous operations
- No Manual Saves: Everything auto-persists to the `.hyena/` directory
- Modular Architecture: 22 clean modules under 200 lines each
- PEP 8 Compliant: follows Google style conventions to keep the code easy to read
```bash
# Clone and setup
git clone <repository>
cd Hyena-3
uv sync

# Run the application
uv run python -m app.app

# Or use the batch file
run_app.bat
```

Requirements:
- Python 3.10+
- UV package manager
- Hyena3-4B model (auto-downloaded)
Every conversation is automatically persisted without manual /save commands.
- Auto-Save: Each message saved to `.hyena/conversations/auto/`
- Insight Extraction: Every 5 messages, AI extracts key facts and decisions
- Context Injection: Relevant memories automatically added to system prompt
- Memory Commands: `/memory list`, `/memory load`, `/memory delete`
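The auto-save and extraction cycle described above can be sketched in a few lines. This is a minimal illustration, not the actual module API: the class name, the stubbed extraction pass, and the single `current.json` file are assumptions.

```python
import json
from pathlib import Path


class AutoMemoryStore:
    """Sketch of the auto-memory flow: every message persists to disk,
    and every EXTRACT_INTERVAL messages a (stubbed) insight pass runs."""

    EXTRACT_INTERVAL = 5

    def __init__(self, root=".hyena/conversations/auto"):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.messages = []
        self.insights = []

    def add_message(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Auto-save: no manual /save command required
        (self.root / "current.json").write_text(json.dumps(self.messages))
        if len(self.messages) % self.EXTRACT_INTERVAL == 0:
            self._extract_insights()

    def _extract_insights(self):
        # Placeholder for the LLM extraction pass over recent messages
        self.insights.append(f"summary of last {self.EXTRACT_INTERVAL} messages")

    def context_block(self):
        # Relevant memories get injected into the system prompt
        return "\n".join(self.insights)
```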
When you ask the AI to perform operations:
- Gather: AI analyzes your request and available context
- Plan: Creates step-by-step execution plan with tool calls
- Act: Executes tools (read_file, write_file, bash, etc.)
- Verify: Checks results and retries if needed
- Display: Tool calls and results shown in bordered panels
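The Act and Verify steps above can be sketched as a retry loop over planned tool calls. The function shape and retry policy here are assumptions, with Gather and Plan stubbed out (in the real system the LLM produces the plan):

```python
def run_agentic_loop(plan, tools, max_retries=2):
    """Sketch of the Act/Verify portion of the loop.

    `plan` is a list of (tool_name, kwargs) pairs, standing in for the
    output of the Gather and Plan steps."""
    results = []
    for name, kwargs in plan:
        for attempt in range(max_retries + 1):
            result = tools[name](**kwargs)   # Act: execute the tool call
            if result is not None:           # Verify: retry on failure
                results.append((name, result))
                break
    return results
```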
Dangerous operations require approval:
- Always Safe: Read, Glob, Grep (auto-approve)
- Needs Approval: Write, Edit, Bash, WebFetch (prompt user)
- Modes: auto/smart/always-ask
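One way the safe/dangerous split and the Y/N/Always/Never prompt could map to a decision function is sketched below. The helper name, signature, and the exact semantics of `smart` mode are assumptions, not the real `permission_system.py` API:

```python
# Tool names follow the Always Safe / Needs Approval lists above.
SAFE_TOOLS = {"Read", "Glob", "Grep"}
DANGEROUS_TOOLS = {"Write", "Edit", "Bash", "WebFetch"}


def needs_approval(tool, mode="smart", always=frozenset(), never=frozenset()):
    """Return True if the user must be prompted before running `tool`.

    `always` / `never` hold tools the user previously answered
    Always / Never for at the approval prompt."""
    if tool in never:
        raise PermissionError(f"{tool} was permanently denied")
    if tool in always or tool in SAFE_TOOLS:
        return False                      # safe tools auto-approve
    if mode == "auto":
        return tool in DANGEROUS_TOOLS    # only known-dangerous tools prompt
    return True                           # smart / always-ask: prompt
```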
```
Hyena-3/
├── app/                  # Main application
│   ├── core/             # Chat system, commands, conversations
│   │   ├── chat/         # Chat functionality mixins
│   │   └── commands/     # Command processing system
│   ├── memory/           # Auto-memory, extraction, retrieval
│   │   ├── orchestrator/ # Memory coordination
│   │   ├── project/      # Project-specific memory
│   │   └── retrieval/    # Smart memory retrieval
│   ├── agents/           # Agentic loop, tools, permissions
│   │   ├── loop/         # Gather -> Plan -> Act -> Verify
│   │   └── tools/        # File, shell, workspace tools
│   ├── ui/               # Terminal interface
│   ├── workspace/        # File operations
│   ├── utils/            # Git, terminal helpers
│   └── llm/              # Model engine
├── docs/                 # Comprehensive documentation
│   ├── api/              # API reference
│   ├── guides/           # Development guides
│   └── architecture/     # System architecture
└── README.md             # This file
```
Edit `app/data/agents.json` to customize AI personalities:

```json
{
  "agents": [
    {
      "name": "Code Expert",
      "specialty": "Programming and debugging",
      "system_prompt": "You are an expert programmer..."
    }
  ]
}
```

- Auto-save: Every message
- Extraction interval: Every 5 messages (configurable)
- Max stored memories: 100 (prunes oldest)
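The agents file shown above could be loaded with a few lines. The loader name and return shape are assumptions for illustration; only the field names follow the example:

```python
import json
from pathlib import Path


def load_agents(path="app/data/agents.json"):
    """Sketch of reading the agents file, keyed by agent name."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {agent["name"]: agent for agent in data["agents"]}
```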
```python
# In app/agents/permission_system.py
from enum import Enum

class PermissionMode(Enum):
    ASK = "ask"    # Prompt for dangerous operations
    AUTO = "auto"  # Auto-accept safe operations, ask for dangerous ones
```

| Command | Description |
|---|---|
| `/help` | Show all available commands |
| `/agents` | List available AI personalities |
| `/switch <n>` | Switch to agent number n |
| `/memory` | Show memory status and recent conversations |
| `/memory list` | List all saved conversations |
| `/memory load <file>` | Load a conversation |
| `/workspace <dir>` | Set working directory |
| `/tools` | List available tools |
| `/agentic` | Agentic loop management |
| `/session` | Session management |
| `/compact` | Compact conversation history |
| `/clear` | Clear screen |
| `/status` | System status overview |
```
User Input -> ChatSystem -> AutoMemory -> LLM -> Response
                                 ↓
               Tool Calls -> AgenticLoop -> Tool Execution
                                 ↓
              Permission Check -> User Approval (if needed)
```
```
Message -> ConversationStore -> Auto-save to disk
                 ↓
Every 5 messages -> MemoryExtractor -> Structured insights
                 ↓
MemoryRetrieval -> Context injection -> Enhanced prompt
```
```
Gather Context -> Create Plan -> Execute Tools -> Verify Results
      ↓               ↓               ↓                ↓
  User Input      AI Planning     Tool Calls      Result Check
```
- 200-line limit per file (strictly enforced)
- Mixin pattern for modular functionality
- PEP 8 compliance (Google standards)
- Single responsibility
- Comprehensive documentation
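The mixin pattern mentioned above can be sketched as plain multiple inheritance; the class and method names here are illustrative, not the actual `app.core.chat` API:

```python
class MemoryMixin:
    """One responsibility: remembering facts."""
    def remember(self, fact):
        self.memories = getattr(self, "memories", []) + [fact]


class ToolMixin:
    """One responsibility: running tools."""
    def run_tool(self, name):
        return f"ran {name}"


class ChatSystem(MemoryMixin, ToolMixin):
    """Composes small mixins so each file stays under the 200-line limit."""
    pass
```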
- Add the tool function to the appropriate mixin in `app/agents/tools/`
- Register it in `app/agents/tools/base.py`
- Set its permission level in `app/agents/permission_system.py`
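A registry-decorator sketch of these steps is shown below; the registry and decorator names are assumptions, and the real `base.py` API may differ:

```python
# Hypothetical tool registry; permission levels mirror the safe/ask split.
TOOL_REGISTRY = {}


def register_tool(name, permission="ask"):
    """Decorator that registers a tool with its permission level."""
    def decorator(fn):
        TOOL_REGISTRY[name] = {"fn": fn, "permission": permission}
        return fn
    return decorator


@register_tool("read_file", permission="safe")
def read_file(path):
    """Safe, read-only tool: returns a file's contents."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```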
- Create the command method in the appropriate mixin in `app/core/commands/`
- Register it in `app/core/commands/base.py`
- Update help text and autocomplete
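The registration step could look like the sketch below; the `CommandRegistry` class and decorator are illustrative stand-ins for whatever `base.py` actually provides:

```python
class CommandRegistry:
    """Hypothetical command registry keyed by slash-command name."""
    commands = {}

    @classmethod
    def register(cls, name, help_text):
        def decorator(fn):
            # Storing help text here keeps /help and autocomplete in sync
            cls.commands[name] = {"fn": fn, "help": help_text}
            return fn
        return decorator


@CommandRegistry.register("/status", "System status overview")
def cmd_status(args):
    return "ok"
```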
Edit components in `app/ui/`:
- `banner.py` - Welcome banner ASCII art
- `panels.py` - Tool call/result rendering
- `prompt.py` - Input handling and autocomplete
```bash
# Syntax check all files
python -m py_compile app/core/chat.py app/ui/tui.py app/memory/orchestrator.py

# Run the app
uv run python -m app.app

# Test specific components
python -c "from app.agents.loop import AgenticLoop; print('AgenticLoop imports')"
python -c "from app.memory.orchestrator import AutoMemoryOrchestrator; print('MemorySystem imports')"
```

- API Reference - Complete API documentation
- Architecture Guide - System design and patterns
- Development Guide - Contributing and extending
- Memory System - Auto-memory deep dive
- Agent System - Agentic loop documentation
- Fork the repository
- Create a feature branch
- Follow code quality standards (200-line limit, PEP 8)
- Add tests for new functionality
- Update documentation
- Submit a pull request
MIT License - See LICENSE file
- Documentation - Complete documentation
- Issues - Bug reports and feature requests
- Discussions - Community discussions
- Releases - Version history




