RoboFinSystems/robosystems


RoboSystems

RoboSystems is a financial intelligence platform that connects disparate data sources, builds domain ontologies as knowledge graphs, and provides AI-powered tools for accounting, financial reporting, investment management, and analysis. It powers RoboLedger and RoboInvestor.

  • LadybugDB Graph Database: Embedded columnar graph database with native DuckDB staging, LanceDB vector search, and tiered infrastructure
  • Extensions: Domain schemas that drive OLTP tables, API routes, data pipelines, and dedicated frontend apps. Extensions share a single database with schema-per-tenant isolation and materialize to the graph
  • Document Search: Full-text and semantic search across SEC filings, uploaded documents, and connected sources via OpenSearch
  • AI-Native Architecture: Context graphs with embeddings, semantic enrichment, and confidence scoring for LLM-powered analytics
  • Model Context Protocol (MCP): Standardized server and client for LLM integration with schema-aware tools
  • Multi-Source Data Integration: SEC XBRL filings, QuickBooks accounting data via dbt pipelines, and custom financial datasets
  • Enterprise-Ready Infrastructure: Multi-tenant architecture with tiered scaling and production-grade query management
  • Developer-First API: RESTful API designed for integration with financial applications

Platform

The platform provides the core infrastructure that all extensions build on:

  • Dedicated Infrastructure: Tiered graph infrastructure with dedicated instances and configurable memory allocation
  • Subgraphs (Workspaces): AI memory graphs, data workspaces with fork & publish, and isolated environments for development and team collaboration
  • AI Agent Interface: Natural language financial analysis with text-to-Cypher via Model Context Protocol (MCP)
  • Shared Repositories: SEC XBRL filings knowledge graph for context mining and benchmarking
  • Document Management: Upload, index, and search documents with full-text and semantic search via OpenSearch
  • DuckDB Staging System: High-performance data validation and bulk ingestion pipeline
  • Dagster Orchestration: Data pipeline orchestration for SEC filings, QuickBooks sync, backups, billing, and scheduled jobs
  • Credit-Based Billing: Flexible credits for AI operations based on token usage

Extensions

Each extension defines a domain schema and provides OLTP tables, API routes, data pipelines, and a dedicated frontend app. All extensions share a single PostgreSQL database with schema-per-tenant isolation and materialize to the graph. See Schema Extensions for details.
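A minimal sketch of what schema-per-tenant isolation could look like at the SQL level. The `tenant_schema` and `qualified_table` helpers, the `tenant_` prefix, and the `journal_entries` table name are illustrative assumptions, not names taken from the RoboSystems codebase.

```python
import re

# Hypothetical naming rule: lowercase identifiers only, one schema per tenant.
_IDENTIFIER = re.compile(r"[a-z0-9_]{1,48}")


def tenant_schema(tenant_id: str) -> str:
    """Return the schema name for a tenant, rejecting unsafe identifiers."""
    if not _IDENTIFIER.fullmatch(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    return f"tenant_{tenant_id}"


def qualified_table(tenant_id: str, table: str) -> str:
    """Build a schema-qualified, double-quoted table reference."""
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"invalid table name: {table!r}")
    return f'"{tenant_schema(tenant_id)}"."{table}"'


if __name__ == "__main__":
    # Every tenant queries the same table name inside its own schema.
    print(f"SELECT count(*) FROM {qualified_table('acme', 'journal_entries')};")
```

Validating identifiers before interpolating them matters here because schema and table names cannot be passed as ordinary bind parameters.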

RoboLedger

Accounting and financial reporting extension. It provides an OLTP general ledger on schema-per-tenant PostgreSQL (accounts, transactions, journal entries, line items, dimensions), a QuickBooks ELT pipeline via dbt/Dagster, SEC XBRL financial reporting, and a chart of accounts.

RoboInvestor

Portfolio management and investment tracking extension covering securities, positions, trades, benchmarks, market data, and risk. It has a dedicated frontend app; the OLTP database and API routes are planned.

Quick Start

Docker Development Environment

# Install uv and just
brew install uv just

# Start the RoboSystems backend API
just start

# Start frontend apps - robosystems-app, roboledger-app, roboinvestor-app
just start apps

This initializes the .env file and starts the complete RoboSystems stack with:

  • Graph API with LadybugDB and DuckDB backends
  • Dagster for data pipeline orchestration
  • PostgreSQL for graph metadata, IAM, and Dagster
  • Valkey for caching, SSE messaging, and rate limiting
  • OpenSearch for full-text and semantic document search
  • Localstack for S3 and DynamoDB emulation

Service URLs:

Service       URL
Main API      http://localhost:8000
Graph API     http://localhost:8001
Dagster UI    http://localhost:8002

With just start apps (frontend apps):

App                 URL
RoboSystems App     http://localhost:3000
RoboLedger App      http://localhost:3001
RoboInvestor App    http://localhost:3002
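To confirm the stack came up, a short script can probe each port listed above. This is a sketch using only the standard library; it checks that something accepts TCP connections on the port, not that the service behind it is healthy. The `is_listening` helper is illustrative and not part of the project.

```python
import socket

# Ports as listed in the service and app tables above.
SERVICES = {
    "Main API": 8000,
    "Graph API": 8001,
    "Dagster UI": 8002,
    "RoboSystems App": 3000,
    "RoboLedger App": 3001,
    "RoboInvestor App": 3002,
}


def is_listening(port: int, host: str = "localhost", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if is_listening(port) else "down"
        print(f"{name:<18} http://localhost:{port}  {status}")
```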

Local Development

# Setup Python environment (uv automatically handles Python versions)
just init

Examples

See RoboSystems in action with runnable demos that create graphs, load data, and execute queries with the robosystems-client:

just demo-sec               # Loads NVIDIA's SEC XBRL data via Dagster pipeline
just demo-accounting        # Creates chart of accounts with 6 months of transactions
just demo-custom-graph      # Builds custom graph schema with relationship networks

Each demo has a corresponding Wiki article with detailed guides.

Development Commands

Testing

just test-all               # Tests plus code quality checks
just test                   # Default test suite
just test adapters          # Test specific module
just test-cov               # Tests with coverage

Log Monitoring

just logs container=api                 # View API logs (last 100 lines)
just logs container=graph-api           # View Graph API logs (last 100 lines)
just logs container=dagster-webserver   # View Dagster Webserver logs
just logs container=dagster-daemon      # View Dagster Daemon logs

See justfile for 50+ development commands including database migrations, CloudFormation linting, graph operations, administration, and more.

Prerequisites

System Requirements

  • Docker & Docker Compose
  • 8GB RAM minimum
  • 20GB free disk space

Required Tools

  • uv for Python package and version management
  • just as the project command runner
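The requirements above can be checked before running `just start`. The sketch below looks for `docker`, `uv`, and `just` on the PATH and compares free disk space against the 20GB figure. It does not check RAM or the Docker Compose plugin, and the helper names are illustrative.

```python
import shutil

REQUIRED_TOOLS = ("docker", "uv", "just")
MIN_FREE_DISK_GB = 20  # from the system requirements above


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the required tools that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def free_disk_gb(path: str = ".") -> float:
    """Return free space on the filesystem containing path, in GB."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    missing = missing_tools()
    print("missing tools:", ", ".join(missing) or "none")
    free = free_disk_gb()
    verdict = "ok" if free >= MIN_FREE_DISK_GB else "below the 20GB minimum"
    print(f"free disk: {free:.1f} GB ({verdict})")
```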

Deployment Requirements

  • Fork this repo
  • AWS account with IAM Identity Center (SSO)
  • Run just bootstrap to configure OIDC and GitHub variables

See the Bootstrap Guide for complete instructions.

Architecture

The RoboSystems architecture has four parts: an application layer, the LadybugDB graph database, a data layer, and AWS infrastructure.

Application Layer:

  • FastAPI REST API with versioned endpoints
  • Extension API routes feature-flagged per module
  • MCP Server for AI-powered graph database access with schema-aware tools
  • Agent Interface for text-to-Cypher natural language queries
  • Dagster for data pipeline orchestration and background jobs

LadybugDB Graph Database (configuration):

  • Embedded columnar graph database purpose-built for financial analytics
  • Base + extension schema architecture — extensions define domain models
  • Native DuckDB integration for high-performance staging and ingestion
  • LanceDB vector search for semantic element resolution (IVF-PQ indexes, 384-dim embeddings)
  • Tiered infrastructure with configurable memory, rate limits, and subgraph allocations
  • Shared tier hosts public repositories with read replicas

Data Layer:

  • PostgreSQL for IAM, graph metadata, Dagster, and extension OLTP databases (schema-per-tenant)
  • OpenSearch for full-text and semantic document search (BM25 + KNN)
  • Valkey for caching, SSE messaging, and rate limiting
  • AWS S3 for data lake storage and static assets
  • DynamoDB for instance/graph/volume registry
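The "BM25 + KNN" pairing mentioned for OpenSearch can be expressed as a hybrid query, which combines a lexical `match` clause with a `knn` clause. The sketch below only builds the request body. The field names `content` and `embedding` are assumptions, and OpenSearch also needs a search pipeline with a normalization processor to merge the two score sets, which is not shown.

```python
import json


def hybrid_query(text: str, vector: list[float], k: int = 10) -> dict:
    """Build an OpenSearch hybrid query body with BM25 and k-NN clauses.

    The field names `content` and `embedding` are hypothetical.
    """
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical clause, scored with BM25.
                    {"match": {"content": {"query": text}}},
                    # Vector clause, scored by embedding similarity.
                    {"knn": {"embedding": {"vector": vector, "k": k}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = hybrid_query("revenue recognition policy", [0.1, 0.2, 0.3])
    print(json.dumps(body, indent=2))
```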

Infrastructure:

  • ECS Fargate for API and Dagster
  • EC2 ASG for LadybugDB writer clusters
  • EC2 ALB + ASG for LadybugDB shared replica clusters
  • RDS PostgreSQL + ElastiCache Valkey
  • OpenSearch for full-text and semantic document search
  • CloudFormation infrastructure deployed via GitHub Actions with OIDC

For detailed architecture documentation, see the Architecture Overview in the Wiki.

SEC Shared Repository

A curated knowledge graph of US public company financial data from SEC EDGAR XBRL filings. Runs on the shared LadybugDB tier, accessible via MCP tools, Cypher queries, and the AI agent.

  • Pipeline: EDGAR → Download → Process (Parquet) → Stage (DuckDB) → Enrich (fastembed) → Materialize (LadybugDB) → Index + Embed (OpenSearch)
  • Graph: 14 node types and 24 relationship types modeling the full XBRL reporting hierarchy
  • Search: Hybrid BM25 + KNN vector search across XBRL text blocks, narrative sections, and iXBRL disclosures
  • Enrichment: Semantic element mapping, statement classification, and disclosure tagging via the Seattle Method taxonomy

just sec-load NVDA 2025  # Load NVIDIA filings for 2025
just sec-health          # Check SEC database health
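The stage order in the pipeline bullet can be pictured as a chain of functions that pass a context along. This is only a conceptual sketch: the real orchestration runs in Dagster, and the stage bodies here are placeholders that record their own name. `make_stage` and `run_pipeline` are hypothetical helpers.

```python
from typing import Callable

Context = dict

# Stage names and targets follow the pipeline bullet above.
STAGES: list[tuple[str, str]] = [
    ("download", "EDGAR"),
    ("process", "Parquet"),
    ("stage", "DuckDB"),
    ("enrich", "fastembed"),
    ("materialize", "LadybugDB"),
    ("index_embed", "OpenSearch"),
]


def make_stage(name: str, target: str) -> Callable[[Context], Context]:
    """Create a placeholder stage that only records that it ran."""
    def run(ctx: Context) -> Context:
        done = ctx.get("completed", []) + [name]
        return {**ctx, "completed": done, "last_target": target}
    return run


def run_pipeline(ticker: str, year: int) -> Context:
    """Run every stage in order, threading a context dict through them."""
    ctx: Context = {"ticker": ticker, "year": year}
    for name, target in STAGES:
        ctx = make_stage(name, target)(ctx)
    return ctx


if __name__ == "__main__":
    print(run_pipeline("NVDA", 2025)["completed"])
```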

See SEC Adapter and SEC Pipeline for detailed documentation.

AI

Model Context Protocol (MCP)

  • Financial Analysis: Natural language queries across enterprise data and public benchmark data
  • Cross-Database Queries: Compare user graph data against SEC shared repository data
  • Tools: Rich toolkit for graph queries, schema introspection, fact discovery, financial analysis, document search, and AI memory operations
  • Handler Pool: Managed MCP handler instances with resource limits
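One common way to cap concurrent handlers is a semaphore-guarded pool. The sketch below illustrates that idea with `asyncio`; the `HandlerPool` class, its limit of two, and `fake_tool_call` are hypothetical and do not describe how RoboSystems implements its pool.

```python
import asyncio


class HandlerPool:
    """Hypothetical pool that caps how many handlers run at once."""

    def __init__(self, max_handlers: int) -> None:
        self._slots = asyncio.Semaphore(max_handlers)
        self.active = 0
        self.peak = 0

    async def run(self, tool_call):
        """Run an async tool call once a handler slot is free."""
        async with self._slots:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await tool_call()
            finally:
                self.active -= 1


async def fake_tool_call() -> str:
    await asyncio.sleep(0.01)  # stand-in for a graph query
    return "ok"


async def main() -> int:
    pool = HandlerPool(max_handlers=2)
    # Five calls compete for two slots; the rest wait their turn.
    await asyncio.gather(*(pool.run(fake_tool_call) for _ in range(5)))
    return pool.peak


if __name__ == "__main__":
    print("peak concurrent handlers:", asyncio.run(main()))
```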

Agent System

  • Multi-agent architecture with intelligent routing
  • Dynamic agent selection based on query context
  • Parallel query processing with context-aware responses
  • Extensible framework for custom domain expertise

Credit System

  • AI Operations Only: Credits are consumed exclusively by AI agent calls (Anthropic Claude via AWS Bedrock)
  • Token-Based Billing: ~1-2 credits per text-to-Cypher call based on actual token usage and cost
  • MCP Tool Access: External MCP calls that do not use agent-based tools consume no credits
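Token-based billing can be sketched as a conversion from token counts to model cost and then to credits. Every rate below is a placeholder chosen for illustration and is not RoboSystems or Bedrock pricing; `credits_for_call` is a hypothetical helper.

```python
from decimal import Decimal, ROUND_HALF_UP

# Placeholder rates for illustration only, not actual pricing.
USD_PER_1K_INPUT_TOKENS = Decimal("0.003")
USD_PER_1K_OUTPUT_TOKENS = Decimal("0.015")
USD_PER_CREDIT = Decimal("0.01")


def credits_for_call(input_tokens: int, output_tokens: int) -> Decimal:
    """Convert token usage to credits via the model cost in USD."""
    cost = (
        Decimal(input_tokens) / 1000 * USD_PER_1K_INPUT_TOKENS
        + Decimal(output_tokens) / 1000 * USD_PER_1K_OUTPUT_TOKENS
    )
    credits = cost / USD_PER_CREDIT
    return credits.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(credits_for_call(input_tokens=2000, output_tokens=500))
```

Using `Decimal` rather than floats keeps the rounding of billed amounts predictable.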

Client Libraries

RoboSystems provides three client libraries for building applications:

MCP (Model Context Protocol) Client

AI integration client for connecting Claude and other LLMs to RoboSystems.

npx -y @robosystems/mcp

  • Features: Claude Desktop integration, natural language queries, graph traversal, financial analysis
  • Use Cases: AI agents, chatbots, intelligent assistants, automated research
  • Documentation: npm | GitHub
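For Claude Desktop, MCP servers are registered under the `mcpServers` key of its `claude_desktop_config.json` file. The sketch below generates such an entry for the `npx` command above. The server label `robosystems` is arbitrary, and any credentials the server expects would go in an `env` block whose variable names are not given here, so none are shown.

```python
import json


def build_config() -> dict:
    """Build a Claude Desktop config entry that launches the MCP client."""
    return {
        "mcpServers": {
            # The label is arbitrary; it is what appears in the client UI.
            "robosystems": {
                "command": "npx",
                "args": ["-y", "@robosystems/mcp"],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_config(), indent=2))
```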

TypeScript/JavaScript Client

Full-featured SDK for web and Node.js applications with TypeScript support.

npm install @robosystems/client

  • Features: Type-safe API calls, automatic retry logic, connection pooling, streaming support
  • Use Cases: Web applications, Node.js backends, React/Vue/Angular frontends
  • Documentation: npm | GitHub

Python Client

Native Python SDK for backend services and data science workflows.

pip install robosystems-client

  • Features: Async/await support, pandas integration, Jupyter compatibility, batch operations
  • Use Cases: Data pipelines, ML workflows, backend services, analytics
  • Documentation: PyPI | GitHub

Documentation

User Guides (Wiki)

Developer Documentation (Codebase)

Core Services:

Graph Database System:

Middleware Components:

Infrastructure:

Development Resources:

  • Examples - Runnable demos and integration examples
  • Tests - Testing strategy and organization
  • Admin Tools - Administrative utilities and CLI

Security & Compliance:

  • SECURITY.md - Security features and compliance configuration

API Reference

Support

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Apache-2.0 © 2026 RFS LLC

About

RoboSystems is a financial intelligence platform that unifies structured data, document search, and AI memory to transform complex financial data into actionable intelligence. Fork-ready with full GitHub Actions CI/CD for deploying CloudFormation infrastructure to your AWS account.
