DSAgent

An AI-powered autonomous agent for data science with persistent Jupyter kernel execution, session management, and conversational interface.

    ____  _____  ___                    __
   / __ \/ ___/ /   | ____ ____  ____  / /_
  / / / /\__ \ / /| |/ __ `/ _ \/ __ \/ __/
 / /_/ /___/ // ___ / /_/ /  __/ / / / /_
/_____//____//_/  |_\__, /\___/_/ /_/\__/
                   /____/

Features

Conversational Interface: Interactive chat with persistent context and sessions
Dynamic Planning: Agent creates and follows plans with step tracking
Persistent Execution: Code runs in a Jupyter kernel with variable persistence across messages
Session Management: Save and resume conversations with full kernel state
Multi-Provider LLM: Supports OpenAI, Anthropic, Google, Ollama via LiteLLM
MCP Tools: Connect to external tools (web search, databases, etc.) via Model Context Protocol
Human-in-the-Loop: Configurable checkpoints for plan and code approval
Notebook Generation: Automatically generates clean, runnable Jupyter notebooks
Agent Skills: Extensible skill system for specialized tasks (EDA, ML, etc.)

Installation

pip install datascience-agent

With optional features:

pip install "datascience-agent[api]"   # FastAPI server support
pip install "datascience-agent[mcp]"   # MCP tools support

For development:

git clone https://github.com/nmlemus/dsagent
cd dsagent
uv sync --all-extras

Docker

Configuration uses the same environment variables as the CLI and server (see Configuration). The container listens on PORT (default 8000).

# Run API server (default: port 8000)
docker run -d -p 8080:8080 \
  -e PORT=8080 \
  -e DSAGENT_DEFAULT_MODEL=gpt-4o \
  -e OPENAI_API_KEY=sk-your-key \
  nmlemus/dsagent:latest

# Run interactive CLI
docker run -it \
  -e OPENAI_API_KEY=sk-your-key \
  -v "$(pwd)/workspace:/workspace" \
  nmlemus/dsagent:latest \
  dsagent chat

# One-shot task
docker run --rm \
  -e OPENAI_API_KEY=sk-your-key \
  -v "$(pwd)/workspace:/workspace" \
  nmlemus/dsagent:latest \
  dsagent run "Analyze data/sales.csv" --data ./data/sales.csv

For Docker deployment details, see docs/DOCKER.md and docs/guide/docker.md.

Quick Start

1. Setup (First Time)

Run the setup wizard to configure your LLM provider:

dsagent init

This will:

Ask for your LLM provider (OpenAI, Anthropic, Google, local, etc.)
Store your API key securely in ~/.dsagent/.env
Automatically select a default model based on provider:
- OpenAI → gpt-4o
- Anthropic → claude-sonnet-4-5
- Google → gemini/gemini-2.5-flash
- Local → ollama/llama3
Optionally configure MCP tools (web search, etc.)

To use a different model, set DSAGENT_DEFAULT_MODEL or LLM_MODEL in ~/.dsagent/.env, or use the --model flag:

dsagent --model gpt-4o-mini

2. Start Chatting

dsagent

This starts an interactive session where you can:

Chat naturally with the agent
Execute Python code with persistent variables
Analyze data files
Generate visualizations
Resume previous sessions

3. One-Shot Tasks

For batch processing or scripts:

dsagent run "Analyze sales trends" --data ./sales.csv

CLI Commands

Command	Description
`dsagent`	Start interactive chat (default)
`dsagent chat`	Same as above, with explicit options
`dsagent run "task"`	Execute a one-shot task
`dsagent serve`	Run REST + WebSocket API server
`dsagent init`	Setup wizard for configuration
`dsagent skills list`	List installed skills
`dsagent skills install <source>`	Install a skill from GitHub or path
`dsagent skills remove <name>`	Remove a skill
`dsagent skills info <name>`	Show skill details
`dsagent mcp list`	List configured MCP servers
`dsagent mcp add <template>`	Add an MCP server from template
`dsagent mcp remove <name>`	Remove an MCP server

Examples

# Interactive chat with specific model
dsagent --model claude-sonnet-4-5

# One-shot analysis
dsagent run "Find patterns in this data" --data ./dataset.csv

# Resume a previous session
dsagent --session abc123

# With MCP tools (web search)
dsagent --mcp-config ~/.dsagent/mcp.yaml

# Human-in-the-loop mode
dsagent --hitl plan

For complete CLI documentation, see docs/CLI.md.

Python API

DSAgent provides two agents for different use cases:

ConversationalAgent (Interactive)

For building chat interfaces and interactive applications:

from dsagent import ConversationalAgent, ConversationalAgentConfig

config = ConversationalAgentConfig(model="gpt-4o")
agent = ConversationalAgent(config)
agent.start()

# Chat with persistent context
response = agent.chat("Load the iris dataset")
print(response.content)

response = agent.chat("Train a classifier on it")
print(response.content)  # Has access to previous variables

agent.shutdown()

PlannerAgent (Batch)

For one-shot tasks and automated pipelines:

from dsagent import PlannerAgent

with PlannerAgent(model="gpt-4o", data="./data.csv") as agent:
    result = agent.run("Analyze this dataset and create visualizations")
    print(result.answer)
    print(f"Notebook: {result.notebook_path}")

For complete API documentation, see docs/PYTHON_API.md.

Supported Models

DSAgent uses LiteLLM to support 100+ LLM providers:

Provider	Models	API Key
OpenAI	`gpt-4o`, `o1`, `o3-mini`	`OPENAI_API_KEY`
Anthropic	`claude-sonnet-4-5`, `claude-opus-4`	`ANTHROPIC_API_KEY`
Google	`gemini-2.5-pro`, `gemini-2.5-flash`	`GOOGLE_API_KEY`
DeepSeek	`deepseek/deepseek-r1`	`DEEPSEEK_API_KEY`
Ollama	`ollama/llama3.2`	None (local)

For detailed model setup, see docs/MODELS.md.

MCP Tools

Connect to external tools via the Model Context Protocol:

# Add web search capability
dsagent mcp add brave-search

# Use it in chat
dsagent --mcp-config ~/.dsagent/mcp.yaml

Available templates: brave-search, filesystem, github, memory, fetch, bigquery

For MCP configuration details, see docs/MCP.md.

Session Management

Sessions persist your conversation history and kernel state:

# List sessions
dsagent chat
> /sessions

# Resume a session
dsagent --session <session-id>

# Export session to notebook
> /export myanalysis.ipynb

Output Structure

Each run creates organized output:

workspace/
└── runs/{run_id}/
    ├── data/           # Input data (copied)
    ├── notebooks/      # Generated Jupyter notebooks
    ├── artifacts/      # Charts, models, exports
    └── logs/           # Execution logs

Included Libraries

DSAgent comes with essential data science libraries pre-installed:

Category	Libraries
Core	numpy, pandas, scipy
DataFrames	polars, pyarrow
Visualization	matplotlib, seaborn, plotly
Machine Learning	scikit-learn, xgboost, lightgbm, pycaret
Feature Selection	boruta
Statistics	statsmodels

Documentation

CLI Reference - All commands: chat, run, serve, init, mcp, skills
Configuration - Environment variables and .env
HTTP API - REST and WebSocket API reference
Python API - ConversationalAgent and PlannerAgent
Model Configuration - LLM provider setup
MCP Tools - External tools integration
Agent Skills - Extensible skill system
Docker Guide - Container deployment

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 103 Commits
.cursor/rules		.cursor/rules
.github/workflows		.github/workflows
docs		docs
src/dsagent		src/dsagent
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
docker-compose.yml		docker-compose.yml
mkdocs.yml		mkdocs.yml
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DSAgent

Features

Installation

Docker

Quick Start

1. Setup (First Time)

2. Start Chatting

3. One-Shot Tasks

CLI Commands

Examples

Python API

ConversationalAgent (Interactive)

PlannerAgent (Batch)

Supported Models

MCP Tools

Session Management

Output Structure

Included Libraries

Documentation

License

About

Uh oh!

Releases 15

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DSAgent

Features

Installation

Docker

Quick Start

1. Setup (First Time)

2. Start Chatting

3. One-Shot Tasks

CLI Commands

Examples

Python API

ConversationalAgent (Interactive)

PlannerAgent (Batch)

Supported Models

MCP Tools

Session Management

Output Structure

Included Libraries

Documentation

License

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 15

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages