AgenticSciML

Multi-agent evolutionary framework for automated scientific machine learning discovery.

Coordinates a swarm of specialized LLM agents that propose, debate, implement, and evaluate scientific computing experiments — evolving solutions through structured debate and evolutionary search.

Based on AgenticSciML (Jiang & Karniadakis, 2025) with swarm cost optimization from Flexible Swarm Learning (Samadi & Schuppert, 2025).

How It Works

Instead of a human manually tuning parameters and architectures, AgenticSciML runs an evolutionary loop where each generation passes through a pipeline of 8 specialized agents:

graph TD
    O[ORCHESTRATOR<br/>Evolutionary Loop Control]
    O --> T[SOLUTION TREE<br/>Exploit / Explore]
    O --> C[COST TRACKER<br/>Swarm Budget]
    O --> K[KNOWLEDGE BASE<br/>YAML Techniques]

Per-Mutation Agent Pipeline

Step	Agent	Task	Model
1	DataAnalyst	Analyze result history, find patterns	Haiku
2	Retriever	Select technique from knowledge base	Haiku
3	Proposer / Critic	N-round structured debate (configurable, default 4)	Sonnet + Haiku
4	Engineer	Write complete experiment.py	Sonnet
5	Sandbox	Execute locally or via Slurm	local / GPU / HPC
6	Debugger	Fix crashes from stderr (up to 3 retries)	Haiku
7	Tree.add()	Record score, persist to tree.json	—

The solution tree branches over generations, balancing exploitation of the best-scoring experiments with exploration of untested parameter regions.

Swarm Cost Optimization

Following the swarm learning insight that ensembles of smaller specialized agents can outperform monolithic large models:

Model	Usage	Role
Haiku	~80% of calls	Analysis, retrieval, critique, debugging, voting
Sonnet	~20% of calls	Creative proposal generation, code writing
Opus	Escalation only	Reserved for complex failures

Typical cost per evolutionary generation: $0.05 – $0.50

Quick Start

# Install
git clone https://github.com/m9h/agentsciml.git
cd agentsciml
uv sync --all-extras

# Set your API key
export ANTHROPIC_API_KEY="sk-ant-..."

# Run evolutionary search
agentsciml run --project ~/dev/quantum-cognition --budget 5.0 --generations 20

# Check solution tree status
agentsciml status --project ~/dev/quantum-cognition

Architecture

src/agentsciml/
├── orchestrator.py      Evolutionary loop (init → root → tree expansion)
├── agents.py            8 agent roles with model-tier routing
├── swarm.py             Multi-project parallel orchestration with GitHub sync
├── tree.py              Solution tree with exploitation/exploration selection
├── knowledge.py         YAML-based technique knowledge base
├── sandbox.py           Local subprocess or remote Slurm execution
├── cost.py              Token tracking and budget enforcement
├── protocols.py         Pydantic models for typed inter-agent documents
├── cli.py               Click CLI (run, status)
└── adapters/
    ├── base.py          Abstract ProjectAdapter interface
    ├── qcccm.py         Quantum cognition (VQE, QAOA, spin glasses)
    ├── dmipy.py         Diffusion MRI microstructure
    ├── parameter_golf.py  GPT training (OpenAI parameter-golf)
    └── meta.py          Meta-architecture optimization (orchestrator tuning)

Agent Roles

Agent	Model	Purpose
DataAnalyst	Haiku	Summarize results history, identify patterns and unexplored regions
Retriever	Haiku	Select 0–1 techniques from curated knowledge base
Proposer	Sonnet	Creative reasoning via structured debate
Critic	Haiku	Challenge proposals, find flaws, assess feasibility
Engineer	Sonnet	Write valid, complete experiment.py code
Debugger	Haiku	Fix crashes using stderr, up to 3 retries
ResultAnalyst	Haiku	Evaluate and compare experiment results
SelectorEnsemble	3x Haiku	Diverse voting for next-generation parent selection

Key Design Decisions

Document-passing, not chat history — each agent call gets freshly assembled context documents, no unbounded memory growth
Append-only tree — all experiments persisted in tree.json, never deleted or modified
RESULT| contract — experiments emit structured RESULT|key=val|... lines for mechanical score parsing
No framework — the orchestrator is a plain Python loop; no LangChain, no LlamaIndex
Dynamic adapter loading — Orchestrator.load_adapter(path) discovers and instantiates any ProjectAdapter subclass from a file, enabling external projects to plug in without modifying the core package

Swarm Orchestration

The swarm manager (swarm.py) runs multiple scientific projects in parallel, each with its own adapter, knowledge base, and Slurm configuration:

# swarm.yaml
projects:
  - name: "qcccm"
    path: "/home/user/dev/quantum-cognition"
    repo_url: "https://github.com/m9h/quantum-cognition.git"
    slurm:
      partition: "gpu"
      gres: "gpu:1"

meta:
  debate_rounds: 6
  max_concurrent_slurm_jobs: 10

# Launch all projects
python -m agentsciml.swarm --config swarm.yaml

Each project is automatically synced from GitHub, its adapter loaded dynamically, and experiments dispatched to local subprocess or remote Slurm (DGX Spark via SSH).

Meta-Architecture Optimization

The MetaSciMLAdapter treats the orchestrator's own settings (debate rounds, budget, model tier assignments) as the search space, running inner loops on a target project and optimizing for efficiency = best_score / cost. This enables automated tuning of the framework itself.

Application Areas

AgenticSciML targets JAX-based scientific computing projects with a loss → gradient → optimize loop. Each project needs a thin adapter (~50 lines) mapping its experiment interface to the framework.

Quantum Cognition — qcccm

github.com/m9h/quantum-cognition

Quantum cognition library exploiting the Hamiltonian isomorphism between disordered magnets and multi-agent social systems. JAX + PennyLane.

Targets: VQE ansatz discovery, QAOA depth-vs-performance tradeoffs, solver meta-selection (PIMC → VQE → QAOA), Trotter number optimization, transverse field annealing schedules.

Metric: quantum_advantage = (E_classical - E_quantum) / |E_exact| (maximize)

OpenAI Parameter Golf — parameter-golf

github.com/openai/parameter-golf

Train the best possible LLM within hard constraints: 16 MB compressed artifact, 10-minute training on 8xH100, no external data.

Targets: Architecture (vocab, depth, width, GQA, layer sharing), quantization (INT5/INT6/INT8, QAT), tokenizer (SentencePiece, BigramHash), optimizer (Muon/Adam), training schedule (SWA, cosine, warmup), compression (sparsity, low-rank).

Metric: bits_per_byte on FineWeb validation (minimize)

Diffusion MRI Microstructure — dmipy

github.com/AthenaEPI/dmipy

Open-source toolbox for brain tissue microstructure estimation from diffusion MRI. Multi-compartment modeling with modular architecture.

Targets: Compartment model selection, neural posterior estimation architecture search (MLP/E3/Flow), orientation distribution optimization, acquisition protocol design.

Metric: fiber_orientation_error in degrees (minimize)

Differentiable Control — jaxctrl (planned)

github.com/m9h/jaxctrl

Differentiable control theory in JAX: Lyapunov/Riccati solvers, LQR, SINDy/DMD/Koopman.

Targets: SINDy hyperparameter tuning, operator basis discovery for Koopman learning, multi-system joint identification.

Active Inference — alf (planned)

github.com/m9h/alf

Standalone JAX-native active inference library with differentiable HMM learning and expected free energy.

Targets: Generative model structure search, EFE horizon optimization, precision scheduling.

Evolutionary Robotics — evo-embodied (planned)

GPU-accelerated evolutionary robotics via MuJoCo-MJX + JAX. 100–1000x speedup over PyBullet.

Targets: Fitness function design, morphology parameterization, neural controller architecture search.

Writing a Project Adapter

from agentsciml.adapters.base import ProjectAdapter

class MyProjectAdapter(ProjectAdapter):
    def get_context(self) -> str:
        """Project description and research goals."""

    def get_results_history(self) -> str:
        """Accumulated experimental results (TSV/CSV)."""

    def get_current_experiment(self) -> str:
        """Current experiment.py code."""

    def get_available_api(self) -> str:
        """API surface the Engineer agent must use."""

    def get_metric_name(self) -> str:
        """Primary metric name (e.g. 'quantum_advantage')."""

    def get_result_metric_key(self) -> str:
        """Key in RESULT| lines for the primary metric."""

    def parse_score(self, result_lines: list[str]) -> float:
        """Extract primary metric from experiment output."""

    # Optional overrides:
    def get_score_direction(self) -> str:
        """'maximize' (default) or 'minimize'."""

    def get_constraints(self) -> str:
        """Domain-specific hard constraints for the Critic agent."""

Deployment

Target	Command	Notes
Local	`agentsciml run -p ~/project`	CPU or local GPU
Swarm	`python -m agentsciml.swarm --config swarm.yaml`	Parallel multi-project
RunPod	`make gpu-up && make gpu-run`	A100 cloud GPU
DGX Spark	`sbatch scripts/slurm_run.sh`	HPC cluster via Slurm
Docker	`docker build -t agentsciml .`	Containerized

Development

make test          # Run test suite
make lint          # Ruff linter
make fmt           # Ruff formatter
make cov           # Coverage report (>60% threshold)

References

Core

Jiang, Q. & Karniadakis, G. E. (2025). AgenticSciML: Collaborative Multi-Agent Systems for Emergent Discovery in Scientific Machine Learning. arXiv:2511.07262
Samadi, M. E. & Schuppert, A. (2025). Flexible Swarm Learning May Outpace Foundation Models in Essential Tasks. arXiv:2510.06349

Automated Scientific Discovery

Lu, C. et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292
Yamada, Y. et al. (2025). The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search. arXiv:2504.08066
Boiko, D. A. et al. (2023). Autonomous Chemical Research with Large Language Models. arXiv:2304.05332

Evolutionary LLM Code Search

Romera-Paredes, B. et al. (2023). Mathematical Discoveries from Program Search with Large Language Models (FunSearch). Nature 625, 468–475
Lehman, J. et al. (2022). Evolution through Large Models. arXiv:2206.08896
Chen, A. et al. (2023). EvoPrompting: Language Models for Code-Level Neural Architecture Search. arXiv:2302.14838

ML Research Agents

Weco AI (2025). AIDE: AI-Driven Exploration in the Space of Code. arXiv:2502.13138
Shinn, N. et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv:2303.11366

Multi-Agent Reasoning

Du, Y. et al. (2023). Improving Factuality and Reasoning in Language Models through Multiagent Debate. arXiv:2305.14325
Zhou, A. et al. (2023). Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models. arXiv:2310.04406

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
.brev		.brev
.github/workflows		.github/workflows
autoresearch		autoresearch
docs		docs
knowledge		knowledge
scripts		scripts
src/agentsciml		src/agentsciml
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
Dockerfile		Dockerfile
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
swarm.yaml		swarm.yaml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AgenticSciML

How It Works

Per-Mutation Agent Pipeline

Swarm Cost Optimization

Quick Start

Architecture

Agent Roles

Key Design Decisions

Swarm Orchestration

Meta-Architecture Optimization

Application Areas

Writing a Project Adapter

Deployment

Development

References

Core

Automated Scientific Discovery

Evolutionary LLM Code Search

ML Research Agents

Multi-Agent Reasoning

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AgenticSciML

How It Works

Per-Mutation Agent Pipeline

Swarm Cost Optimization

Quick Start

Architecture

Agent Roles

Key Design Decisions

Swarm Orchestration

Meta-Architecture Optimization

Application Areas

Writing a Project Adapter

Deployment

Development

References

Core

Automated Scientific Discovery

Evolutionary LLM Code Search

ML Research Agents

Multi-Agent Reasoning

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages