A toolkit for reproducible earth science analysis workflows.
This is an adaptation of GitHub's Spec Kit for earth science research. The original is designed for software development: building apps, features, and products. This version is designed for building reproducible analyses, from research question to publication-ready figures.
Most earth scientists aren't trained as software developers, but we write a lot of code. AI coding assistants are making this easier, and riskier. They're fast, but they make mistakes. Subtle ones. The kind that end up in your results if you're not careful.
"Vibe coding" (just prompting and hoping) is fun for side projects, but scientific analysis needs to be correct, documented, and reproducible. That's where this toolkit comes in.
Early stage: This is actively being developed and hasn't been tested on many real projects yet. Feedback is very welcome - open an issue or reach out if you try it and have thoughts.
Science Spec Kit gives you a structured way to build analysis code, whether you're writing it yourself, working with an AI assistant, or both. Before you write any code, you write down:
- What question you're trying to answer
- What data you're using
- What outputs you expect
- How you'll know the results are correct
Then you build the code step by step, with checkpoints along the way.
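For a sense of scale, those answers can be short. Here is a hypothetical excerpt (the wording and numbers are made up, loosely based on the glacier example used later in this README):

```
Question: Do glaciers in the study region show winter vs summer velocity
  differences, and any acceleration over 2015-2023?
Data: ITS_LIVE velocity mosaics, 2015-2023, local copy under data/raw/.
Outputs: Seasonal mean velocity maps, a winter-summer difference map, and a
  table of per-glacier trends.
Correctness: Velocities fall within physically plausible ranges, and results
  are unchanged when the pipeline is re-run from raw data.
```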
Everything is written in plain English. Your collaborators and reviewers can understand your analysis plan without reading Python. Non-programmers can review your approach, catch logical errors, and understand exactly what the code is supposed to do.
Mistakes happen in science. There's no way around that. But the goal is to:
- Catch mistakes earlier by thinking through the approach before coding
- Make mistakes easier to find by logging every decision and change
- Make reviews more thorough because reviewers can understand intentions, not just code
This isn't just for AI-assisted coding. It's useful for anyone who wants to organize their thoughts before starting. But if you are using an AI assistant and want quality results, you need structure to keep things on track.
Instead of diving straight into code, you work through a sequence:
- Specify the research question and expected outputs upfront
- Plan the data pipeline before writing code
- Implement iteratively with built-in QC checkpoints
- Verify reproducibility before sharing results
The specification becomes the source of truth, not an afterthought.
From the constitution template:
- Reproducibility: Analysis runs from raw data to outputs without manual intervention
- Data Integrity: Raw data is immutable; transformations produce new files
- Provenance: Every output traces back to code, data, and parameter choices
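Provenance is the least obvious of these to put into practice. As a rough sketch (not part of the toolkit; the function, paths, and parameters below are hypothetical), a script could write a small JSON sidecar next to each output, recording the inputs, parameters, and code version that produced it:

```python
# Hypothetical example: record provenance alongside an output so a figure or
# table can be traced back to the exact code version, inputs, and parameters.
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

def write_provenance(output_path, inputs, params):
    """Save a small JSON sidecar file next to an output."""
    record = {
        "output": str(output_path),
        "inputs": [str(p) for p in inputs],
        "parameters": params,
        "created": datetime.now(timezone.utc).isoformat(),
        # Git commit of the analysis code at run time (assumes a git repo).
        "git_commit": subprocess.run(
            ["git", "rev-parse", "HEAD"], capture_output=True, text=True
        ).stdout.strip(),
    }
    sidecar = Path(str(output_path) + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2))

# Usage sketch (paths and parameters are illustrative):
# write_provenance(
#     "outputs/figures/seasonal_trends.png",
#     inputs=["data/processed/velocities_2015_2023.nc"],
#     params={"winter_months": [12, 1, 2], "summer_months": [6, 7, 8]},
# )
```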
To get started, install the CLI and initialize a project:

```bash
uv tool install science-specify-cli --from git+https://github.com/Waveform-Analytics/science-spec-kit.git
science-specify init my-analysis --ai claude
cd my-analysis
```

Then run `/speckit.constitution` to define your project's data sources, technical environment, coordinate systems, and standards. You can provide context inline (e.g., `/speckit.constitution My project analyzes glacier velocities using ITS_LIVE data...`) or run the command by itself and answer prompts interactively.
Describe your research goal in plain language. Here's an example:
```
/speckit.specify I want to analyze seasonal velocity variations for glaciers in the study region, comparing winter vs summer patterns and identifying any acceleration trends over the 2015-2023 period.
```

Next, describe the technical approach:

```
/speckit.plan Using Python with xarray for NetCDF handling, scipy for statistics. Data is on local NAS at /data/velocities/. Running on laptop, ~20GB total data.
```

Then run `/speckit.tasks` to break the plan into tasks, and `/speckit.implement` to work through them: write scripts, review, run, debug, repeat.
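For a sense of what the implementation stage produces, a script for the example above might look roughly like this. This is a sketch only; the dataset path and variable name `v` are assumptions, not something the toolkit generates verbatim.

```python
# Hypothetical sketch of an analysis script for the example above:
# seasonal (winter vs summer) mean velocities from a NetCDF time series.
import xarray as xr

ds = xr.open_dataset("data/processed/velocities_2015_2023.nc")  # assumed file

# Label each time step with a season based on its month.
month = ds["time"].dt.month
winter = ds["v"].where(month.isin([12, 1, 2]))
summer = ds["v"].where(month.isin([6, 7, 8]))

# Seasonal mean velocity fields over the full 2015-2023 period.
winter_mean = winter.mean(dim="time")
summer_mean = summer.mean(dim="time")

diff = (winter_mean - summer_mean).rename("winter_minus_summer")
diff.to_netcdf("data/processed/winter_minus_summer.nc")
```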
Finally, run `/speckit.checklist` to generate a reproducibility checklist.

The main workflow commands:

| Command | Purpose |
|---|---|
| `/speckit.constitution` | Define project standards, data sources, and principles |
| `/speckit.specify` | Create analysis specification from research goal |
| `/speckit.plan` | Design data pipeline and script structure |
| `/speckit.tasks` | Generate task list organized by pipeline stage |
| `/speckit.implement` | Iteratively write, run, and debug scripts |
Additional commands:

| Command | Purpose |
|---|---|
| `/speckit.clarify` | Resolve ambiguities in the specification |
| `/speckit.analyze` | Check consistency across spec, plan, and tasks |
| `/speckit.checklist` | Generate reproducibility checklist |
After initialization:
```
my-analysis/
├── memory/
│   └── constitution.md       # Project standards and data sources
├── specs/
│   └── 001-analysis-name/
│       ├── spec.md           # Analysis specification
│       ├── plan.md           # Data pipeline plan
│       ├── tasks.md          # Task breakdown
│       └── research.md       # Method decisions
├── scripts/                  # Analysis scripts
├── data/
│   ├── raw/                  # Immutable raw data
│   ├── processed/            # Transformed data
│   └── intermediate/         # Working files
└── outputs/
    ├── figures/
    └── tables/
```
Tasks are organized by stage:
- Setup - Environment, dependencies, directory structure
- Data Acquisition - Download/access raw data
- Preprocessing - Clean, transform, filter
- Analysis - Core calculations
- Visualization - Figures, tables
- Documentation - README, reproducibility verification
Each stage has QC checkpoints before proceeding.
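What a checkpoint looks like depends on the data, but as an illustration (the thresholds, variable names, and file path below are made up, not prescribed by the toolkit), a preprocessing stage might end with a quick validation like this before the analysis stage begins:

```python
# Hypothetical QC checkpoint at the end of the preprocessing stage.
import numpy as np
import xarray as xr

def check_processed_velocities(path):
    ds = xr.open_dataset(path)
    v = ds["v"].values

    # Basic sanity checks; thresholds are illustrative, not prescriptive.
    assert np.isfinite(v).mean() > 0.9, "More than 10% of values are missing"
    assert np.nanmin(v) >= 0, "Negative speeds found"
    assert np.nanmax(v) < 20000, "Implausibly high speeds (m/yr) found"
    print(f"QC passed: {path}")

check_processed_velocities("data/processed/velocities_2015_2023.nc")
```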
/speckit.implement supports an iterative cycle:
- Write script based on task
- Review - user adds inline comments (`[Q: ...]`, `[C: ...]`, `[TODO: ...]`)
- Incorporate feedback
- Run script
- Debug if needed
- Complete task and move to next
This matches how scientific analysis actually works: you learn as you go.
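For example, a draft script under review might carry those markers as ordinary comments (a made-up excerpt continuing the glacier example; the paths and threshold are hypothetical):

```python
# Hypothetical excerpt of a draft script with reviewer comments added inline.
import xarray as xr

# [Q: why 30 m/yr? should this threshold come from the spec?]
SLOW_ICE_THRESHOLD = 30.0

ds = xr.open_dataset("data/processed/velocities_2015_2023.nc")

# [C: consider masking rock outcrops here as well, not just slow ice]
fast_ice = ds["v"].where(ds["v"] > SLOW_ICE_THRESHOLD)

# [TODO: save the mask so the exclusion is documented]
fast_ice.mean(dim="time").to_netcdf("data/intermediate/fast_ice_mean.nc")
```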
Works with any agent supporting slash commands:
- Claude Code
- Cursor
- GitHub Copilot
- Gemini CLI
- And many others
- uv for package management
- Python 3.11+
- Git
- A supported AI coding agent
Science Spec Kit is adapted from GitHub Spec Kit by Den Delimarsky and John Lam.
MIT License - see LICENSE.