Skip to content

Make the robots fix the robot instructions#657

Draft
mtauraso wants to merge 7 commits intomainfrom
mtauraso/ultimate-robot-instructions
Draft

Make the robots fix the robot instructions#657
mtauraso wants to merge 7 commits intomainfrom
mtauraso/ultimate-robot-instructions

Conversation

@mtauraso
Copy link
Collaborator

No description provided.

@codecov
Copy link

codecov bot commented Jan 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.50%. Comparing base (2740c3e) to head (fe2bbba).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #657   +/-   ##
=======================================
  Coverage   64.50%   64.50%           
=======================================
  Files          58       58           
  Lines        5666     5666           
=======================================
  Hits         3655     3655           
  Misses       2011     2011           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mtauraso
Copy link
Collaborator Author

@copilot

On this PR I need you to update the robot instructions with feedback and organize them appropriately!

We'd like:

  1. Integration of the feedback on Add design goals and north stars to Copilot instructions #656 and Robot Guidance #635 into this PR, including suggestions about hierarchical organization and unifying common content
  2. A file setup which is likely to work for both Github Copilot and Claude Code users, following best practices
  3. Strong preference toward deduplicating information, and incorporating appropriate content by reference

Before starting, please review and incorporate:
Best practices for claude.md: https://code.claude.com/docs/en/best-practices#write-an-effective-claude-md
Writing a good claude.md: https://www.humanlayer.dev/blog/writing-a-good-claude-md
Github copilot docs on copilot instructions: https://docs.github.com/en/copilot/how-tos/configure-custom-instructions/add-repository-instructions
Writing claude skills: https://code.claude.com/docs/en/skills

Copy link
Contributor

Copilot AI commented Jan 30, 2026

@mtauraso I've opened a new pull request, #658, to work on those changes. Once the pull request is ready, I'll request review from you.

@mtauraso
Copy link
Collaborator Author

@copilot

First review the guidelines for writing claude skills: https://code.claude.com/docs/en/skills

Please implement the following claude skills in a sub-PR, using the robot instructions on this PR to help guide you:

  1. Hyrax Development Workflow

Rationale: Comments on PR #635 suggest "Many of these sections feel like they should be claude skills" and robots get "bogged down with the amount of dev tooling." This skill would provide structured guidance for the dev setup → code → test → commit workflow.
Scope: Setup commands, validation workflow, pre-commit hooks, timeout expectations
Benefit: Reduces confusion about which tools to use when, prevents premature command cancellation

  1. Hyrax Testing Strategy

Rationale: Complex testing setup with slow/fast markers, parallel execution, and specific timeout requirements. This is a discrete, repeatable task that causes confusion.
Scope: When to use @pytest.mark.slow, parallel testing, fixture usage, network retry strategies
Benefit: Clearer testing approach, reduces test failures from premature cancellation

  1. Adding Hyrax Components

Rationale: Well-defined patterns for adding models, datasets, and verbs with specific decorator/registration requirements. This is a common, structured task.
Scope: Model addition (forward/train_step/prepare_inputs), dataset registration, verb creation with https://github.com/decorators
Benefit: Ensures correct implementation patterns, reduces registration issues

  1. Hyrax Configuration System

Rationale: Feedback emphasizes config complexity - all keys need defaults, key=false means None, ConfigDict vs dict, Pydantic validation. This is a common source of errors.
Scope: ConfigDict usage, default requirements, key=false convention, TOML structure, immutability rules
Benefit: Prevents "config key not found" errors, ensures proper config usage

Copy link
Contributor

Copilot AI commented Jan 30, 2026

@mtauraso I've opened a new pull request, #659, to work on those changes. Once the pull request is ready, I'll request review from you.

@github-actions
Copy link

Before [2740c3e] After [9aa732d] Ratio Benchmark (Parameter)
failed failed n/a data_cache_benchmarks.DataCacheBenchmarks.time_preload_cache_cifar
failed failed n/a data_cache_benchmarks.DataCacheBenchmarks.track_cache_cifar_hyrax_size_undercount
9.73±0.4ms 10.0±0.3ms 1.03 vector_db_benchmarks.VectorDBSearchBenchmarks.time_search_by_vector_many_shards(128, 'chromadb')
436±2ms 448±3ms 1.03 vector_db_benchmarks.VectorDBSearchBenchmarks.time_search_by_vector_many_shards(64, 'qdrant')
1.81±0.02s 1.83±0.01s 1.02 benchmarks.time_umap_help
37.2±0.3s 37.8±0.2s 1.02 vector_db_benchmarks.VectorDBInsertBenchmarks.time_load_vector_db(16384, 'qdrant')
1.81±0.01s 1.82±0.01s 1.01 benchmarks.time_database_connection_help
1.80±0.03s 1.82±0.01s 1.01 benchmarks.time_train_help
1.81±0.01s 1.84±0.02s 1.01 benchmarks.time_visualize_help
1.66G 1.67G 1.01 vector_db_benchmarks.VectorDBInsertBenchmarks.peakmem_load_vector_db(16384, 'chromadb')

Click here to view all benchmarks.

@mtauraso mtauraso marked this pull request as ready for review January 31, 2026 00:18
Copilot AI review requested due to automatic review settings January 31, 2026 00:18
## Quick Reference

**Project essentials:**
- Python 3.9+ with PyTorch, TOML config, CLI-first (`hyrax` command with verbs)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.11 +

5. **Vector Search**: Building vector databases with inference results for rapid similarity search and outlier detection

### Technology Stack
- **Language**: Python >= 3.9 (target 3.9 for compatibility)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.11 +

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds consolidated developer/AI-assistant documentation for Hyrax and updates agent-specific instruction files to reference a single canonical guide.

Changes:

  • Added a new repository-wide development/architecture guide (HYRAX_GUIDE.md).
  • Added Claude-specific guidance file (CLAUDE.md) that links to the canonical guide.
  • Reworked .github/copilot-instructions.md to be Copilot-specific and defer details to HYRAX_GUIDE.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 11 comments.

File Description
HYRAX_GUIDE.md Introduces a canonical development + architecture reference intended to be shared across assistant instruction sets.
CLAUDE.md Adds Claude Code guidance and links back to HYRAX_GUIDE.md for most project details.
.github/copilot-instructions.md Replaces the long embedded instructions with a Copilot-focused version that points to HYRAX_GUIDE.md.

Comment on lines +51 to 53
# Environment setup
conda create -n hyrax python=3.10 && conda activate hyrax
git clone https://github.com/lincc-frameworks/hyrax.git && cd hyrax
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The environment setup snippet uses Python 3.10 (conda create -n hyrax python=3.10), but the project requires Python >=3.11 (pyproject.toml). Please bump the suggested Python version to a supported one (and keep it consistent with CI’s 3.11–3.13 matrix).

Copilot uses AI. Check for mistakes.
5. **Vector Search**: Building vector databases with inference results for rapid similarity search and outlier detection

### Technology Stack
- **Language**: Python >= 3.9 (target 3.9 for compatibility)
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guide states "Python >= 3.9 (target 3.9)", but the repository currently requires Python >=3.11 (see pyproject.toml requires-python = ">=3.11") and CI tests on 3.11–3.13. Please update the stated supported/target Python versions to match the project configuration.

Suggested change
- **Language**: Python >= 3.9 (target 3.9 for compatibility)
- **Language**: Python >= 3.11 (tested on 3.11–3.13; target 3.11 for compatibility)

Copilot uses AI. Check for mistakes.
Comment on lines +60 to +62
# Create and activate virtual environment
conda create -n hyrax python=3.10
conda activate hyrax
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setup snippet creates a conda env with Python 3.10, but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Update the suggested environment Python version so developers don't end up with an unsupported interpreter.

Copilot uses AI. Check for mistakes.
Comment on lines +162 to +164
- **TOML-based hierarchical configuration** with strong validation via Pydantic schemas
- **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config
- **ConfigDict** enforces all keys must have defaults (prevents silent config bugs)
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section refers to a project-specific "ConfigDict" that enforces defaults/immutability, but there is no such Hyrax config type in src/hyrax/config_utils.py; runtime config handling is dict-based, and _validate_runtime_config logs warnings (it doesn't enforce). Please adjust this description to match the actual config behavior (and, if desired, point to the Pydantic ConfigDict usage in src/hyrax/config_schemas/).

Suggested change
- **TOML-based hierarchical configuration** with strong validation via Pydantic schemas
- **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config
- **ConfigDict** enforces all keys must have defaults (prevents silent config bugs)
- **TOML-based hierarchical configuration** with validation via Pydantic schemas (see `src/hyrax/config_schemas/`)
- **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config
- **Runtime configs are plain Python dicts**; `_validate_runtime_config` logs warnings for unexpected/missing keys but does not enforce strict immutability or required-defaults

Copilot uses AI. Check for mistakes.

### Important Architectural Conventions

1. **Immutable Config**: ConfigDict prevents runtime mutations; all keys must have defaults
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "Immutable Config" convention claims "ConfigDict prevents runtime mutations", but Hyrax’s runtime config is a mutable dict and ConfigManager._resolve_config_paths() explicitly mutates it. Please update this convention to reflect the current behavior (e.g., warn against relying on mutations, or document what is and isn’t mutated).

Suggested change
1. **Immutable Config**: ConfigDict prevents runtime mutations; all keys must have defaults
1. **Config Behavior**: ConfigDict validates and provides defaults, but the runtime config is a mutable dict. Treat configs as read-mostly in verbs and library code; only documented helpers (e.g., path resolution in the ConfigManager) should mutate them. All keys must still have defaults.

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +120
- Use `ConfigDict` instead of regular dict to catch missing defaults at runtime
- All config keys MUST have defaults in `hyrax_default_config.toml`
- Config is immutable after creation - no runtime mutations allowed
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section mentions using a Hyrax-specific ConfigDict and that the config is immutable, but Hyrax’s runtime config handling in src/hyrax/config_utils.py is dict-based and includes explicit mutations (e.g., path resolution). Please update these bullets to describe the actual config validation/mutation behavior, or clarify that you mean Pydantic’s ConfigDict in config_schemas (which is different).

Suggested change
- Use `ConfigDict` instead of regular dict to catch missing defaults at runtime
- All config keys MUST have defaults in `hyrax_default_config.toml`
- Config is immutable after creation - no runtime mutations allowed
- Pydantic `ConfigDict` in `config_schemas` is used for schema definition and validation
- Runtime config objects are regular dicts loaded from TOML and **are** mutated by `config_utils` (e.g., path resolution, applying defaults)
- All user-facing config keys should have sensible defaults in `hyrax_default_config.toml`; avoid ad-hoc mutations outside the config utilities

Copilot uses AI. Check for mistakes.
Comment on lines +102 to +107
### Configuration System Pitfalls

- **Use ConfigDict, not dict**: ConfigDict catches missing defaults at runtime
- **All keys need defaults**: Add to `src/hyrax/hyrax_default_config.toml`
- **Config is immutable**: No runtime mutations allowed after creation
- **Pydantic validation**: Use schemas in `src/hyrax/config_schemas/` for validation
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document repeatedly refers to a Hyrax ConfigDict that enforces defaults and immutability, but Hyrax runtime config is dict-based (src/hyrax/config_utils.py) and _resolve_config_paths() mutates the config; missing defaults are warned on (not enforced). Please revise these bullets/sections to match the current implementation (and avoid confusing Pydantic’s ConfigDict with a project config type).

Copilot uses AI. Check for mistakes.
- Subclasses auto-register via `__init_subclass__`
- Must provide metadata interface (fields, catalog data)
- `HyraxImageDataset` mixin provides transform stacking via `_update_transform()`
- Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataSet, InferenceDataSet
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The built-in dataset list includes "HyraxCifarDataSet", but the implementation is HyraxCifarDataset (see src/hyrax/data_sets/hyrax_cifar_dataset.py). Please fix the class name here to avoid sending readers to a non-existent symbol.

Suggested change
- Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataSet, InferenceDataSet
- Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataset, InferenceDataSet

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +14
Hyrax is a low-code Python tool for machine learning in astronomy. Key facts:
- **Tech Stack**: Python 3.9+, PyTorch, TOML configuration, CLI-first design
- **Main Workflows**: Data download → Training → Inference → Visualization → Vector search
- **Entry Point**: `hyrax` CLI with verb-based commands (train, infer, umap, visualize, etc.)
- **Configuration**: TOML files with Pydantic validation, hierarchical merging
- **Testing**: pytest with `@pytest.mark.slow` for long tests, parallel execution with `-n auto`
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick reference says "Python 3.9+", but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Please update the stated supported Python versions to match the project configuration.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +14
**Project essentials:**
- Python 3.9+ with PyTorch, TOML config, CLI-first (`hyrax` command with verbs)
- Workflows: Data download → Training → Inference → Visualization → Vector search
- Plugin architecture: Models, datasets, and verbs auto-register via decorators
- Configuration: TOML files with Pydantic validation, hierarchical merging

Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick reference says "Python 3.9+", but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Please update the stated Python version support here to avoid misleading Copilot suggestions.

Copilot uses AI. Check for mistakes.
@aritraghsh09
Copy link
Collaborator

@mtauraso -- I received the review-request email; but I wasn't sure whether this is ready or whether you were still looking to make changes. But happy to take a look whenever you think you are done. Some of the co-pilot suggestions (in it's own review of the PR) look reasonable to me.

mtauraso added a commit that referenced this pull request Feb 5, 2026
Create HYRAX_GUIDE.md as the canonical shared reference, CLAUDE.md for
Claude Code, and rewrite .github/copilot-instructions.md for Copilot.
This deduplicates content and fixes inaccuracies identified in PRs #635,
#656, and #657: Python version (>=3.11), ConfigDict (Pydantic's, not
custom), verbs (internal only), primary interface (notebooks), config
philosophy (three-tier "Configuration OR Code"), manifest files
(compromise, not design goal), changelogs (none), Pydantic scope
(data_request only), and HyraxCifarDataset spelling.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@mtauraso mtauraso marked this pull request as draft February 5, 2026 21:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments