Conversation
Co-authored-by: gitosaurus <6794831+gitosaurus@users.noreply.github.com>
…mate-robot-instructions
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #657 +/- ##
=======================================
Coverage 64.50% 64.50%
=======================================
Files 58 58
Lines 5666 5666
=======================================
Hits 3655 3655
Misses 2011 2011 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
On this PR I need you to update the robot instructions with feedback and organize them appropriately! We'd like:
Before starting, please review and incorporate: |
|
First review the guidelines for writing claude skills: https://code.claude.com/docs/en/skills Please implement the following claude skills in a sub-PR, using the robot instructions on this PR to help guide you:
Rationale: Comments on PR #635 suggest "Many of these sections feel like they should be claude skills" and robots get "bogged down with the amount of dev tooling." This skill would provide structured guidance for the dev setup → code → test → commit workflow.
Rationale: Complex testing setup with slow/fast markers, parallel execution, and specific timeout requirements. This is a discrete, repeatable task that causes confusion.
Rationale: Well-defined patterns for adding models, datasets, and verbs with specific decorator/registration requirements. This is a common, structured task.
Rationale: Feedback emphasizes config complexity - all keys need defaults, key=false means None, ConfigDict vs dict, Pydantic validation. This is a common source of errors. |
Click here to view all benchmarks. |
| ## Quick Reference | ||
|
|
||
| **Project essentials:** | ||
| - Python 3.9+ with PyTorch, TOML config, CLI-first (`hyrax` command with verbs) |
| 5. **Vector Search**: Building vector databases with inference results for rapid similarity search and outlier detection | ||
|
|
||
| ### Technology Stack | ||
| - **Language**: Python >= 3.9 (target 3.9 for compatibility) |
There was a problem hiding this comment.
Pull request overview
Adds consolidated developer/AI-assistant documentation for Hyrax and updates agent-specific instruction files to reference a single canonical guide.
Changes:
- Added a new repository-wide development/architecture guide (
HYRAX_GUIDE.md). - Added Claude-specific guidance file (
CLAUDE.md) that links to the canonical guide. - Reworked
.github/copilot-instructions.mdto be Copilot-specific and defer details toHYRAX_GUIDE.md.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
HYRAX_GUIDE.md |
Introduces a canonical development + architecture reference intended to be shared across assistant instruction sets. |
CLAUDE.md |
Adds Claude Code guidance and links back to HYRAX_GUIDE.md for most project details. |
.github/copilot-instructions.md |
Replaces the long embedded instructions with a Copilot-focused version that points to HYRAX_GUIDE.md. |
| # Environment setup | ||
| conda create -n hyrax python=3.10 && conda activate hyrax | ||
| git clone https://github.com/lincc-frameworks/hyrax.git && cd hyrax |
There was a problem hiding this comment.
The environment setup snippet uses Python 3.10 (conda create -n hyrax python=3.10), but the project requires Python >=3.11 (pyproject.toml). Please bump the suggested Python version to a supported one (and keep it consistent with CI’s 3.11–3.13 matrix).
| 5. **Vector Search**: Building vector databases with inference results for rapid similarity search and outlier detection | ||
|
|
||
| ### Technology Stack | ||
| - **Language**: Python >= 3.9 (target 3.9 for compatibility) |
There was a problem hiding this comment.
The guide states "Python >= 3.9 (target 3.9)", but the repository currently requires Python >=3.11 (see pyproject.toml requires-python = ">=3.11") and CI tests on 3.11–3.13. Please update the stated supported/target Python versions to match the project configuration.
| - **Language**: Python >= 3.9 (target 3.9 for compatibility) | |
| - **Language**: Python >= 3.11 (tested on 3.11–3.13; target 3.11 for compatibility) |
| # Create and activate virtual environment | ||
| conda create -n hyrax python=3.10 | ||
| conda activate hyrax |
There was a problem hiding this comment.
The setup snippet creates a conda env with Python 3.10, but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Update the suggested environment Python version so developers don't end up with an unsupported interpreter.
| - **TOML-based hierarchical configuration** with strong validation via Pydantic schemas | ||
| - **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config | ||
| - **ConfigDict** enforces all keys must have defaults (prevents silent config bugs) |
There was a problem hiding this comment.
This section refers to a project-specific "ConfigDict" that enforces defaults/immutability, but there is no such Hyrax config type in src/hyrax/config_utils.py; runtime config handling is dict-based, and _validate_runtime_config logs warnings (it doesn't enforce). Please adjust this description to match the actual config behavior (and, if desired, point to the Pydantic ConfigDict usage in src/hyrax/config_schemas/).
| - **TOML-based hierarchical configuration** with strong validation via Pydantic schemas | |
| - **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config | |
| - **ConfigDict** enforces all keys must have defaults (prevents silent config bugs) | |
| - **TOML-based hierarchical configuration** with validation via Pydantic schemas (see `src/hyrax/config_schemas/`) | |
| - **ConfigManager** merges: `hyrax_default_config.toml` + external library configs + user runtime config | |
| - **Runtime configs are plain Python dicts**; `_validate_runtime_config` logs warnings for unexpected/missing keys but does not enforce strict immutability or required-defaults |
|
|
||
| ### Important Architectural Conventions | ||
|
|
||
| 1. **Immutable Config**: ConfigDict prevents runtime mutations; all keys must have defaults |
There was a problem hiding this comment.
The "Immutable Config" convention claims "ConfigDict prevents runtime mutations", but Hyrax’s runtime config is a mutable dict and ConfigManager._resolve_config_paths() explicitly mutates it. Please update this convention to reflect the current behavior (e.g., warn against relying on mutations, or document what is and isn’t mutated).
| 1. **Immutable Config**: ConfigDict prevents runtime mutations; all keys must have defaults | |
| 1. **Config Behavior**: ConfigDict validates and provides defaults, but the runtime config is a mutable dict. Treat configs as read-mostly in verbs and library code; only documented helpers (e.g., path resolution in the ConfigManager) should mutate them. All keys must still have defaults. |
| - Use `ConfigDict` instead of regular dict to catch missing defaults at runtime | ||
| - All config keys MUST have defaults in `hyrax_default_config.toml` | ||
| - Config is immutable after creation - no runtime mutations allowed |
There was a problem hiding this comment.
This section mentions using a Hyrax-specific ConfigDict and that the config is immutable, but Hyrax’s runtime config handling in src/hyrax/config_utils.py is dict-based and includes explicit mutations (e.g., path resolution). Please update these bullets to describe the actual config validation/mutation behavior, or clarify that you mean Pydantic’s ConfigDict in config_schemas (which is different).
| - Use `ConfigDict` instead of regular dict to catch missing defaults at runtime | |
| - All config keys MUST have defaults in `hyrax_default_config.toml` | |
| - Config is immutable after creation - no runtime mutations allowed | |
| - Pydantic `ConfigDict` in `config_schemas` is used for schema definition and validation | |
| - Runtime config objects are regular dicts loaded from TOML and **are** mutated by `config_utils` (e.g., path resolution, applying defaults) | |
| - All user-facing config keys should have sensible defaults in `hyrax_default_config.toml`; avoid ad-hoc mutations outside the config utilities |
| ### Configuration System Pitfalls | ||
|
|
||
| - **Use ConfigDict, not dict**: ConfigDict catches missing defaults at runtime | ||
| - **All keys need defaults**: Add to `src/hyrax/hyrax_default_config.toml` | ||
| - **Config is immutable**: No runtime mutations allowed after creation | ||
| - **Pydantic validation**: Use schemas in `src/hyrax/config_schemas/` for validation |
There was a problem hiding this comment.
This document repeatedly refers to a Hyrax ConfigDict that enforces defaults and immutability, but Hyrax runtime config is dict-based (src/hyrax/config_utils.py) and _resolve_config_paths() mutates the config; missing defaults are warned on (not enforced). Please revise these bullets/sections to match the current implementation (and avoid confusing Pydantic’s ConfigDict with a project config type).
| - Subclasses auto-register via `__init_subclass__` | ||
| - Must provide metadata interface (fields, catalog data) | ||
| - `HyraxImageDataset` mixin provides transform stacking via `_update_transform()` | ||
| - Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataSet, InferenceDataSet |
There was a problem hiding this comment.
The built-in dataset list includes "HyraxCifarDataSet", but the implementation is HyraxCifarDataset (see src/hyrax/data_sets/hyrax_cifar_dataset.py). Please fix the class name here to avoid sending readers to a non-existent symbol.
| - Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataSet, InferenceDataSet | |
| - Built-in: HSCDataSet, LSSTDataset, FitsImageDataSet, HyraxCifarDataset, InferenceDataSet |
| Hyrax is a low-code Python tool for machine learning in astronomy. Key facts: | ||
| - **Tech Stack**: Python 3.9+, PyTorch, TOML configuration, CLI-first design | ||
| - **Main Workflows**: Data download → Training → Inference → Visualization → Vector search | ||
| - **Entry Point**: `hyrax` CLI with verb-based commands (train, infer, umap, visualize, etc.) | ||
| - **Configuration**: TOML files with Pydantic validation, hierarchical merging | ||
| - **Testing**: pytest with `@pytest.mark.slow` for long tests, parallel execution with `-n auto` |
There was a problem hiding this comment.
Quick reference says "Python 3.9+", but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Please update the stated supported Python versions to match the project configuration.
| **Project essentials:** | ||
| - Python 3.9+ with PyTorch, TOML config, CLI-first (`hyrax` command with verbs) | ||
| - Workflows: Data download → Training → Inference → Visualization → Vector search | ||
| - Plugin architecture: Models, datasets, and verbs auto-register via decorators | ||
| - Configuration: TOML files with Pydantic validation, hierarchical merging | ||
|
|
There was a problem hiding this comment.
Quick reference says "Python 3.9+", but pyproject.toml requires Python >=3.11 and CI runs 3.11–3.13. Please update the stated Python version support here to avoid misleading Copilot suggestions.
|
@mtauraso -- I received the review-request email; but I wasn't sure whether this is ready or whether you were still looking to make changes. But happy to take a look whenever you think you are done. Some of the co-pilot suggestions (in it's own review of the PR) look reasonable to me. |
Create HYRAX_GUIDE.md as the canonical shared reference, CLAUDE.md for Claude Code, and rewrite .github/copilot-instructions.md for Copilot. This deduplicates content and fixes inaccuracies identified in PRs #635, #656, and #657: Python version (>=3.11), ConfigDict (Pydantic's, not custom), verbs (internal only), primary interface (notebooks), config philosophy (three-tier "Configuration OR Code"), manifest files (compromise, not design goal), changelogs (none), Pydantic scope (data_request only), and HyraxCifarDataset spelling. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
No description provided.