diff --git a/CHANGELOG.md b/CHANGELOG.md
index ad006f4..118c80c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -8,6 +8,10 @@ Contributors add user-facing entries under `[Unreleased]` in the same PR. Mainta
 
 ## [Unreleased]
 
+### Documentation
+
+- Clarified that bundle tests must mock network calls and model downloads in CI.
+
 ## [0.3.7] - 2026-06-22
 
 ### Added
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index f569793..9f5b639 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -257,6 +257,7 @@ The primary guide for the host LLM.
 - Ships inside the skill bundle via `pip install skillware`.
 - Run: `pytest skills///test_skill.py`
 - Optional extra depth for maintainers: `tests/skills//test_.py` — see [TESTING.md](docs/TESTING.md).
+- Mock network calls and first-run model downloads in bundle tests.
 
 ### Packaging (PyPI and `pip install`)
 
diff --git a/docs/TESTING.md b/docs/TESTING.md
index 5660a84..d7b08a8 100644
--- a/docs/TESTING.md
+++ b/docs/TESTING.md
@@ -34,6 +34,9 @@ pip install -r requirements.txt
 - Offline and mockable: manifest consistency, validation, deterministic `execute()` paths — no live network.
 - Run locally: `pytest skills///test_skill.py` or `pytest skills/`.
 - Install packages from the skill's `manifest.yaml` `requirements` when they are not already satisfied by `[all]`.
+- Bundle tests run in CI on every pull request and must not make live HTTP requests, use API keys, or download models.
+- Mock HTTP clients, LLM clients, embedding loaders, and model download paths such as HuggingFace, Ollama, `fastembed`, and similar integrations.
+- Real inference belongs in maintainer tests under `tests/skills/` or in local/manual runs, not in bundle CI gates.
 ### Framework test

diff --git a/docs/contributing/ai_native_workflow.md b/docs/contributing/ai_native_workflow.md
index f86c223..b731b0a 100644
--- a/docs/contributing/ai_native_workflow.md
+++ b/docs/contributing/ai_native_workflow.md
@@ -262,6 +262,7 @@ Complete the checklist that matches your issue during Stage 5.
 - [ ] `instructions.md`: when to use, how to interpret output, limitations
 - [ ] `card.json`: `issuer` matches manifest
 - [ ] `test_skill.py` (bundle test) passes — `pytest skills///test_skill.py`
+- [ ] Bundle tests mock all network calls and model downloads; CI does not download models.
 - [ ] `docs/skills/.md` and catalog row in `docs/skills/README.md`
 - [ ] **Usage Examples** on the catalog page (all five providers per [skill usage template](../usage/skill_usage_template.md)); link to `docs/usage/` and list skill `env_vars` without duplicating [api_keys.md](../usage/api_keys.md)
 - [ ] `pytest tests/test_skill_issuer.py` passes
diff --git a/templates/python_skill/README.md b/templates/python_skill/README.md
index b9aea12..3e60bda 100644
--- a/templates/python_skill/README.md
+++ b/templates/python_skill/README.md
@@ -10,7 +10,7 @@ Starter bundle under `skills///`. Copy this template from
 4. **`skill.py`**: Implement deterministic logic; no LLM-generated code in the skill body.
 5. **`instructions.md`**: Tell the agent when and how to use the tool.
 6. **`card.json`**: Mirror `issuer` from the manifest; customize UI fields.
-7. **`test_skill.py`**: Bundle test (required); offline, mock externals; run `pytest skills///test_skill.py`. See [TESTING.md](../../docs/TESTING.md).
+7. **`test_skill.py`**: Bundle test (required); offline, mock external services, including HTTP clients, LLM APIs, embedding/model loaders, and any first-run model downloads; run `pytest skills///test_skill.py`. See [TESTING.md](../../docs/TESTING.md).
 8. **`docs/skills/.md`**: Catalog page with **ID**, **Issuer**, and **Usage Examples** (all providers; see `docs/usage/skill_usage_template.md`).
 9. **`docs/skills/README.md`**: Add a row to the skill library table.
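
The mocking rule this change adds to `docs/TESTING.md` could look roughly like the bundle test below. It is a minimal sketch using only the standard library's `unittest.mock`, not code from the repository: `ExampleSkill`, its `_load_model` method, the model name, and the URL are all hypothetical stand-ins, and a real skill would patch whichever HTTP client or model loader it actually imports.

```python
import json
import urllib.request
from unittest import mock


class ExampleSkill:
    """Hypothetical skill: fetches a JSON document and embeds its title."""

    def _load_model(self, name: str):
        # Stand-in for a first-run model download (assumption for this sketch).
        raise RuntimeError("would download a model; must be mocked in bundle tests")

    def execute(self, params: dict) -> dict:
        # A live HTTP request in production; patched out in the bundle test.
        with urllib.request.urlopen(params["url"]) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
        model = self._load_model("example-model")
        return {"title": payload["title"], "vector": model.embed(payload["title"])}


def test_execute_is_offline():
    # Fake HTTP response: a context manager whose read() returns canned bytes.
    fake_resp = mock.MagicMock()
    fake_resp.__enter__.return_value.read.return_value = b'{"title": "hello"}'

    # Fake model: returns a fixed vector instead of loading real weights.
    fake_model = mock.Mock()
    fake_model.embed.return_value = [0.1, 0.2]

    with mock.patch("urllib.request.urlopen", return_value=fake_resp) as urlopen, \
            mock.patch.object(ExampleSkill, "_load_model", return_value=fake_model) as loader:
        result = ExampleSkill().execute({"url": "https://example.invalid/doc"})

    # The skill saw only the canned data, and both externals were intercepted.
    assert result == {"title": "hello", "vector": [0.1, 0.2]}
    urlopen.assert_called_once_with("https://example.invalid/doc")
    loader.assert_called_once_with("example-model")
```

Patching at the boundary (the HTTP call and the loader) keeps the `execute()` path deterministic, so the test needs no API keys and never touches the network, which is what the CI gate described above requires.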