Add new-format network_expansion templater by nick-gorman · Pull Request #102 · Open-ISP/ISPyPSA

nick-gorman · 2026-05-07T03:53:33Z

Summary

- Adds a new-format network_expansion templater that turns IASR flow-path and REZ
  augmentation tables into two ISPyPSA inputs: network_expansion_options (selected
  least-cost option per expandable element) and network_transmission_path_expansion_costs
  (long-form $/MW cost trajectory).
- Wires the templater into create_ispypsa_inputs_template behind the existing
  use_new_table_format feature flag, with a granularity-aware filter that drops or
  re-keys augmentation entries when paths are aggregated to NEM regions / single region.
- Extends _template_network_transmission to inject zero-capacity parallel paths for
  augmentation corridors (e.g. CNSW-SNW) that exist alongside suffixed siblings
  (CNSW-SNW_NTH/_STH); without this, the orchestrator misclassifies them as
  constraint relaxations.
- Partitions the IASR table cache by workbook version on disk (6.0/, 7.5/) and
  drives required-table discovery from a checked-in known_tables.yaml manifest so the
  augmentation prefixes can be enumerated per version.
- Schema updates: network_expansion_options now keys on (expansion_id, expansion_type)
  with allowed values forward/reverse/constraint_relaxation; cost-per-MW divisor
  documented as max(forward, reverse).
- Supporting changes: env-var override for feature flags so subprocess CLI tests can flip
  them, _financial_year_string_to_end_year_int helper, deduplicated fuzzy-match log
  lines, and CLAUDE.md additions covering I/O-example docstrings, integration-test
  scope, and the "no hidden preconditions" rule.
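
For readers unfamiliar with the cache change, the version partitioning and prefix-driven discovery can be pictured roughly as follows. This is a minimal sketch, not the code in local_cache.py: the KNOWN_TABLES dict stands in for known_tables.yaml, and every table name, prefix, and function name in it is a hypothetical chosen for illustration.

```python
from pathlib import Path

# Stand-in for known_tables.yaml; every table name and prefix below is hypothetical.
KNOWN_TABLES = {
    "6.0": {"tables": ["flow_path_transfer_capability"], "augmentation_prefixes": []},
    "7.5": {
        "tables": ["flow_path_transfer_capability"],
        "augmentation_prefixes": ["flow_path_augmentation_", "rez_augmentation_"],
    },
}


def cache_directory(cache_root, iasr_version):
    """Return the version-partitioned cache directory, e.g. <root>/7.5."""
    if iasr_version not in KNOWN_TABLES:
        raise ValueError(f"No manifest entry for IASR workbook version {iasr_version!r}")
    return Path(cache_root) / iasr_version


def augmentation_tables(available_tables, iasr_version):
    """Enumerate the cached tables whose names start with a manifest prefix."""
    prefixes = tuple(KNOWN_TABLES[iasr_version]["augmentation_prefixes"])
    # str.startswith accepts a tuple; an empty tuple matches nothing.
    return sorted(name for name in available_tables if name.startswith(prefixes))
```

Enumerating by prefix is what lets one manifest entry cover many per-path tables without listing each of them by name.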

Example output

network_expansion_options — physical paths emit forward+reverse rows; constraint
groups (ids not in network_transmission_paths) emit one constraint_relaxation row:

expansion_id  expansion_type          allowed_expansion  expansion_option
CQ-NQ         forward                 1000               Option 1   # asymmetric path: forward MW from selected option
CQ-NQ         reverse                 1200               Option 1   # ...and reverse MW from same option (least $/MW)
DN1-CNSW      forward                 500                Option 2a  # source had reverse_mw = NaN
DN1-CNSW      reverse                 0                  Option 2a  # NaN -> 0 (option provides no expansion this direction)
N1-NNSW       forward                 1660               Option 1   # REZ path: symmetric (forward == reverse)
N1-NNSW       reverse                 1660               Option 1
SWQLD1        constraint_relaxation   330                Option 1   # constraint group: single row, not in network_transmission_paths
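
The row-emission rule shown in the table above can be sketched in plain Python. This is an illustrative sketch only: the function name, argument names, and tuple layout are hypothetical, and taking the larger directional value as the capacity of a constraint group is an assumption, not something the table states.

```python
import math


def expansion_rows(expansion_id, option, forward_mw, reverse_mw, known_path_ids):
    """Return (expansion_id, expansion_type, allowed_expansion, option) tuples."""

    def mw(value):
        # A missing capacity (None or NaN) means no expansion in that direction.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            return 0.0
        return float(value)

    if expansion_id in known_path_ids:
        # Physical path: one row per direction, both from the same selected option.
        return [
            (expansion_id, "forward", mw(forward_mw), option),
            (expansion_id, "reverse", mw(reverse_mw), option),
        ]
    # Constraint group: a single row. Using the larger value is an assumption.
    capacity = max(mw(forward_mw), mw(reverse_mw))
    return [(expansion_id, "constraint_relaxation", capacity, option)]


paths = {"CQ-NQ", "DN1-CNSW", "N1-NNSW"}
print(expansion_rows("DN1-CNSW", "Option 2a", 500, float("nan"), paths))
print(expansion_rows("SWQLD1", "Option 1", 330, None, paths))
```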

network_transmission_path_expansion_costs — long-form, $/MW of the larger directional
capacity:

expansion_id  year  cost
CQ-NQ         2025  416666.67    # 500M / max(1000, 1200) = 500M / 1200 (asymmetric -> divisor is the larger side)
CQ-NQ         2026  425000.00    # next-year cost / same divisor (escalation visible across years)
N1-NNSW       2025  3539566.27   # ~5.876B / 1660 (symmetric -> unambiguous divisor)
N1-NNSW       2026  3593401.81
SWQLD1        2025  1515.15      # 500k / 330 (constraint group: divisor is its own capacity)
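
The divisor arithmetic in the comments above can be checked with a few lines of Python. The helper name is hypothetical, and the 510M figure for CQ-NQ in 2026 is back-calculated from the table (425000.00 x 1200) rather than quoted from the source data.

```python
def cost_per_mw(total_cost, forward_mw, reverse_mw):
    """Divide an option's total cost by the larger directional capacity."""
    divisor = max(forward_mw, reverse_mw)
    if divisor <= 0:
        raise ValueError("Option provides no capacity in either direction")
    return total_cost / divisor


# CQ-NQ is asymmetric (1000 MW forward, 1200 MW reverse), so the divisor is 1200.
print(round(cost_per_mw(500e6, 1000, 1200), 2))  # 416666.67
print(round(cost_per_mw(510e6, 1000, 1200), 2))  # 425000.0
# SWQLD1 is a constraint group with a single 330 MW capacity.
print(round(cost_per_mw(500e3, 330, 330), 2))    # 1515.15
```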

Where the changes live

src/ispypsa/
├── feature_flags.py                            # env-var overrides for subprocess tests
├── cli/dodo.py                                 # version-aware cache target list
├── iasr_table_caching/
│   ├── known_tables.yaml                       # NEW — per-version table manifest
│   └── local_cache.py                          # version-partitioned cache, prefix-driven aug discovery
├── templater/
│   ├── create_template.py                      # new-format branch wires in network_expansion
│   ├── network_expansion.py                    # NEW — orchestrator + helpers (~970 lines)
│   ├── transmission.py                         # _append_new_parallel_paths + flow_path_options arg
│   └── helpers.py                              # _financial_year_string_to_end_year_int, dedup'd fuzzy log
└── validation/schemas/
    ├── network_expansion_options.yaml          # expansion_type column, composite uniqueness
    └── network_transmission_path_expansion_costs.yaml  # cost-per-MW divisor doc

tests/
├── test_workbook_table_cache/
│   ├── 6.0/                                    # existing fixtures, moved
│   └── 7.5/                                    # NEW — fixtures for new-format pathway
├── test_templater/
│   ├── test_network_expansion.py               # NEW — ~870 lines, per-helper coverage
│   ├── test_transmission.py                    # parallel-path wiring
│   └── test_create_ispypsa_inputs_template.py  # integration wiring
├── test_cli/
│   ├── cli_test_helpers_new_table_formats.py   # NEW
│   └── test_create_ispypsa_inputs_new_table_formats.py  # NEW — end-to-end CLI run
└── test_iasr_table_caching/test_local_cache.py # version-partitioned cache assertions

scripts/build_75_test_cache.py                  # NEW — one-off to regenerate 7.5 fixtures
CLAUDE.md                                       # I/O example, integration test, hidden-precondition rules

Extends the new-format templater with two output tables: network_expansion_options and network_transmission_path_expansion_costs. A single expansion_id keyed by expansion_type (forward / reverse / constraint_relaxation) unifies physical paths and constraint-group relaxations under one schema, so downstream consumers don't have to join back to network_transmission_paths to classify rows. Option selection picks the lowest dollars-per-MW per expansion_id using the first year with complete costs. Cost is divided by max(forward, reverse) so an asymmetric option can be represented in the translator as a single extendable PyPSA Link. Known-table discovery for the local cache moves from hard-coded lists to a static manifest (known_tables.yaml), so augmentation tables can be enumerated by prefix — necessary because v7.5 has one table per flow path x scenario. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
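
A rough sketch of the selection rule described in this commit message, using plain dicts instead of DataFrames. The function names and data layout are hypothetical, and "complete costs" is interpreted here as "every candidate option has a cost in that year", which is an assumption about the real implementation.

```python
def first_year_with_complete_costs(costs_by_option):
    """costs_by_option maps option name -> {year: cost or None}."""
    years = sorted({year for costs in costs_by_option.values() for year in costs})
    for year in years:
        if all(costs.get(year) is not None for costs in costs_by_option.values()):
            return year
    return None  # no year is complete; the caller decides how to warn and skip


def select_option(capacities, costs_by_option):
    """capacities maps option name -> (forward_mw, reverse_mw)."""
    year = first_year_with_complete_costs(costs_by_option)
    if year is None:
        return None

    def dollars_per_mw(option):
        forward_mw, reverse_mw = capacities[option]
        return costs_by_option[option][year] / max(forward_mw, reverse_mw)

    return min(capacities, key=dollars_per_mw)


capacities = {"Option 1": (1000, 1200), "Option 2": (800, 800)}
costs = {
    "Option 1": {2025: None, 2026: 510e6},   # 2025 is incomplete
    "Option 2": {2025: 400e6, 2026: 410e6},
}
print(first_year_with_complete_costs(costs))  # 2026
print(select_option(capacities, costs))       # Option 1
```

In this toy input, Option 1 costs 425000 $/MW in 2026 against 512500 $/MW for Option 2, so Option 1 is selected.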

At nem_regions and single_region the network paths table is aggregated before the expansion templater runs, so flow-path augmentation entries keyed by raw IASR sub-region path IDs no longer line up with the surviving paths. Drop intra-region entries and re-key cross-region ones (NNSW-SQ -> NSW-QLD, suffixes preserved); at single_region drop all flow-path augmentations entirely. REZ and constraint-group entries are unaffected: REZ entries remap automatically via the geo-to-path lookup built from the already-aggregated paths, and constraint groups stay valid at all granularities since they can still bite on REZ-to-region lines. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
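
The drop-or-re-key rule can be sketched as below. This is an illustration under stated assumptions: the sub-region to region mapping is a small hypothetical subset, the "sub_regions" pass-through name is assumed, the underscore suffix convention is taken from the CNSW-SNW_NTH example, and the function is not the one in the PR.

```python
# Hypothetical subset of a sub-region -> NEM region mapping.
SUBREGION_TO_REGION = {"NNSW": "NSW", "CNSW": "NSW", "SNW": "NSW", "SQ": "QLD", "CQ": "QLD"}


def rekey_flow_path(path_id, granularity):
    """Return the re-keyed path id, or None if the entry should be dropped."""
    if granularity == "sub_regions":  # assumed name for the un-aggregated case
        return path_id
    if granularity == "single_region":
        return None  # all flow-path augmentations are dropped
    if granularity != "nem_regions":
        raise ValueError(f"Unknown regional granularity: {granularity!r}")
    # Split off an optional suffix such as "_NTH" so it can be preserved.
    base, separator, suffix = path_id.partition("_")
    start, end = base.split("-")
    region_from = SUBREGION_TO_REGION[start]
    region_to = SUBREGION_TO_REGION[end]
    if region_from == region_to:
        return None  # intra-region path disappears after aggregation
    return f"{region_from}-{region_to}{separator}{suffix}"


print(rekey_flow_path("NNSW-SQ", "nem_regions"))       # NSW-QLD
print(rekey_flow_path("NNSW-SQ_NTH", "nem_regions"))   # NSW-QLD_NTH (hypothetical id)
print(rekey_flow_path("CNSW-SNW_NTH", "nem_regions"))  # None
```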

Audited the file against the CLAUDE.md rules that landed on new-format-network-tables and applied the same treatment we did to the transmission tests: full-DataFrame assertions instead of row-count / set membership / iloc / pd.isna probes, full log lines instead of marker + per-name any() checks, and csv_str_to_df for empty expected outputs. Also dropped four of the five DataFrame-builder helpers (_fp_costs, _rez_options, _rez_costs, _paths_table) — they hid only short, non-private column lists, and inlining via csv_str_to_df reads more consistently with the rest of the file. _fp_options is kept because its column list pulls in two private constants from the source module that csv_str_to_df can't reference; a file-level docstring records the rationale. While reviewing test_first_year_with_complete_costs_warns_..., the source warning was too terse ("No year with complete costs for 'X'; skipping.") to be useful in a real run. Replaced with a message that names the failure mode and the likely cause. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Same audit pass as the test_network_expansion.py cleanup, applied to the three new-format integration tests:
- Drop redundant `assert "..." in result` lines from _new_format; the immediately-following `result["..."]` access raises KeyError with the same diagnostic and is consistent with the other two integration tests.
- Drop three module-level column constants (_FP_AUG_COST_COLS, _REZ_AUG_OPTION_COLS, _REZ_AUG_COST_COLS) and the pd.DataFrame([(...)], columns=...) input pattern that depended on them; inline as csv_str_to_df instead. Keep _FP_AUG_OPTION_COLS, which still pulls two private constants from the source module.
Left the `set(expansion["expansion_id"]) == {...}` content checks in the nem_regions and single_region integration tests in place: they pin the intersection of granularity, REZ remapping, and augmentation filtering at the orchestrator level, which is worth the slight maintenance cost over the strict "presence + columns + row count" rule.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The unit test for _new_parallel_path_rows in test_network_expansion.py exercised the helper's content but no integration test triggered the un-suffixed-corridor branch via create_ispypsa_inputs_template. A refactor that dropped the _append_new_parallel_paths(...) call from create_template.py would have passed all tests. Extend test_create_ispypsa_inputs_template_new_format with two suffixed siblings (CNSW-SNW (NTH), (STH)) in flow_path_transfer_capability plus an un-suffixed CNSW-SNW augmentation. The new `assert "CNSW-SNW" in set(paths["path_id"])` is the load-bearing check — its comment names _append_new_parallel_paths so a regression failure points straight at the broken wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

_log_fuzzy_match emitted one log line per row of the input series, so when callers passed e.g. one cost row per year for the same option name, each name-matching decision was logged N times. In a real run with several years × dozens of expansions, this produced hundreds of redundant lines that masked the actually-distinct decisions. Dedup with sorted(set(zip(...))) so each (original, match) decision appears exactly once. The CLAUDE.md exception that lets fuzzy matching log per-decision (rather than as a summary) is preserved — one line per distinct decision is the audit unit, not one line per row. Sorted output also gives stable ordering across runs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
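
The dedup step can be illustrated with a small standalone sketch. The function name, logger name, and message format here are hypothetical; only the sorted(set(zip(...))) idea comes from the commit message above.

```python
import logging

logger = logging.getLogger("fuzzy_match_sketch")  # hypothetical logger name


def log_fuzzy_matches(originals, matches):
    """Log each distinct (original, match) decision once, in a stable order."""
    decisions = sorted(set(zip(originals, matches)))
    for original, match in decisions:
        logger.info("'%s' matched to '%s'", original, match)
    return decisions


# One cost row per year repeats the same option name; only two decisions are logged.
originals = ["Opt 1", "Opt 1", "Opt 1", "Option 2"]
matches = ["Option 1", "Option 1", "Option 1", "Option 2"]
print(log_fuzzy_matches(originals, matches))
# [('Opt 1', 'Option 1'), ('Option 2', 'Option 2')]
```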

The previous wording ("aggregate individual row contents into a sorted list") was being read as a blanket rule. The actual concern was redundant firings of one logical event (e.g. once per year per option), not per-decision logs where each line is a distinct audit point. Recast the rule around that, with the existing fuzzy-match log promoted from "exception" to canonical example of the per-decision pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two issues from a code-review pass. The parallel-path append (for augmentation corridors with no matching existing path, e.g. CNSW-SNW alongside CNSW-SNW_NTH/_STH) used to live in create_template.py, where its position in the call sequence was an implicit contract: if reordered, those corridors would silently misclassify as constraint groups in the expansion output. Moved the append into _template_network_transmission so the contract is enforced where the paths are built. Pulled the design rationale (corridor-keyed augmentations, why a synthetic third Link, why explicit zero capacity) into the docstring of _append_new_parallel_paths in its new home. _build_geo_from_to_path_id_map collapsed duplicate geo_from values for subregions, relying on the implicit guarantee that REZ option tables never contain subregion IDs — a hidden precondition. Threaded rez_ids through _template_network_expansion so the map is built from REZ rows only. The collision becomes structurally impossible. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
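
The corridor case described above can be sketched as follows. This is a simplified illustration, not the body of _append_new_parallel_paths: the function name, the column names, and the rule used to detect suffixed siblings (a shared prefix followed by an underscore) are assumptions.

```python
def new_parallel_path_rows(existing_path_ids, augmentation_ids):
    """Build zero-capacity rows for corridors that only exist as suffixed siblings."""
    existing = set(existing_path_ids)
    rows = []
    for corridor in sorted(set(augmentation_ids) - existing):
        has_suffixed_sibling = any(path.startswith(corridor + "_") for path in existing)
        if has_suffixed_sibling:
            # Explicit zeros (not NaN): the path exists but has no capacity yet.
            rows.append({"path_id": corridor, "forward_limit": 0.0, "reverse_limit": 0.0})
    return rows


existing = ["CNSW-SNW_NTH", "CNSW-SNW_STH", "CQ-NQ"]
augmentations = ["CNSW-SNW", "CQ-NQ", "SWQLD1"]
print(new_parallel_path_rows(existing, augmentations))
# [{'path_id': 'CNSW-SNW', 'forward_limit': 0.0, 'reverse_limit': 0.0}]
```

SWQLD1 has no sibling path, so it gets no synthetic row and would still be treated as a constraint group downstream.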

A multi-agent docstring sweep flagged four examples that had drifted from the code or under-documented their outputs:
- _aggregate_to_nem_regions: the example dropped the N1-NSW NaN row from the returned limits, even though _remap_limit_path_ids keeps any row whose path_id is in the rename map.
- _append_new_parallel_paths: the rationale spends a paragraph on why limits are explicit zeros, not NaN, but the example only showed the paths half of the returned tuple. Added the limits side so the zero-capacity rows are visible alongside the rationale.
- _template_network_expansion: the example didn't include the new required rez_ids input, so a reader couldn't reproduce it.
- _aggregate_flow_path_augmentations_to_nem_regions: shown only as set notation over keys, hiding the dict-of-DataFrames structure and the "Flow path" column rewrite. Aligned with the wrapper's format. Also fixed a stray double-space in the def line of the same function.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds direct tests for utilities and branches that were only covered end-to-end (or not at all): the IASR-prefix typo absorption, the no-numeric-capacity INFO log, em-dash alignment, earliest-complete-year selection, the unknown-granularity ValueError, and the small parsing helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
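
As an example of the kind of small parsing helper being tested, here is a sketch of a financial-year parser in the spirit of _financial_year_string_to_end_year_int. The accepted input format ("2024-25" style) and the validation behaviour are assumptions; the PR text does not show the real helper's signature or rules.

```python
import re


def financial_year_to_end_year(financial_year):
    """Convert a string such as '2024-25' to the ending calendar year, 2025."""
    match = re.fullmatch(r"(\d{4})[-/](\d{2})", financial_year.strip())
    if match is None:
        raise ValueError(f"Unrecognised financial year string: {financial_year!r}")
    end_year = int(match.group(1)) + 1
    # The two-digit part must agree with the start year plus one.
    if end_year % 100 != int(match.group(2)):
        raise ValueError(f"Inconsistent financial year string: {financial_year!r}")
    return end_year


print(financial_year_to_end_year("2024-25"))  # 2025
print(financial_year_to_end_year("2099-00"))  # 2100
```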

The use_new_table_format=true path had no test driving it via the CLI, which is the surface most likely to regress when AEMO updates the 7.5 workbook or the templater logic shifts. Added a parameterised CLI test over all three regional granularities that asserts row counts derived from named structural quantities (flow paths, REZs, parallel-path injections, REZs without limits, etc.) plus referential integrity between paths/limits and options/costs. To keep CI off the workbook binary, the input fixture is committed as parsed CSVs at tests/test_workbook_table_cache/7.5/. The existing 6.0 truncated fixture was moved into a sibling 6.0/ subdir so the two versions sit alongside each other and serve different purposes (truncated unit-test inputs vs full e2e inputs) without aliasing. Flag flips for subprocess CLI tests need to cross the process boundary, so feature_flags.py now honours an ISPYPSA_USE_NEW_TABLE_FORMAT env override on top of the YAML default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
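
An env-var override layered on a YAML default might look roughly like this. Only the ISPYPSA_USE_NEW_TABLE_FORMAT variable name comes from the commit message; the function name, the accepted spellings of true and false, and the error behaviour are assumptions.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}    # assumed spellings
_FALSE_VALUES = {"0", "false", "no", "off"}  # assumed spellings


def resolve_flag(yaml_default, env=None, env_var="ISPYPSA_USE_NEW_TABLE_FORMAT"):
    """Return the env-var override if it is set, otherwise the YAML default."""
    env = os.environ if env is None else env
    raw = env.get(env_var)
    if raw is None:
        return yaml_default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"Cannot interpret {env_var}={raw!r} as a boolean")


# A subprocess test would pass the variable through the child's environment.
print(resolve_flag(False, env={"ISPYPSA_USE_NEW_TABLE_FORMAT": "true"}))  # True
print(resolve_flag(True, env={}))                                        # True
```

Reading the variable at call time, rather than at import time, keeps the override visible to a child process launched with a modified environment.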

The two tests had identical inputs and exercised the same code path; collapsing them follows the combined output-plus-log pattern already used elsewhere in this file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

list_templater_output_files returned the old-format file names regardless of feature flag, so for new-format runs doit's target list pointed at files that don't exist. The task therefore re-ran on every invocation, silently masking the cache-skipping behaviour. Added a feature-flag branch returning _NEW_FORMAT_TEMPLATE_OUTPUTS for new-format runs. The bug was discovered by extending the new-format CLI test to do a second invocation and assert up-to-date detection. That assertion lives in a new mechanism test sibling to the existing 6.0 mechanism test — test_create_ispypsa_inputs_task_new_format — covering the same fresh-run / up-to-date / config_changed / extensive-trigger flow against the new-format CLI path. The new-format coverage is split into a parallel test file rather than parameterising the existing tests. Trades some duplication during the transition for a cleaner handover when 6.0 is dropped: the legacy file is deleted and the new-format file is renamed in place, no diffs inside test bodies. Helpers are split the same way — format-agnostic infrastructure (run_cli_command, build_mock_config, etc.) stays shared in cli_test_helpers.py; the 7.5-specific fixtures live in cli_test_helpers_new_table_formats.py. Step-by-step handover plan is documented in the new test file's module docstring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-07T03:57:00Z

Codecov Report

❌ Patch coverage is 96.38554% with 9 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/ispypsa/feature_flags.py	66.66%	1 Missing and 1 partial ⚠️
src/ispypsa/iasr_table_caching/local_cache.py	84.61%	2 Missing ⚠️
src/ispypsa/templater/create_template.py	86.66%	1 Missing and 1 partial ⚠️
src/ispypsa/templater/network_expansion.py	98.93%	1 Missing and 1 partial ⚠️
src/ispypsa/templater/transmission.py	95.65%	0 Missing and 1 partial ⚠️

Files with missing lines	Coverage Δ
src/ispypsa/templater/helpers.py	`98.13% <100.00%> (+0.01%)`	⬆️
src/ispypsa/templater/transmission.py	`98.73% <95.65%> (+0.18%)`	⬆️
src/ispypsa/feature_flags.py	`81.81% <66.66%> (-18.19%)`	⬇️
src/ispypsa/iasr_table_caching/local_cache.py	`74.28% <84.61%> (+5.32%)`	⬆️
src/ispypsa/templater/create_template.py	`89.18% <86.66%> (-1.44%)`	⬇️
src/ispypsa/templater/network_expansion.py	`98.93% <98.93%> (ø)`


Adds FEATURE_FLAG_CLEANUP[use_new_table_format] markers at every site that will need attention when the feature flag is retired. A single grep across the repo will surface the full removal checklist instead of relying on recall of where the gating lives. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
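
The same checklist that a grep would surface can be collected with a short script. This is a sketch: the function name and the set of file suffixes scanned are assumptions, and only the marker text is taken from the commit message.

```python
from pathlib import Path

MARKER = "FEATURE_FLAG_CLEANUP[use_new_table_format]"


def find_cleanup_sites(root, marker=MARKER, suffixes=(".py", ".yaml", ".md")):
    """Return (relative_path, line_number) for every line containing the marker."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if marker in line:
                hits.append((str(path.relative_to(root)), number))
    return hits
```

Calling find_cleanup_sites(".") from the repository root would list each gated site once per marker line.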

nick-gorman and others added 13 commits May 6, 2026 11:10

Merge non-numeric capacity skip + log tests

9a322ab


nick-gorman requested a review from EllieKallmier May 13, 2026 05:07

nick-gorman wants to merge 14 commits into main from new-format-network-expansion
