Skip to content

Add network_transmission_path_limits templater#94

Merged
nick-gorman merged 5 commits into
mainfrom
new-format-network-tables
May 5, 2026
Merged

Add network_transmission_path_limits templater#94
nick-gorman merged 5 commits into
mainfrom
new-format-network-tables

Conversation

@nick-gorman
Copy link
Copy Markdown
Member

@nick-gorman nick-gorman commented Apr 21, 2026

Summary

  • Adds new table: network_transmission_path_limits
  • Missing-data collapse. Paths with all-NaN capacity (REZs absent from initial_transmission_limits, or flow paths with blank cells) collapse to a single path_id-only row; the translator applies a default. Now also logged at WARNING so silently-defaulted paths are visible.
  • Adds regional_granularity support to the new-format templater. nem_regions filters intra-region flow paths and re-keys cross-region ones; single_region drops flow paths and retargets REZ paths to a single "NEM" geo. Schema loosened: subregion_id is optional and "NEM" is a valid region_id.
  • Adds a Logging section to CLAUDE.md codifying when each level is appropriate.
  • Heads-up: this PR also refactors the paths templater that just landed in the previous PR. transmission_paths.py is folded into transmission.py because paths and limits share IASR sources and most of the extraction helpers are now cross-cutting. Apologies for putting you through a second pass here — in hindsight the paths + limits work should have been a single PR.

Example output:

  path_id   direction  timeslice         capacity
  CQ-NQ     forward    peak_demand       1200
  CQ-NQ     reverse    winter_reference  1910
  Q1-NQ     forward    peak_demand       750
  Q1-NQ     reverse    peak_demand       750
  Q3-NQ     (NaN)      (NaN)             (NaN)    ← collapsed: no explicit limits

Where the changes live:

  src/ispypsa/
  ├── templater/
  │   ├── transmission.py            ← new; orchestrator + flow/REZ extraction + collapse + granularity aggregation (replaces transmission_paths.py)
  │   ├── geography.py               ← granularity-aware (sub_regions / nem_regions / single_region)
  │   ├── mappings.py                ← _SINGLE_REGION_ID = "NEM"
  │   ├── transmission_paths.py      ← removed
  │   └── create_template.py         ← wires limits + granularity through to both templaters
  ├── iasr_table_caching/
  │   └── local_cache.py             ← caches initial_transmission_limits
  └── validation/schemas/
      ├── network_geography.yaml             ← subregion_id optional; region_id allows "NEM"
      └── network_transmission_paths.yaml    ← geo codes vary with granularity

  tests/
  ├── test_templater/
  │   ├── test_transmission.py                      ← full-frame content + collapse + log coverage
  │   ├── test_transmission_nem_regions.py          ← new
  │   ├── test_transmission_single_region.py        ← new
  │   ├── test_geography.py                         ← extended for the two new granularities
  │   ├── test_transmission_paths.py                ← removed
  │   └── test_create_ispypsa_inputs_template.py    ← integration test, wiring-only (presence + columns + row count)
  └── test_iasr_table_caching/
      └── test_local_cache.py                       ← updated cache table list

  CLAUDE.md                          ← new ## Logging section
  pyproject.toml                     ← isp-workbook-parser ≥ 2.8.0 (IASR v7.5)

Part of the new-format templater refactor. Combines flow path and REZ
transmission limits from IASR into a single long-format table keyed by
path_id, direction, and timeslice. Paths with no explicit capacity (REZs
absent from initial_transmission_limits, or flow paths with blank cells)
collapse to one NaN row so the translator can apply a default capacity.

Requires isp-workbook-parser 2.8.0 (IASR v7.5) for the initial transmission
limits input. Old transmission_paths module folded into a single transmission
module now that paths and limits share their IASR sources.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 97.64706% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/ispypsa/templater/geography.py 91.30% 1 Missing and 1 partial ⚠️
src/ispypsa/templater/transmission.py 98.55% 1 Missing and 1 partial ⚠️
Files with missing lines Coverage Δ
src/ispypsa/iasr_table_caching/local_cache.py 68.96% <ø> (ø)
src/ispypsa/templater/create_template.py 90.62% <100.00%> (+0.46%) ⬆️
src/ispypsa/templater/mappings.py 100.00% <100.00%> (ø)
src/ispypsa/templater/geography.py 93.93% <91.30%> (-6.07%) ⬇️
src/ispypsa/templater/transmission.py 98.55% <98.55%> (ø)

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Give each function in transmission.py a concrete input → output example in
its docstring so reviewers can grasp behaviour without tracing the body.
Covers edge cases inline (HVDC tagging, parallel-path suffixes, REZs absent
from initial_transmission_limits, all-NaN collapse, etc.).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nick-gorman and others added 2 commits April 29, 2026 13:50
Extends the new-format network templater so coarser-than-sub_regions
granularities work end-to-end (previously NotImplementedError). REZ paths
retarget their geo_to to the parent region, inter-subregional flow paths
that don't cross a NEM boundary are filtered out, and path_ids are rebuilt
preserving any parallel-path suffix. Schemas are loosened so subregion_id
is optional and "NEM" is a valid region_id.

Also surfaces missing IASR limit data via warnings so a default applied
downstream isn't invisible, and adds a Logging section to CLAUDE.md
codifying when each level is appropriate.
…egation

Previously, a path geo missing from region_lookup mapped to NaN, and the
NaN != NaN comparison in _filter_to_cross_region_flow_paths kept the path
through the filter, producing corrupted "nan-nan" path IDs downstream.
Fail fast with a clear error listing all missing geos instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@EllieKallmier EllieKallmier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid! Some extremely tiny comments and ofc a few questions cause I can't help myself. I agree it seems to make quite a bit of sense to have paths and limits treated together, and the regional granularity handling here feels simple and easy too (nice).

There are a few 'style' things I guess I've noticed while reviewing this pr and I wanted to check my understanding/interpretation of those for my own benefit going forward:

  • Functions here use a kinda new/different docstring style with the "I/O Example" section (I like this) - is this a style you'd like to use from this point on? (Loosely - obv with usual flexibility to do what makes sense based on the function)
  • With testing - each submodule test file has the all-caps column lists defined up top and I'm reading this as a kinda move towards making tests more obvious (instead of importing from a fixture/lists/mapping file in tests for example). I'm on board with this and will adopt going forward where it makes sense too!

) -> pd.DataFrame:
"""Creates the network_geography table from IASR workbook tables.

The shape of the output depends on ``regional_granularity``:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice for completeness to state here that all outputs have the geo_id, geo_type and region_id columns - contextualises the inclusion/exclusion of subregion_id column a bit more maybe?

# Collapsed row: path_id only, no direction/timeslice/capacity.
assert len(limits) == 1
assert limits.iloc[0]["path_id"] == "N1-NSW"
assert pd.isna(limits.iloc[0]["direction"])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is more a validation question than a testing question, but how much is it worth specifically checking that all three columns (direction/timeslice/capacity) that are expected to be NaN are returned here (with NaN values)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, this should (and some othe tests) should just use the standard assert against a full dataframe definition.


assert "REZs absent from initial_transmission_limits" in caplog.text
assert "N3" in caplog.text
assert "Q1" not in caplog.text.split("REZs absent")[1]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly silly question - why check the split here (but not the line above)?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I guess in case Q1 appears elsewhere in the log before the warning message) - but would that be likely/possible to happen during testing?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason to be honest. Given the testing setup its not really needed. Might just change to one assert and make it explicit:

assert(
    "REZs absent from initial_transmission_limits "
    "(default will be applied downstream): ['N3']"
) in caplog.text

@nick-gorman
Copy link
Copy Markdown
Member Author

nick-gorman commented May 5, 2026

Looks solid! Some extremely tiny comments and ofc a few questions cause I can't help myself. I agree it seems to make quite a bit of sense to have paths and limits treated together, and the regional granularity handling here feels simple and easy too (nice).

There are a few 'style' things I guess I've noticed while reviewing this pr and I wanted to check my understanding/interpretation of those for my own benefit going forward:

  • Functions here use a kinda new/different docstring style with the "I/O Example" section (I like this) - is this a style you'd like to use from this point on? (Loosely - obv with usual flexibility to do what makes sense based on the function)
  • With testing - each submodule test file has the all-caps column lists defined up top and I'm reading this as a kinda move towards making tests more obvious (instead of importing from a fixture/lists/mapping file in tests for example). I'm on board with this and will adopt going forward where it makes sense too!

The new docstring style is just me trying something out to help make reading the code that claude rights a easier to understand. If it works for you too that's great! I'd say lets keep experimenting with it and see where we land??

The all-caps column lists wasn't as intentional, kind of just something that Claude did. I don't think its a big deal either way. Reflecting on it and thinking more broadly, as I write less code, and generating code takes less time, I'm defintely shifting to prioritising readability over DRY, as the main bottle neck is do I understand the code, rather than how long does it take to write (although DRY has maintainability benifits over just time to write).

@nick-gorman
Copy link
Copy Markdown
Member Author

nick-gorman commented May 5, 2026

And thinking about the fixtures vs constants in the local file further, because the file contains lost of instances where these columns names are hard coded, moving these lists to a fixtures file doesn't actually create a single change point for maintances. The lists at the top of the file just make a few places in the tests where empty dataframes are defined a bit more concise.

The transmission test files had drifted into ad-hoc shape checks (`set(...)`
membership, `len(...)`, `iloc[...]`, `pd.isna`, scoped log-text `.split`s)
alongside the canonical `assert_frame_equal` pattern. These read worse and
miss bugs the side-by-side comparison catches for free (off-by-ones, stray
columns, ordering, future log lines that mention a name we negatively probed).

Convert them all to full-DataFrame and full-log-line assertions, replace
`pd.DataFrame(columns=[...])` for empty *expected outputs* with header-only
`csv_str_to_df` calls, drop the now-unused short-column-name constants
(keeping the long IASR-name ones), and update CLAUDE.md so the rules are
stated explicitly instead of just shown by example.

Also rename test_nem_regions_preserves_dash_suffix → ..._all_flow_paths_filtered_out_returns_empty:
the test's docstring claimed it covered suffix preservation, but the
suffix-preservation case is already covered in
test_nem_regions_filters_intra_region_paths_and_rekeys via the parenthesized
Terranora path. What this test uniquely covers is populated flow-path inputs
all getting filtered out by the intra-region check with no REZs to backfill —
a distinct empty-output path through `_aggregate_to_nem_regions`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@nick-gorman nick-gorman merged commit 78b6bd7 into main May 5, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants