Add network_transmission_path_limits templater#94
Conversation
Part of the new-format templater refactor. Combines flow path and REZ transmission limits from IASR into a single long-format table keyed by path_id, direction, and timeslice. Paths with no explicit capacity (REZs absent from initial_transmission_limits, or flow paths with blank cells) collapse to one NaN row so the translator can apply a default capacity. Requires isp-workbook-parser 2.8.0 (IASR v7.5) for the initial transmission limits input. Old transmission_paths module folded into a single transmission module now that paths and limits share their IASR sources. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Give each function in transmission.py a concrete input → output example in its docstring so reviewers can grasp behaviour without tracing the body. Covers edge cases inline (HVDC tagging, parallel-path suffixes, REZs absent from initial_transmission_limits, all-NaN collapse, etc.). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the new-format network templater so coarser-than-sub_regions granularities work end-to-end (previously NotImplementedError). REZ paths retarget their geo_to to the parent region, inter-subregional flow paths that don't cross a NEM boundary are filtered out, and path_ids are rebuilt preserving any parallel-path suffix. Schemas are loosened so subregion_id is optional and "NEM" is a valid region_id. Also surfaces missing IASR limit data via warnings so a default applied downstream isn't invisible, and adds a Logging section to CLAUDE.md codifying when each level is appropriate.
…egation Previously, a path geo missing from region_lookup mapped to NaN, and the NaN != NaN comparison in _filter_to_cross_region_flow_paths kept the path through the filter, producing corrupted "nan-nan" path IDs downstream. Fail fast with a clear error listing all missing geos instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EllieKallmier
left a comment
There was a problem hiding this comment.
Looks solid! Some extremely tiny comments and ofc a few questions cause I can't help myself. I agree it seems to make quite a bit of sense to have paths and limits treated together, and the regional granularity handling here feels simple and easy too (nice).
There are a few 'style' things I guess I've noticed while reviewing this pr and I wanted to check my understanding/interpretation of those for my own benefit going forward:
- Functions here use a kinda new/different docstring style with the "I/O Example" section (I like this) - is this a style you'd like to use from this point on? (Loosely - obv with usual flexibility to do what makes sense based on the function)
- With testing - each submodule test file has the all-caps column lists defined up top and I'm reading this as a kinda move towards making tests more obvious (instead of importing from a fixture/lists/mapping file in tests for example). I'm on board with this and will adopt going forward where it makes sense too!
| ) -> pd.DataFrame: | ||
| """Creates the network_geography table from IASR workbook tables. | ||
|
|
||
| The shape of the output depends on ``regional_granularity``: |
There was a problem hiding this comment.
Might be nice for completeness to state here that all outputs have the geo_id, geo_type and region_id columns - contextualises the inclusion/exclusion of subregion_id column a bit more maybe?
| # Collapsed row: path_id only, no direction/timeslice/capacity. | ||
| assert len(limits) == 1 | ||
| assert limits.iloc[0]["path_id"] == "N1-NSW" | ||
| assert pd.isna(limits.iloc[0]["direction"]) |
There was a problem hiding this comment.
Maybe this is more a validation question than a testing question, but how much is it worth specifically checking that all three columns (direction/timeslice/capacity) that are expected to be NaN are returned here (with NaN values)?
There was a problem hiding this comment.
Yeah, good point, this should (and some othe tests) should just use the standard assert against a full dataframe definition.
|
|
||
| assert "REZs absent from initial_transmission_limits" in caplog.text | ||
| assert "N3" in caplog.text | ||
| assert "Q1" not in caplog.text.split("REZs absent")[1] |
There was a problem hiding this comment.
possibly silly question - why check the split here (but not the line above)?
There was a problem hiding this comment.
(I guess in case Q1 appears elsewhere in the log before the warning message) - but would that be likely/possible to happen during testing?
There was a problem hiding this comment.
No good reason to be honest. Given the testing setup its not really needed. Might just change to one assert and make it explicit:
assert(
"REZs absent from initial_transmission_limits "
"(default will be applied downstream): ['N3']"
) in caplog.text
The new docstring style is just me trying something out to help make reading the code that claude rights a easier to understand. If it works for you too that's great! I'd say lets keep experimenting with it and see where we land?? The all-caps column lists wasn't as intentional, kind of just something that Claude did. I don't think its a big deal either way. Reflecting on it and thinking more broadly, as I write less code, and generating code takes less time, I'm defintely shifting to prioritising readability over DRY, as the main bottle neck is do I understand the code, rather than how long does it take to write (although DRY has maintainability benifits over just time to write). |
|
And thinking about the fixtures vs constants in the local file further, because the file contains lost of instances where these columns names are hard coded, moving these lists to a fixtures file doesn't actually create a single change point for maintances. The lists at the top of the file just make a few places in the tests where empty dataframes are defined a bit more concise. |
The transmission test files had drifted into ad-hoc shape checks (`set(...)` membership, `len(...)`, `iloc[...]`, `pd.isna`, scoped log-text `.split`s) alongside the canonical `assert_frame_equal` pattern. These read worse and miss bugs the side-by-side comparison catches for free (off-by-ones, stray columns, ordering, future log lines that mention a name we negatively probed). Convert them all to full-DataFrame and full-log-line assertions, replace `pd.DataFrame(columns=[...])` for empty *expected outputs* with header-only `csv_str_to_df` calls, drop the now-unused short-column-name constants (keeping the long IASR-name ones), and update CLAUDE.md so the rules are stated explicitly instead of just shown by example. Also rename test_nem_regions_preserves_dash_suffix → ..._all_flow_paths_filtered_out_returns_empty: the test's docstring claimed it covered suffix preservation, but the suffix-preservation case is already covered in test_nem_regions_filters_intra_region_paths_and_rekeys via the parenthesized Terranora path. What this test uniquely covers is populated flow-path inputs all getting filtered out by the intra-region check with no REZs to backfill — a distinct empty-output path through `_aggregate_to_nem_regions`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Example output:
Where the changes live: