Add initial set of generation and storage schemas#93
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 🚀 New features to boost your workflow:
|
nick-gorman
left a comment
There was a problem hiding this comment.
This looks great Ellie, just a few comments / questions.
- I made a bunch of comments on just one table, but they might apply in other cases. But this should be obvious.
- In the inline comments I suggested validating the fuel types in the fuel cost table off the generator tables. But could also make sense to do this in reverse, if the fuel cost table exists require that the generator table values exist in the fuel cost table. But then this raises the question of what happens when the fuel cost table isn't provided. Another option would be to have a canonical values yaml (or csv) and we validate off it for things like fuel types and technology types etc. This is a broader issue than fuel types but just using it as an exmaple.
- Do you think we should have a standard way of separating descriptions notes as pertaining to source / IASR vs ISPyPSA behaviour.
| If costs are given by region in source data, create separate rows for each | ||
| geo_id in the region with the same cost. |
There was a problem hiding this comment.
Great idea to have this note. Do you think we should consistently use a heading like "Source notes", "Data preparation notes" to preface notes like this. Anyway not a big one, just a thought.
| Standardised technology name mapping to new entrant technologies in | ||
| `generators_new_entrant` and/or `storage_new_entrant` tables. | ||
|
|
||
| If blank: treat as geo_id-level VRE connection cost. |
There was a problem hiding this comment.
Should this be more explicit. Maybe, "If blank: the cost is applied to all new entrant technologies in geo_id which have not been provided a technology specific connection cost for the applicable year."
| type: int | ||
| required: true | ||
| description: > | ||
| Financial year in which this cost applies. |
There was a problem hiding this comment.
Maybe just "Year" to leave open the cost being applied on calendar year basis.
There was a problem hiding this comment.
Yep I hear this - maybe also though I'd add a note (but maybe in the description?) that says something along the lines of "Year type is either financial or calendar year based on config" (phrased better).
Side thought: I'm not sure how we want to handle converting data that's given as FY into calendar (or vice versa) when it's just a single value for each year? I don't think a big deal but good to take a standard approach
There was a problem hiding this comment.
On the side thought, agree, but I would kind of leave this up to the user. I.e. the templater doesn't have a calendar year option, but if the user wants to fill out the ISPyPSA tables with their own inpus they are free to specify them as calendar years.
| If absent: no dynamic marginal costs calculated. Requires later user input of | ||
| fixed or dynamic marginal cost. |
There was a problem hiding this comment.
Should we be more specific about where these inputs would need to be provided if they aren't given here?
|
|
||
| Should match fuel_type values in `generators_existing_planned` or |
| fuel_type: | ||
| type: string | ||
| required: true | ||
| description: Fuel type used by the generator or storage unit. |
There was a problem hiding this comment.
Should we note here the model behaviour if the fuel_type (or maybe even fuel_type price_node combo) for a generator is missing? Is this allowed? Maybe its the same behaviour as if the whole table is missing but just applied to a singe generator. Should this be stated at the table description level, noting that if absent behaviour applies to the whole table and on row basis. And I guess this commet applies to other tables as well.
There was a problem hiding this comment.
I have also been wondering about this, and I think it's still for me a bit of a question around whether what I've sort of proposed for the missing table case is the best way forward. But yes as it stands I think it makes sense to add some cross-validation here to the summary tables and as you say add a table-level note re: if absent behaviour.
| Note: for all power stations except for Kogan Gas, this should exactly match | ||
| the `power_station` column. | ||
|
|
||
| If absent: assume no dynamic marginal price is calculated for this model run. |
There was a problem hiding this comment.
Does this mean zero cost or a fixed cost?
There was a problem hiding this comment.
I was thinking it would mean a fixed price (user to supply), or that users would need to provide dynamic prices (if that's desired). I can clarify this wording to be more specifically about the calculation of dynamic prices and the consequence/required action.
I'm also not totally convinced this is the best way to handle this scenario, I just think that this is a case with an obvious set of options that might be desirable to users AND can offer a simplification. I would be very happy to chat more about this case and whether there's a nicer way to implement!
There was a problem hiding this comment.
Maybe if fuel_price_node isn't supplied then the user needs to add a column called fuel_price, which would just specify one fixed price. Or you could allow numeric values in fuel_price_node which override the dynamic values, and just not allow NaN. The user can set to zero if thats what they want.
| required: false | ||
| units: '%' | ||
| description: > | ||
| Maximum allowed state of charge (%). Must be between 0.0 and 100.0, |
There was a problem hiding this comment.
yes for sure - at the time I was thinking it might be easier to define column-level validation rules (in the schema) once we had a more solid plan for how the validation will be handled (i.e. if there's an existing package or smth we want to use). But yeah it's probably just as easy to define clearly now and update that as we go. I might in that case use the custom_validation attribute at the column level and start defining a suite of standard validations. Lmk if you have other ideas or preferences!
There was a problem hiding this comment.
also - could use allowed_values but atm that's defined as a list of permitted values, which doesn't translate well here; we could instead update the definition of that attribute.
There was a problem hiding this comment.
Ok, yep, I agree it could make sense to wait till we know more about how we will be doing the validation.
| type: date | ||
| required: false | ||
| description: > | ||
| Date when the storage unit begins operation. Format: %d/%m/%Y |
…, cross-table references, and validation rules
…_values_from in new entrants tables
Collection of my (edited) thoughts and updates:New schema additions for validation metadataSuper open to not doing this/changing stuff around - particularly as validation “enforcement” gets set up and might require certain structures etc.. But to summarise - shift away from embedding validation constraints as prose in
The intent is that these fields can be read directly by validation enforcement code without parsing prose. Table-level Clarified semantics of
|
nick-gorman
left a comment
There was a problem hiding this comment.
Hey Ellie,
This looks really good, big fan.
I think there are some broader questions about how we implement referential integrity, which we haven't quiet answered, similar to your question on costs_new_entrant_build. But I think that is something we can keep thinking through and doesn't need to block this PR by any means. I might open a discussion on referential integrity.
| Note: for all power stations except for Kogan Gas, this should exactly match | ||
| the `power_station` column. | ||
|
|
||
| If absent: assume no dynamic marginal price is calculated for this model run. |
There was a problem hiding this comment.
Maybe if fuel_price_node isn't supplied then the user needs to add a column called fuel_price, which would just specify one fixed price. Or you could allow numeric values in fuel_price_node which override the dynamic values, and just not allow NaN. The user can set to zero if thats what they want.
| required: false | ||
| units: '%' | ||
| description: > | ||
| Maximum allowed state of charge (%). Must be between 0.0 and 100.0, |
There was a problem hiding this comment.
Ok, yep, I agree it could make sense to wait till we know more about how we will be doing the validation.
| fuel_type: | ||
| type: string | ||
| required: true | ||
| allowed_values: ["Black Coal", "Brown Coal", "Liquid Fuel", "Gas", "Water", "Solar", "Wind", "Biomass", "Biomethane"] |
There was a problem hiding this comment.
I'd did something similar in network_geography.yaml, but I wonder if hard coding like this is actually bad as it prevents people defining new fuel costs. Thinking again of a toy example type case where I might just want to define whatever fuels I like Coal, Hydrogen, etc
Another way of doing this might be force the generation tables to only have fuel types that exist in costs_fuel_prices.
There was a problem hiding this comment.
yeahh I thought about this too, and after the chat this morning about the role of validator totally agree hard-coding isn't ideal here. And yep I like that option - will implement!
Co-authored-by: nick-gorman <40549624+nick-gorman@users.noreply.github.com>
Adds YAML validation schemas for generation and storage input tables to the initial set of ISPyPSA input table schemas.
Schemas added:
generators_existing_planned.yaml- Existing and planned generator characteristicsgenerators_new_entrant.yaml- New entrant generator technology optionsstorage_existing_planned.yaml- Existing and planned storage unit characteristicsstorage_new_entrant.yaml- New entrant storage technology options (battery, PHES)costs_connection.yaml- Connection cost datacosts_fuel_prices.yaml- Fuel price projectionscosts_new_entrant_build.yaml- Build cost projections for new entrantsemissions_reduction.yaml- Emissions reduction targets/constraintsStill to be added: policy table schemas.
Schemas follow structure as outlined in #85 and mostly follow the draft table structures given in the alternative templater review, with tweaks to match the 2026 (v7.5) IASR workbook data.