Individual glacier simulation netcdf's `glac` coordinate is ambiguous

Each individual glacier simulation stores its variables as 2D arrays (`glac` x `time` or `glac` x `year`), where `glac` has length 1 but holds the integer 0-based index of the given glacier in the model's main RGI glacier table. This should be made less ambiguous. Options are (a) remove the `glac` coordinate altogether, since there is always only one glacier per simulation, or (b) set the index of the model's `main_rgi_table` to be the "RGIId" or "O1Index" so the resulting `glac` values in each output are clearer. Option (a) probably makes the most sense, but would require more restructuring: run_simulation.py, output.py, and postproc_compile_simulations.py (as well as the demo notebooks).
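As a rough illustration of option (b), the sketch below shows how re-indexing a table by "RGIId" changes the label that pandas attaches to each selected row. The table is a hand-built stand-in with only three columns, and the variable name `main_glac_rgi` is used for illustration only; this is not the actual PyGEM code path.

```python
import pandas as pd

# Hand-built stand-in for the glacier table (assumption: only a few columns shown).
main_glac_rgi = pd.DataFrame(
    {
        "O1Index": [569, 570],
        "RGIId": ["RGI60-01.00570", "RGI60-01.00571"],
        "CenLon": [-145.427, -145.449],
    }
)

# With the default RangeIndex, each selected row is labelled 0, 1, ...
default_labels = [int(main_glac_rgi.iloc[i].name) for i in range(len(main_glac_rgi))]

# Option (b): index the table by RGIId so each row carries a meaningful label.
by_rgiid = main_glac_rgi.set_index("RGIId")
rgiid_labels = [by_rgiid.iloc[i].name for i in range(len(by_rgiid))]

print(default_labels)
print(rgiid_labels)
```

With the table indexed by "RGIId", whatever is written to `glac` from the row label would identify the glacier directly rather than its position in the run.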

Hi @yelizy,

Great questions. The short answer is that this index is not used after the simulations are stored, and we should probably modify this structure. Each individual simulation output will only have a single `glac` index. In fact, we could possibly just remove the `glac` index from the individual outputs altogether, since there is always only one glacier. If you are trying to access the results from an individual glacier output, you can simply index into the 0th `glac`, similar to what's done in the various example notebooks (e.g., [simple_test.ipynb](https://github.com/PyGEM-Community/PyGEM-notebooks/blob/dev/simple_test.ipynb)). When the simulations are then merged by region, the RGIId is stored along the `glacier` index. @drounce can correct me if I'm wrong, but I believe the reason the individual simulations were originally stored as 2D arrays (e.g., `glac` x `year`) was that it made them easier to stack regionally in post-processing.
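For reference, indexing into the 0th `glac` with xarray could look roughly like the sketch below. The dataset is built in memory rather than read from a real output file, and the variable name `glac_mass_annual`, the years, and the values are placeholders chosen for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a single-glacier output (assumption: one glac x year variable).
years = np.arange(2000, 2005)
mass = np.linspace(5.0, 4.0, years.size)[np.newaxis, :]  # shape (1, 5)
ds = xr.Dataset(
    data_vars={"glac_mass_annual": (("glac", "year"), mass)},
    coords={"glac": [569], "year": years},
)

# Positional selection: works regardless of the value stored in `glac`.
mass_1d = ds["glac_mass_annual"].isel(glac=0)

# Alternative: drop the length-1 dimension and its coordinate entirely.
mass_squeezed = ds["glac_mass_annual"].squeeze("glac", drop=True)

print(mass_1d.dims, mass_squeezed.dims)
```

Because `isel` selects by position, the result is the same whether `glac` holds 0, 569, or anything else; `isel` keeps `glac` as a scalar coordinate, while `squeeze(..., drop=True)` removes it.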

A bit more detail: the reason the `glac` value may seem ambiguous has to do with a subtlety in how the RGI glacier table is indexed in the run_simulation script when looping through the list of glaciers in a given run. In run_simulation.py, we [index into the rgi glacier table](https://github.com/PyGEM-Community/PyGEM/blob/b1830bb90fc91ed8c647c1272ff0848ad34e35f1/pygem/bin/run/run_simulation.py#L603). Pandas' default behavior is then to set the 'name' of the resulting Series to the index label of that row in your `main_glac_rgi` dataframe. For example, if I do a run for 1.00570 and 1.00571 together:
`run_simulation -rgi_glac_number 1.00570 1.00571 ....`
My `main_glac_rgi` dataframe will look like so:
```
This study is focusing on 2 glaciers in region [1]
   O1Index           RGIId   CenLon  ...  rgino_str  RGIId_float  CenLon_360
0      569  RGI60-01.00570 -145.427  ...   01.00570      1.00570     214.573
1      570  RGI60-01.00571 -145.449  ...   01.00571      1.00571     214.551
```
What becomes the 'name' of the resulting Series as we loop through each glacier is the row's index in `main_glac_rgi` (e.g., 0 for 1.00570 and 1 for 1.00571). These are the values that get stored under the `glac` coordinate of the simulation output. So if you ran, say, 200+ glaciers as your post above indicates, you will have `glac` values spanning the range of glaciers in your run, but there should **always** be just one index per output.
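The pandas behavior described above can be reproduced in isolation. This is a minimal sketch using a hand-built two-row table that mirrors the printout; the exact selection call in run_simulation.py may differ, so `.loc` here is an assumption.

```python
import pandas as pd

# Two-glacier table mirroring the printout above (only a few columns kept).
main_glac_rgi = pd.DataFrame(
    {
        "O1Index": [569, 570],
        "RGIId": ["RGI60-01.00570", "RGI60-01.00571"],
        "CenLon": [-145.427, -145.449],
    }
)

# Selecting a single row returns a Series whose `name` is that row's index label.
rows = [main_glac_rgi.loc[label, :] for label in main_glac_rgi.index]
names = [int(row.name) for row in rows]
rgiids = [row["RGIId"] for row in rows]

for name, rgiid in zip(names, rgiids):
    print(name, rgiid)
```

The `name` values follow the table's index (0 and 1), not the `O1Index` or `RGIId` columns, which is why the stored `glac` value depends on which glaciers were in the run.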

If you run an entire region, the `glac` values should correspond to the glacier's RGI number minus 1. For instance, if we ran all of Alaska, then the output file for 1.00570 would have `glac.values=569`. Sorry for the long-winded explanation, but does this make sense?
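To make the contrast between a full-region run and a subset run concrete, here is a small sketch with a synthetic region table. The region size of 1000 is arbitrary (not the real Alaska glacier count), and it assumes the table is indexed from 0 for whatever set of glaciers is in the run, as in the printout above.

```python
import pandas as pd

# Synthetic region table (assumption: 1000 glaciers numbered 00001..01000).
n_glaciers = 1000
region = pd.DataFrame(
    {"RGIId": [f"RGI60-01.{i:05d}" for i in range(1, n_glaciers + 1)]}
)

target = "RGI60-01.00570"

# Full-region run: the row label is the glacier's RGI number minus 1.
full_run_label = int(region.index[region["RGIId"] == target][0])

# Two-glacier run: the table is re-indexed from 0, so the same glacier gets label 0.
subset = region[region["RGIId"].isin([target, "RGI60-01.00571"])].reset_index(drop=True)
subset_label = int(subset.index[subset["RGIId"] == target][0])

print(full_run_label, subset_label)
```

The same glacier ends up with a different `glac` value depending on the run, which is the source of the ambiguity.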

Again, in summary, the `glac` value does not matter, as you will only have one in your individual outputs, but looking at the values of `glac` can certainly be confusing and we should improve this.

_Originally posted by @btobers in https://github.com/PyGEM-Community/PyGEM/discussions/150#discussioncomment-14817094_

Individual glacier simulation netcdf's `glac` coordinate is ambiguous #153
