src/posts/flexible-indexing/index.md
10 additions & 10 deletions
@@ -17,20 +17,20 @@ _TLDR: Xarray>2025.6 has been through a major refactoring of its internals that
# Exciting new ways to slice and dice your data with Xarray!
-First things first, *what is an `index` and why is it helpful?*
+First things first, _what is an `index` and why is it helpful?_
-> An *index* makes repeated subsetting and selection of data more efficient.
+> An _index_ makes repeated subsetting and selection of data more efficient.
Examples of indexes are all around you and are a fundamental way to organize and simplify access to information. If you want a book about Natural Sciences, you can go to your local library branch and head straight to section `500`. Or if you're in the mood for a good novel go to section `800` thanks to the Dewey Decimal System [(credit to Dewey, 1876)](https://en.wikipedia.org/wiki/Dewey_Decimal_Classification)!
-Some indexes are less universal and more multi-dimensional: In my local grocery store I know that aisle 12, top shelf has the best cereal. And the second shelf on aisle 1 has the yogurt. In this example, *aisles 1-12 and shelves 1-5 are the coordinates* of our grocery, but the more informative *aisle content labels* are the indexes. Once you've mentally assigned labels to your grocery, you can get what you want quickly without needing to wander around!
+Some indexes are less universal and more multi-dimensional: In my local grocery store I know that aisle 12, top shelf has the best cereal. And the second shelf on aisle 1 has the yogurt. In this example, _aisles 1-12 and shelves 1-5 are the coordinates_ of our grocery, but the more informative _aisle content labels_ are the indexes. Once you've mentally assigned labels to your grocery, you can get what you want quickly without needing to wander around!
-The same efficiencies arise in computing. Consider a simple 1D dataset consisting of measurements `Y=[10,20,30,40,50,60]` at six positions `X=[1, 2, 4, 8, 16, 32]`. *What was our measurement at `X=8`?*
+The same efficiencies arise in computing. Consider a simple 1D dataset consisting of measurements `Y=[10,20,30,40,50,60]` at six positions `X=[1, 2, 4, 8, 16, 32]`. _What was our measurement at `X=8`?_
-To extract the answer in code we can loop over *all* the values of `X` to find `X=8`. With Python's zero-based indexing we find it at position 3, then use that to get our answer `Y[3]=40`.
+To extract the answer in code we can loop over _all_ the values of `X` to find `X=8`. With Python's zero-based indexing we find it at position 3, then use that to get our answer `Y[3]=40`.
> 💡 **Note:**
-With only 6 coordinates, we easily see `X[3]=8`, but for large datasets we should loop over *all* the coordinates to ensure there are no repeated values! This initial pass over all the coordinates to build an *index* takes some time and may not always be desirable.
+> With only 6 coordinates, we easily see `X[3]=8`, but for large datasets we should loop over _all_ the coordinates to ensure there are no repeated values! This initial pass over all the coordinates to build an _index_ takes some time and may not always be desirable.
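
The trade-off described in the note can be sketched in plain Python. This is only an illustration using the `X` and `Y` values from the text; the helper names `lookup_by_scan` and `build_index` are hypothetical and are not how Xarray or pandas implement indexes internally.

```python
X = [1, 2, 4, 8, 16, 32]
Y = [10, 20, 30, 40, 50, 60]


def lookup_by_scan(x_value):
    """Linear search: walk the coordinates until a match is found."""
    for position, x in enumerate(X):
        if x == x_value:
            return Y[position]
    raise KeyError(x_value)


def build_index(coords):
    """One up-front pass over all coordinates that also rejects repeats."""
    index = {}
    for position, x in enumerate(coords):
        if x in index:
            raise ValueError(f"duplicate coordinate {x!r}")
        index[x] = position
    return index


# Without an index, every lookup pays for a scan.
scan_result = lookup_by_scan(8)

# With an index, the scan is paid once and later lookups are dictionary hits.
x_index = build_index(X)
index_result = Y[x_index[8]]
```

Both approaches return the same measurement; the index only pays off when the same coordinates are queried repeatedly.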
## Pandas.Index
@@ -77,9 +77,9 @@ A lot of work over the last several years has gone into the nuts and bolts of Xa
> real-world datasets are usually more than just raw numbers; they have labels which encode information about how the array values map to locations in space, time, etc. [Xarray Docs](https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html#what-labels-enable)
-We often think about metadata providing context for *measurement values* but metadata is also critical for coordinates! In particular, to align two different datasets we must ask if the coordinates are in the same coordinate system. In other words, do they share the same origin and scale?
+We often think about metadata providing context for _measurement values_ but metadata is also critical for coordinates! In particular, to align two different datasets we must ask if the coordinates are in the same coordinate system. In other words, do they share the same origin and scale?
-There are currently over 7000 commonly used [Coordinate Reference Systems (CRS)](https://spatialreference.org/ref/epsg/) for geospatial data in the authoritative EPSG database, plus an infinite number of custom-defined CRSs. [xproj.CRSIndex](https://xproj.readthedocs.io/en/latest/) gives Xarray objects automatic awareness of the coordinate reference system, so that operations like `xr.align()` no longer succeed when they should raise an error:
+There are currently over 7000 commonly used [Coordinate Reference Systems (CRS)](https://spatialreference.org/ref/epsg/) for geospatial data in the authoritative EPSG database, plus an infinite number of custom-defined CRSs. [xproj.CRSIndex](https://xproj.readthedocs.io/en/latest/) gives Xarray objects automatic awareness of the coordinate reference system, so that operations like `xr.align()` no longer succeed when they should raise an error:
```python
from xproj import CRSIndex
@@ -96,7 +96,7 @@ MergeError: conflicting values/indexes on objects to be combined for coordinate
### Rasterix RasterIndex
-Earlier we mentioned that coordinates often have a *functional representation*. For 2D geospatial raster images, this function often takes the form of an [Affine Transform](https://en.wikipedia.org/wiki/Affine_transformation). This is how the [rasterix RasterIndex](https://github.com/xarray-contrib/rasterix) computes coordinates rather than storing them all in memory. Aligning by comparing transforms also minimizes common errors due to floating-point mismatches.
+Earlier we mentioned that coordinates often have a _functional representation_. For 2D geospatial raster images, this function often takes the form of an [Affine Transform](https://en.wikipedia.org/wiki/Affine_transformation). This is how the [rasterix RasterIndex](https://github.com/xarray-contrib/rasterix) computes coordinates rather than storing them all in memory. Aligning by comparing transforms also minimizes common errors due to floating-point mismatches.
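
To make the idea of a functional representation concrete, here is a minimal pure-Python sketch of computing coordinates from six affine coefficients, and of deriving a new transform for a sliced subset. The coefficient values and the helper names `pixel_to_xy` and `slice_transform` are made up for illustration; this is not the rasterix implementation.

```python
# Assumed convention for the six coefficients (a, b, c, d, e, f):
#   x = a * col + b * row + c
#   y = d * col + e * row + f
def pixel_to_xy(transform, col, row):
    """Compute the (x, y) coordinate of a pixel corner on demand."""
    a, b, c, d, e, f = transform
    return (a * col + b * row + c, d * col + e * row + f)


def slice_transform(transform, col_start, row_start):
    """Transform for a subset whose pixel (0, 0) is the parent's
    pixel (col_start, row_start). Only the offsets c and f change."""
    a, b, c, d, e, f = transform
    new_c, new_f = pixel_to_xy(transform, col_start, row_start)
    return (a, b, new_c, d, e, new_f)


# Hypothetical north-up raster: 10-unit pixels, origin at the top-left.
transform = (10.0, 0.0, 500000.0, 0.0, -10.0, 4100000.0)

# Slicing never touches a stored coordinate array; it only yields a new transform.
sub = slice_transform(transform, 100, 200)
sub_corner = pixel_to_xy(sub, 0, 0)
parent_corner = pixel_to_xy(transform, 100, 200)
```

Because two grids can be compared through their six coefficients, an alignment check does not need to compare long arrays of floating-point coordinates element by element.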
Below is a simple example of slicing a large mosaic of GeoTIFFs without ever loading the coordinates into memory. Note that a new Affine is defined after the slicing operation:
While we're extremely excited about what can _already_ be accomplished with the new indexing capabilities, there are plenty of exciting ideas for future work.
Have an idea for your own custom index? Check out [this section of the Xarray documentation](https://docs.xarray.dev/en/stable/internals/how-to-create-custom-index.html). Also check out [A Gallery of Custom Index Examples](https://xarray-indexes.readthedocs.io)!
There are a few new indexes that will soon become part of the Xarray codebase!