Skip to content

Conversation

@jesspav
Copy link
Collaborator

@jesspav jesspav commented Dec 9, 2025

First implementation for RS_SRID and RS_CRS.

Important: these are not fast enough. I need to invest into caching or thinking through another way of storing the CRS strings as this solution is currently too slow. I experimented with having an LRUCache in this function and it was 4x faster. Which is great, but this

Example:

> select rs_srid(rs_example());
┌───────────────────────┐
│ rs_srid(rs_example()) │
│         uint32        │
╞═══════════════════════╡
│                  4326 │
└───────────────────────┘


> select rs_crs(rs_example());
┌──────────────────────┐
│ rs_crs(rs_example()) │
│         utf8         │
╞══════════════════════╡
│ OGC:CRS84            │
└──────────────────────┘

Very slow!

     Running benches/native-raster-functions.rs (target/release/deps/native_raster_functions-e7cd54810fedb495)
native-raster-rs_crs-Array(Raster(64, 64))
                        time:   [23.708 ms 23.768 ms 23.831 ms]
                        change: [-0.8039% -0.1217% +0.4313%] (p = 0.72 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

native-raster-rs_srid-Array(Raster(64, 64))
                        time:   [24.232 ms 24.312 ms 24.395 ms]
                        change: [+1.6443% +2.1145% +2.6147%] (p = 0.00 < 0.05)
                        Performance has regressed.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I think an LRU cache may also help, for this specific case you are also deserializing all of the raster fields while you only need a single struct child to compute the CRS or SRID.

For the case of the Crs, I think you can just return a clone of that column (i.e., you shouldn't need to peek into the values at all, just clone the ArrayRef!)

@jesspav
Copy link
Collaborator Author

jesspav commented Dec 9, 2025

For the case of the Crs, I think you can just return a clone of that column (i.e., you shouldn't need to peek into the values at all, just clone the ArrayRef!)

I did consider just spitting out the same thing as we read as that would definitely be very fast, but I thought that it would create odd inconsistencies. For example, if you set a raster's CRS to 'EPSG:4326' it would spit that out. If you transform a raster to 'EPSG:4326' then its CRS would be 'OGC:CRS84'. Maybe that doesn't matter.

Either way, when we run functions like intersects we are going to need to ask if the raster with 'EPSG:4326' has the same CRS as the raster/geo with CRS 'OGC:CRS84' (after doing a fast string equality check that will fail), so having a CRS cache is still going to be important.

I have some nice flamegraphs that are pointing to quite a few additional opportunities.

@jesspav
Copy link
Collaborator Author

jesspav commented Dec 9, 2025

The current state of CRS:

LngLat:
native-raster-rs_crs-Array(Raster(64, 64)) =>                time:   [22.640 ms 22.726 ms 22.815 ms]
native-raster-rs_srid-Array(Raster(64, 64)) =>               time:   [22.817 ms 22.864 ms 22.912 ms]

EPSG:3857 :
native-raster-rs_crs-Array(Raster(64, 64)) =>                time:   [40.386 ms 40.486 ms 40.583 ms]
native-raster-rs_srid-Array(Raster(64, 64)) =>               time:   [34.926 ms 35.034 ms 35.147 ms]

After the caching and a few more changes I have it down to:

LngLat:
native-raster-rs_crs-Array(Raster(64, 64)) =>                time:   [4.0605 ms 4.0707 ms 4.0813 ms]
native-raster-rs_srid-Array(Raster(64, 64)) =>               time:   [4.0278 ms 4.0364 ms 4.0451 ms]

EPSG:3857 :
native-raster-rs_crs-Array(Raster(64, 64)) =>                time:   [4.1869 ms 4.1989 ms 4.2110 ms]
native-raster-rs_srid-Array(Raster(64, 64)) =>               time:   [4.1030 ms 4.1114 ms 4.1196 ms]
```

@jesspav
Copy link
Collaborator Author

jesspav commented Dec 9, 2025

I can also see a plus for trying to do more validation during read/build so that we already have it in the right printable/comparable format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants