@brycekbargar
Collaborator

The plan to speed up JSON transformation is to do it all in SQL using each engine's native JSON functionality. A proof of concept and some napkin math suggest this runs around 10x faster. To avoid duplicating the transformation queries (which will be complicated), we're adding abstractions over the native functionality in DuckDB and Postgres. This will let us write one query that is agnostic to the underlying engine.
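To illustrate the idea of one query over two engines, here is a minimal sketch of one possible shape for such an abstraction. It is not necessarily how this PR does it: `JsonDialect`, `build_query`, and the table and column names are hypothetical. The only engine-specific pieces it relies on are DuckDB's `json_extract_string` function and Postgres's `->>` operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JsonDialect:
    """Hypothetical holder for one engine's JSON-to-text SQL fragment."""

    name: str
    text_template: str  # how to read a top-level key as text

    def extract_text(self, column: str, key: str) -> str:
        # Fill the engine-specific template with the column and key.
        return self.text_template.format(column=column, key=key)


# DuckDB reads a key with json_extract_string and a JSONPath-style path.
DUCKDB = JsonDialect("duckdb", "json_extract_string({column}, '$.{key}')")
# Postgres reads a key as text with the ->> operator.
POSTGRES = JsonDialect("postgres", "{column}->>'{key}'")


def build_query(dialect: JsonDialect, table: str, column: str, key: str) -> str:
    """Write the query once; only the extraction fragment varies by engine."""
    value = dialect.extract_text(column, key)
    return f"SELECT {value} AS {key} FROM {table}"


print(build_query(DUCKDB, "records", "data", "id"))
print(build_query(POSTGRES, "records", "data", "id"))
```

The sketch interpolates identifiers directly into the SQL string, so it is only suitable for trusted, hard-coded names. An alternative with the same effect would be to define identically named SQL functions in each database and call those from the shared query.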

DuckDB in particular was hard to get the versioning right for, since they tend to introduce incompatibilities between versions. I think the version check and setting is a reasonable way to keep it consistent, but we'll see if it becomes unmaintainable.
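For readers unfamiliar with this kind of guard, a version check can be sketched roughly as follows. This is an assumption-heavy illustration, not the check in this PR: the bounds `MIN_SUPPORTED` and `MAX_EXCLUSIVE` are made up, and the parser assumes plain numeric release versions (no dev or pre-release suffixes).

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '1.3.2' or 'v1.3.2' into (1, 3, 2)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


# Hypothetical supported range; real bounds would come from testing.
MIN_SUPPORTED = (1, 3, 0)
MAX_EXCLUSIVE = (1, 4, 0)


def is_supported(version: str) -> bool:
    """True when the version falls inside the supported half-open range."""
    return MIN_SUPPORTED <= parse_version(version) < MAX_EXCLUSIVE


print(is_supported("1.3.2"))
print(is_supported("1.4.0"))
```

In practice the input string could come from the installed package (for example `duckdb.__version__`), with the caller refusing to proceed or adjusting settings when the check fails.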

Postgres compatibility has been tested on 13.22 and later.

There are no actual functionality changes here, but the PR was already getting huge before even starting the big refactor for the transformation.

@brycekbargar brycekbargar merged commit cb81953 into library-data-platform:release-v4.0.0 Sep 26, 2025
4 checks passed