516 dynamic tables and columns #718

Draft
Simon-Will wants to merge 28 commits into OpenEnergyPlatform:develop from Simon-Will:516-dynamic-tables-and-columns
Simon-Will commented Jan 24, 2026

A first attempt at creating tables based on the MaStR XSD files.

Checklist until this is really ready:

  • Download current documentation and create database model from it
  • Use new database model for all the insertion code
  • Use fallback XSD for when using the current documentation fails for some reason
  • Implement CSV export
    • I radically simplified the existing CSV export. It was pretty complex, joined several tables and backfilled the basic units table. I found that a bit much for an export. There's probably a way to make it work in much the same way as it used to work, but I frankly didn't want to spend the time to fully understand all of what's going on there. Let's talk about it!
  • Implement translation feature
  • Give the user an easy way to use the mastr_table_to_db_model returned from Mastr.generate_data_model, e.g. by adding a function that generates a Python code snippet with the SQLAlchemy models/tables.
    • I solved this by having Mastr.generate_data_model return SQLAlchemy core tables, not ORM models. They are easy to just print and a user can then copy them to their code & modify them. They are also the best common ground. After all, some users might not use the ORM.
    • There's also a function format_mastr_table_to_db_table that makes printing easy for the user.
  • Clear up the date situation. I made a couple of changes to utils_download_bulk.py because I found the date handling unnecessarily complex. The pull request "Add interactive download functionality for MaStR date selection #696" (#697) changes the same code and adds support for retrieving available XML download dates. If #697 is merged, we have to update this retrieval logic to also retrieve the documentation download dates.
    • I reverted my changes regarding the dates.
    • I added the docs download to the download browsing, etc. Note that some old XSD files are invalid (e.g. 20240101) and cannot be read with XMLSchema. We fall back to the XSD files in the library in that case.
  • Think about how we handle the transition from users' existing databases, especially w.r.t. translated databases and all the renamed columns.
    • My proposal: Since it is extremely difficult to provide an upgrade path from the old table & column names to the way they are done now, I think we should just tell users to adjust their existing queries so they fit the newest open-mastr version. This is fine imo because this whole thing here will trigger a major version bump anyway.
    • Please let's talk about table name translations. I'm using the old names in this new code, but I would rather create new names that are closer to the names of the original MaStR export files.
  • Create usage examples
  • Address a couple of open points
    • How to determine primary key of tables? By hardcoding it for MaStR tables we know? Or by checking the available columns and choosing the most likely one based on some hierarchy (e.g. "Id > MastrNummer > EinheitMastrNummer > …") Cf. this code
      • I hard-coded the id column for the tables we know. For unknown future tables, a column "openMastrId" will be inserted. This is also done for the EinheitenAenderungNetzbetreiberzuordnungen table because there is no primary key.
    • How much do we want to adjust/normalize column names? Cf. this code
      • I decided to only do straightforward changes (MaStR -> MaStR, ß -> ss, deleting surrounding whitespace, etc.). No singularization/pluralization of column names à la VerknuepfteEinheitenMaStRNummern -> VerknuepfteEinheit.
    • Do we want to handle the case where adding only some columns to a table fails? Cf. this code
      • I decided not to add special handling for that.
  • Go through the library and remove newly obsolete code
  • Add tests
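
The decisions on primary keys and column names described in the open points above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the KNOWN_PRIMARY_KEYS mapping and its two entries are assumptions made for the example. Only the fallback column name "openMastrId" and the "ß -> ss" and whitespace rules are taken from the checklist.

```python
# Illustrative mapping of known tables to their id column.
# Assumption: the real mapping in the PR may differ in names and content.
KNOWN_PRIMARY_KEYS = {
    "EinheitenSolar": "EinheitMastrNummer",
    "Marktakteure": "MastrNummer",
}

# Column inserted for tables without a known primary key (name from the PR text).
FALLBACK_PRIMARY_KEY = "openMastrId"


def normalize_column_name(name: str) -> str:
    """Apply only straightforward changes: trim whitespace, replace ß with ss."""
    return name.strip().replace("ß", "ss")


def choose_primary_key(table_name: str, columns: list[str]) -> str:
    """Return the hard-coded id column if known and present, else the fallback."""
    known = KNOWN_PRIMARY_KEYS.get(table_name)
    if known is not None and known in columns:
        return known
    return FALLBACK_PRIMARY_KEY


if __name__ == "__main__":
    cols = [normalize_column_name(c) for c in [" EinheitMastrNummer", "Straße "]]
    print(cols)
    print(choose_primary_key("EinheitenSolar", cols))
    print(choose_primary_key("EinheitenAenderungNetzbetreiberzuordnungen", ["Foo"]))
```

A lookup table keeps the behaviour predictable for known tables, while the fallback column covers tables that have no natural key or that appear in a future export.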

Type of change (CHANGELOG.md)

Added

  • Add the new method Mastr.generate_data_model that downloads the newest MaStR documentation and uses the XSD file to build SQLAlchemy models from the contained definitions
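
To give an idea of what building a data model from an XSD file involves, here is a small standard-library sketch. It is not the implementation behind Mastr.generate_data_model: the sample schema, the TYPE_NAMES mapping and the function names are hypothetical, and it uses xml.etree.ElementTree instead of a dedicated XSD library. It reads element names and types, marks every column except the primary key as nullable, and falls back to a bundled schema when the downloaded one cannot be parsed.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Assumed mapping from XSD simple types to SQLAlchemy type names (as strings).
TYPE_NAMES = {"string": "String", "decimal": "Numeric", "date": "Date", "int": "Integer"}

# Hypothetical, heavily shortened schema; not a real MaStR XSD file.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="EinheitenBeispiel">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="EinheitBeispiel" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="EinheitMastrNummer" type="xs:string"/>
              <xs:element name="Bruttoleistung" type="xs:decimal" minOccurs="0"/>
              <xs:element name="Inbetriebnahmedatum" type="xs:date" minOccurs="0"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def columns_from_xsd(xsd_text: str, primary_key: str) -> list[dict]:
    """Collect typed leaf elements as column definitions."""
    root = ET.fromstring(xsd_text)
    columns = []
    for element in root.iter(f"{XS}element"):
        xsd_type = element.get("type")
        if xsd_type is None:
            continue  # container elements declare an inline complexType instead
        name = element.get("name")
        local_type = xsd_type.split(":")[-1]
        columns.append(
            {
                "name": name,
                "type": TYPE_NAMES.get(local_type, "String"),
                # Everything except the primary key is nullable.
                "nullable": name != primary_key,
            }
        )
    return columns


def load_columns(downloaded_xsd: str, bundled_xsd: str, primary_key: str) -> list[dict]:
    """Prefer the downloaded schema, fall back to the bundled one if it is invalid."""
    try:
        return columns_from_xsd(downloaded_xsd, primary_key)
    except ET.ParseError:
        return columns_from_xsd(bundled_xsd, primary_key)


if __name__ == "__main__":
    for column in load_columns("<broken", SAMPLE_XSD, "EinheitMastrNummer"):
        print(column)
```

The resulting dictionaries could then be turned into SQLAlchemy Core columns. That step is left out here to keep the sketch free of third-party dependencies.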

Updated

  • Update the method Mastr.download with two optional new arguments mastr_table_to_db_table, with which the user can pass their own database schema, and alter_database_tables, with which the user can prevent open-mastr from issuing any DDL statements.

Removed

  • Remove the method Mastr.translate. The user can now get English table and column names by passing english=True to the generate_data_model or download method.

Workflow checklist

Automation

Closes #516

PR-Assignee

Reviewer

  • 🐙 Follow the Reviewer Guidelines
  • 🐙 Provide feedback and show sufficient appreciation for the work done

@Simon-Will Simon-Will marked this pull request as draft January 24, 2026 15:40
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch from c85c1fe to fc5bfec Compare January 24, 2026 15:45
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch from 8550e9a to f31622c Compare February 3, 2026 18:13
@Simon-Will Simon-Will force-pushed the 516-dynamic-tables-and-columns branch 2 times, most recently from 68eb2ad to 2158b8a Compare February 5, 2026 11:35
pt-kkraemer (Collaborator) commented Feb 9, 2026

On 19.02.2026, a new public version of the Gesamtdatenexport will include a bugfix concerning attributes in the .xsd files:
[image: grafik]

This will probably have no effect on what you have done already, right?

Simon-Will (Author) commented

Interesting, thanks for pointing that out, @pt-kkraemer! It shouldn't make any difference for now because we make all attributes except the primary keys nullable anyway.

But I'll be sure to check out the difference in the XSD files to make sure I understand that point correctly.

As for this whole PR, it's almost ready now as you can see from the checklist. If anyone already has comments on the approach, please let them be heard. I don't think I will make substantial changes to the non-testing code anymore.


Development

Successfully merging this pull request may close these issues.

Dynamically add new tables to existing databases

4 participants