Speed up the loading of large tables (#2026)
This fixes a performance issue that @rbruijnshkv encountered when initializing a model with a `Basin / time` column of 6 million rows, spread over 1000 Basin nodes. Initialization spent around 1-2 seconds per Basin node on this line.

`time` is a `StructVector`, which stores each column as a vector. By broadcasting `getfield` we iterated over the rows, generating a `BasinTime` struct for each one and then taking a single field from it. That works, but it is much slower than taking out the field directly, since it is already stored as a vector.

The general recommendation for such large tables is to not store them in the model database but in a separate Arrow file, like here: https://github.com/Deltares/Ribasim/blob/v2025.1.0/python/ribasim_testmodels/ribasim_testmodels/basic.py#L210. Doing this shrank the database from 400 to 100 MB, and also sped up initialization. This fix should help both formats, though.
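
The difference between the two access patterns can be sketched roughly as follows. This is a minimal illustration, not the actual Ribasim code: it assumes the StructArrays.jl package is installed, and the `BasinTimeRow` type, its fields, and the sample data are hypothetical stand-ins for the real `BasinTime` table.

```julia
using StructArrays  # third-party package providing StructArray / StructVector

# Hypothetical row type, loosely modeled on the BasinTime struct named above.
struct BasinTimeRow
    node_id::Int32
    precipitation::Float64
end

# Made-up sample columns; the real table had about 6 million rows.
node_id = Int32[1, 1, 2, 2]
precipitation = [0.1, 0.2, 0.3, 0.4]

# Column-oriented storage: each field is kept as its own vector.
rows = StructArray{BasinTimeRow}((node_id, precipitation))

# Slow pattern: builds a BasinTimeRow for every row, then takes one field from it.
slow = getfield.(rows, :node_id)

# Fast pattern: property access returns the column that is already stored as a vector.
fast = rows.node_id

# Both patterns yield the same values; only the amount of work differs.
@assert slow == fast
```

On a table this small the difference is not measurable; the cost of the first pattern grows with the number of rows and with how often the extraction is repeated, which in the case described above was once per Basin node.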