
Improve database structure #579


Draft · wants to merge 13 commits into base: main
Conversation

@nick-harder (Member) commented May 5, 2025

Pull Request

Description

This PR addresses two key pain points:

  • Slow startup: Deleting millions of rows between learning runs can take ~5 minutes before a new simulation even begins.
  • Grafana errors: Missing tables or columns on first startup cause dashboard panels to raise errors.

We solve these by:

  1. Partitioning all time-series tables by simulation so dropping a partition is an O(1) metadata operation, instantly clearing old data.
  2. Bootstrapping the full schema at container init (using REAL types and enabling timescaledb/postgis) so Grafana never sees missing objects.
  3. Bulk writes via a single COPY … FROM STDIN per table inside one transaction, replacing thousands of INSERTs for multi× speedup.
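The partition lifecycle in point 1 can be sketched as follows. This is a minimal illustration, not the PR's actual code: the table and column names follow the PR (`PARTITION BY LIST (simulation)`), but the helper functions are hypothetical.

```python
# Hypothetical helpers generating the DDL for per-simulation partitions.
# In PostgreSQL, dropping a LIST partition is a metadata-only operation,
# so clearing a run's data does not scan or delete individual rows.

def create_partition_sql(table: str, simulation: str) -> str:
    """DDL to create a per-run partition of a LIST-partitioned table."""
    part = f"{table}_{simulation}"
    return (
        f'CREATE TABLE IF NOT EXISTS "{part}" '
        f"PARTITION OF \"{table}\" FOR VALUES IN ('{simulation}');"
    )

def drop_partition_sql(table: str, simulation: str) -> str:
    """Dropping the whole partition replaces a slow DELETE of millions of rows."""
    return f'DROP TABLE IF EXISTS "{table}_{simulation}";'

print(create_partition_sql("market_meta", "run_42"))
print(drop_partition_sql("market_meta", "run_42"))
```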

These changes bring a speed-up of around 25 to 30% to large non-learning simulations by increasing write speed, and they let new learning simulations start much faster by removing the long wait before each run. Note that these gains apply only when using TimescaleDB, not the local database.

Performance: running the 'base_case_2019' example for a full year, the runtime drops from 14:22 to 10:22.

I also fixed a small issue where the unit operator still submitted tensors when not in learning mode, which could cause problems later, since the behavior of tensors inside an optimization algorithm is unpredictable.

Changes Proposed

  • Docker-init schema (docker_configs/db-init/assume_schema.sql): pre-creates every table with REAL precision and required extensions, eliminating Grafana errors on first use.
  • LIST-partitioned tables: market_meta, market_dispatch, unit_dispatch, rl_params, grid_flows, kpis now use PARTITION BY LIST (simulation), and per-run partitions are created/dropped for instant deletes.
  • Bulk COPY writes: Refactored store_dfs to stream DataFrames via COPY … FROM STDIN in a single transaction, cutting write overhead by an order of magnitude.
  • Dynamic table creation: Catch NoSuchTableError to auto-bootstrap new output tables with df.head(0).to_sql(), so no manual schema changes are ever needed.
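The bulk-write idea can be sketched with the standard library alone: serialize all rows for a table into one in-memory CSV buffer and ship it with a single COPY … FROM STDIN instead of thousands of INSERTs. The helper name below is hypothetical; the real implementation lives in `store_dfs` and streams pandas DataFrames.

```python
# Minimal sketch of building the CSV payload that COPY ... FROM STDIN expects.
import csv
import io

def rows_to_copy_buffer(rows: list) -> io.StringIO:
    """Serialize rows into a single in-memory CSV payload for COPY."""
    buf = io.StringIO()
    writer = csv.writer(buf)  # csv.writer terminates lines with \r\n by default
    writer.writerows(rows)
    buf.seek(0)
    return buf

rows = [("sim_1", "2019-01-01 00:00", 42.0), ("sim_1", "2019-01-01 01:00", 43.5)]
buf = rows_to_copy_buffer(rows)

# With psycopg2, the buffer would then be streamed inside one transaction,
# along the lines of (not executed here, requires a live connection):
#   with conn.cursor() as cur:
#       cur.copy_expert("COPY market_meta FROM STDIN WITH (FORMAT csv)", buf)
#   conn.commit()
print(buf.getvalue(), end="")
```

One COPY per table per transaction is what turns thousands of round-trips into a single streamed write.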

Testing

Tested with TimescaleDB and the local database; everything works as expected. Grafana is also operational.

Checklist

Please check all applicable items:

  • Code changes are sufficiently documented (docstrings, inline comments, doc folder updates)
  • New unit tests added for new features or bug fixes
  • Existing tests pass with the changes
  • Reinforcement learning examples are operational (for DRL-related changes)
  • Code tested with both local and Docker databases
  • Code follows project style guidelines and best practices
  • Changes are backwards compatible, or deprecation notices added
  • New dependencies added to pyproject.toml
  • A note for the release notes doc/release_notes.rst of the upcoming release is included
  • Consent to release this PR's code under the GNU Affero General Public License v3.0

@nick-harder nick-harder requested a review from maurerle May 5, 2025 14:51

codecov bot commented May 5, 2025

Codecov Report

Attention: Patch coverage is 41.07143% with 132 lines in your changes missing coverage. Please review.

Project coverage is 79.30%. Comparing base (53e9fb3) to head (e86a04f).

Files with missing lines Patch % Lines
assume/common/outputs.py 38.02% 132 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #579      +/-   ##
==========================================
- Coverage   79.67%   79.30%   -0.38%     
==========================================
  Files          52       52              
  Lines        7416     7528     +112     
==========================================
+ Hits         5909     5970      +61     
- Misses       1507     1558      +51     
Flag Coverage Δ
pytest 79.30% <41.07%> (-0.38%) ⬇️


@maurerle (Member) commented May 6, 2025

Very cool that you are looking to improve the performance.

We are currently not using some crucial features of TimescaleDB, namely hypertables, which are basically what you are trying to achieve here, just partitioned by time. Furthermore, I would rather investigate having an index on the simulation_id, so I don't think this PR is the best long-term solution.

I am currently quite short on time, but will try to file a PR with a proper solution in the next weeks.

@nick-harder (Member, Author) commented May 6, 2025

@maurerle hey, thanks! I considered hypertables, but since they use time-based chunks I decided against them. The main goal here is to introduce a proper database schema with proper keys and rules, so we can detect when something is broken or logged twice. Now all entries are unique, and the database actively rejects duplicate values, which should not occur. I have also introduced indexes on several tables, which should improve dashboard speed. Please take a look at the schema when you have time. The speed improvements just compensate for the increased complexity of the database structure, so we end up with a more robust database at the same or even better performance.
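The uniqueness and indexing idea from this reply can be sketched as DDL. This is an illustrative guess at the schema shape, not the PR's actual `assume_schema.sql`: the column names are hypothetical, but note that in PostgreSQL a primary key on a partitioned table must include the partition key (`simulation`).

```python
# Illustrative DDL: a composite primary key makes duplicate logging a hard
# database error, and an index on `simulation` speeds up dashboard filters.
DDL = """
CREATE TABLE IF NOT EXISTS unit_dispatch (
    simulation TEXT NOT NULL,
    unit       TEXT NOT NULL,
    time       TIMESTAMP NOT NULL,
    power      REAL,
    PRIMARY KEY (simulation, unit, time)  -- duplicate rows are rejected by the DB
) PARTITION BY LIST (simulation);

CREATE INDEX IF NOT EXISTS idx_unit_dispatch_simulation
    ON unit_dispatch (simulation);
"""
print(DDL)
```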
