The main scope of this PR is to clean up and simplify the RocksDB implementation. It also aims to rely mostly on RocksDB default options instead of hardcoded values.
I recommend reading this PR commit by commit.
The read cache and write cache sizes are now set in the config file, and both default to RocksDB's recommended values.
The read cache can be configured from 1 to 1024 MB and defaults to 32 MB per table.
The write cache can be configured from 1 to 256 MB and defaults to 64 MB per table.
Previously, block_cache (the read cache) started at 8 MB and grew from there, depending on the table and the value of Memory_multiplier.
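As a rough illustration of the ranges and defaults above, here is a minimal sketch of how such settings could be validated. The struct and function names (`rocksdb_cache_config`, `clamp_mb`, `clamped`) are hypothetical and are not taken from this PR's code; only the limits and default values come from the description.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical config holder. The limits and defaults mirror the values
// stated in this PR description; the names are illustrative only.
struct rocksdb_cache_config
{
	static constexpr std::uint64_t read_cache_min_mb = 1;
	static constexpr std::uint64_t read_cache_max_mb = 1024;
	static constexpr std::uint64_t write_cache_min_mb = 1;
	static constexpr std::uint64_t write_cache_max_mb = 256;

	std::uint64_t read_cache_mb = 32; // per table
	std::uint64_t write_cache_mb = 64; // per table

	// True when both values are inside their documented ranges.
	bool valid () const
	{
		return read_cache_mb >= read_cache_min_mb && read_cache_mb <= read_cache_max_mb
		&& write_cache_mb >= write_cache_min_mb && write_cache_mb <= write_cache_max_mb;
	}

	// Sizes in bytes, the unit a storage engine would typically expect.
	std::uint64_t read_cache_bytes () const
	{
		return read_cache_mb * 1024 * 1024;
	}
	std::uint64_t write_cache_bytes () const
	{
		return write_cache_mb * 1024 * 1024;
	}
};

// Limit a value in MB to the inclusive range [low, high].
inline std::uint64_t clamp_mb (std::uint64_t value, std::uint64_t low, std::uint64_t high)
{
	return std::min (std::max (value, low), high);
}

// Return a copy of the config with out-of-range values pulled back into range.
inline rocksdb_cache_config clamped (rocksdb_cache_config config)
{
	config.read_cache_mb = clamp_mb (config.read_cache_mb, rocksdb_cache_config::read_cache_min_mb, rocksdb_cache_config::read_cache_max_mb);
	config.write_cache_mb = clamp_mb (config.write_cache_mb, rocksdb_cache_config::write_cache_min_mb, rocksdb_cache_config::write_cache_max_mb);
	return config;
}
```

Whether an out-of-range value should be rejected or clamped is a design choice; the sketch offers both `valid ()` and `clamped ()` so either policy can be tried.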
All column families now use the same options for simplicity.
Several hardcoded options have been removed, and the RocksDB default value is now used instead. Here is a list of the options with the old hardcoded value and the new RocksDB default.
Database options
ColumnFamily options
Table options
Data format version:
The current data storage format version is 4; this PR changes it to 5, the latest version.
Version 5 introduces enhancements and optimizations, particularly around prefix iterators and performance improvements in how RocksDB handles certain read operations. Version 5 retains compatibility with data created using version 4. Only new data is written in version 5, so no data migration is needed.
Compression:
Changing the compression mode to kSnappyCompression will reduce ledger size and disk utilization. However, in my testing the total size reduction was less than 5%, so I recommend that we continue using uncompressed storage.
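For reference, the sketch below spells out how a size-reduction figure like this can be computed. It assumes "compression rate" means space saved relative to the uncompressed size. The function names and the 5% cut-off used as a decision threshold are hypothetical, not part of this PR.

```cpp
#include <cstdint>

// Fraction of space saved by compression: 1 - compressed / uncompressed.
// Returns 0 for an empty input to avoid dividing by zero.
inline double space_savings (std::uint64_t uncompressed_bytes, std::uint64_t compressed_bytes)
{
	if (uncompressed_bytes == 0)
	{
		return 0.0;
	}
	return 1.0 - static_cast<double> (compressed_bytes) / static_cast<double> (uncompressed_bytes);
}

// Hypothetical decision helper: treat compression as worthwhile only when the
// savings reach the given threshold (0.05 = 5%, an assumed cut-off).
inline bool compression_worthwhile (std::uint64_t uncompressed_bytes, std::uint64_t compressed_bytes, double min_savings = 0.05)
{
	return space_savings (uncompressed_bytes, compressed_bytes) >= min_savings;
}
```

With made-up sizes, a ledger that shrinks from 100 units to 96 units saves 4%, which falls under the assumed threshold.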
Initial testing:
I have tested this PR on an existing live RocksDB ledger, running continuously for more than 7 days. I have also bootstrapped a live ledger from scratch.
I have tested with maximum cache settings on a 16 GB system. Memory usage was 8 GB after 48 hours and was no longer increasing.
My current PR node (NanoTicker) is now running on this version with RocksDB.
All testing was done on Windows.
Further improvements:
The existing implementation uses a tombstone map containing deleted entries. It also creates an event listener that flushes tombstones on delete. I don't see any point in doing this, since RocksDB can handle tombstones internally. There is a comment claiming that too many tombstones can affect read performance.
Without the code that handles tombstone mapping, the node still runs fine, including under a heavy load of deletes. I did not see any change in performance from removing it.
I decided to leave the tombstone handling as is, because I am not certain whether it is needed in some special cases.
Resources:
https://betterprogramming.pub/navigating-the-minefield-of-rocksdb-configuration-options-246af1e1d3f9
https://github.com/facebook/rocksdb/wiki