EXPERIMENTAL! Introduced a custom LevelDB impl with regions and better key format #6582

Draft: dktapps wants to merge 6 commits into base branch minor-next

dktapps (Member) commented Dec 21, 2024

This new impl (which is not loadable by vanilla) is targeted at very large worlds, which suffer significant I/O performance problems for a variety of reasons described in #6580.

Two main changes are made in RegionizedLevelDB:

  • First, multiple LevelDBs are used, each covering a fixed NxN chunk segment of terrain, similar to Anvil in Java. Unlike Anvil (which is limited to 32x32 chunks), there's no technical constraint on these region sizes. Several experimental sizes are supported by default in WorldProviderManager.
  • Second, bigEndianLong(morton2d(chunkX, chunkZ)) is used for chunk keys instead of littleEndianInt(chunkX).littleEndianInt(chunkZ). This new scheme has much better cache locality than Mojang's version, which reduces overlap and costly DB compactions.
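A minimal sketch of the two key layouts, for illustration only. The function names `morton2d`, `morton_key` and `mojang_key` are hypothetical, and two details are assumptions not stated in the description: that signed chunk coordinates are masked to their unsigned 32-bit two's-complement form before interleaving, and that chunkX occupies the even bits. The actual PR code may differ on both.

```python
import struct


def _spread_bits(v: int) -> int:
    """Insert a zero bit between each of the low 32 bits of v."""
    v &= 0xFFFFFFFF  # assumption: treat signed coords as unsigned 32-bit
    v = (v | (v << 16)) & 0x0000FFFF0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F0F0F0F0F
    v = (v | (v << 2)) & 0x3333333333333333
    v = (v | (v << 1)) & 0x5555555555555555
    return v


def morton2d(chunk_x: int, chunk_z: int) -> int:
    """Interleave the bits of X (even positions) and Z (odd positions)."""
    return _spread_bits(chunk_x) | (_spread_bits(chunk_z) << 1)


def morton_key(chunk_x: int, chunk_z: int) -> bytes:
    """bigEndianLong(morton2d(chunkX, chunkZ)): 8 bytes, most significant first."""
    return struct.pack(">Q", morton2d(chunk_x, chunk_z))


def mojang_key(chunk_x: int, chunk_z: int) -> bytes:
    """littleEndianInt(chunkX).littleEndianInt(chunkZ): two signed 32-bit ints."""
    return struct.pack("<ii", chunk_x, chunk_z)
```

With the Morton layout, the 4x4 block of chunks at the origin maps to 16 consecutive keys sharing a 7-byte prefix, so spatial neighbours sort next to each other. With the little-endian layout the first key byte is the low byte of chunkX, so chunk (256, 0) sorts between (0, 0) and (1, 0).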

The following new provider options are available as a result of this change:

  • custom-leveldb-regions-128
  • custom-leveldb-regions-256

128 will probably be the sweet spot. 256 will generate 4x fewer regions (might be better for space & file handle usage), but will probably experience more costly compactions (though still far less expensive than multi-GB worlds in regular leveldb format).
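The "4x fewer regions" figure follows from the region area quadrupling when the side length doubles. A rough sketch, assuming a chunk is assigned to a region by floor-dividing its coordinates by the region size (the description does not spell out the mapping, and `region_of` and `count_regions` are hypothetical names):

```python
def region_of(chunk_x: int, chunk_z: int, region_size: int) -> tuple[int, int]:
    """Region coordinates of a chunk; floor division keeps negatives consistent."""
    return (chunk_x // region_size, chunk_z // region_size)


def count_regions(chunks, region_size: int) -> int:
    """Number of distinct regions (i.e. separate DBs) touched by the given chunks."""
    return len({region_of(x, z, region_size) for x, z in chunks})


# A 512x512 chunk world centred on the origin.
world = [(x, z) for x in range(-256, 256) for z in range(-256, 256)]
regions_128 = count_regions(world, 128)
regions_256 = count_regions(world, 256)
```

For this example world, `regions_128` is 16 and `regions_256` is 4.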

Note that the different variations of custom-leveldb-regions-* are not cross-compatible.
Conversion between the different formats is necessary if you want to change formats.

Related issues & PRs

Related to #6580

Changes

Backwards compatibility

Should be fully backwards compatible. While significant changes were made to LevelDB, this was mainly in the interest of extracting a base class for RegionizedLevelDB to inherit from.

Follow-up

  • Add warnings for users with large worlds that their world performance will probably suffer if using leveldb
  • Make regionized leveldb the default
  • Add an easier way to export a world in a specific format from PM (instead of using convert-world.php which requires a src install)

Tests

TBD

The following new provider options are available as a result of this change:
- custom-leveldb-regions-32
- custom-leveldb-regions-64
- custom-leveldb-regions-128
- custom-leveldb-regions-256

Smaller sizes will likely be less space-efficient, but will also probably have better performance.
Once a sweet spot is found, a default will be introduced.

The smaller sizes produce such small files on average that the DB logs would probably take up a significant fraction of the world's footprint.
My gut instinct is that 128 will probably be the sweet spot, as on average it should sit well below the threshold for level 3 compactions, and most worlds would likely fit into a single DB.
256 is probably not worthwhile, but might be worth trying.
dktapps added the Category: Core, Status: Insufficiently Tested, Type: Enhancement, and Performance labels Dec 21, 2024

dktapps (Member, Author) commented Dec 23, 2024

Turns out 128x128 has a significantly larger performance benefit than I expected when converting worlds.

The average region size at 128x128 chunks is about 50 MB, so level 3 compactions are avoided in the vast majority of cases. (256x256 hovers at about 200 MB, which exceeds the threshold for level 3 compactions.)
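To see why roughly 50 MB and 200 MB land on different sides of that threshold, here is a rough model assuming stock LevelDB's level size targets (10 MB for level 1, multiplied by 10 for each deeper level). It ignores level 0, compression and in-flight compactions, and the LevelDB build used by PocketMine-MP may be tuned differently, so treat it as an approximation. The function names are hypothetical.

```python
MB = 1024 * 1024


def max_bytes_for_level(level: int) -> int:
    """Size target per level, assuming stock LevelDB: 10 MB at level 1, x10 per level."""
    if level < 1:
        raise ValueError("this simplified model starts at level 1")
    return 10 * MB * 10 ** (level - 1)


def deepest_level_needed(db_bytes: int) -> int:
    """Smallest level L such that levels 1..L together can hold db_bytes."""
    level = 1
    capacity = max_bytes_for_level(1)
    while db_bytes > capacity:
        level += 1
        capacity += max_bytes_for_level(level)
    return level


level_for_128 = deepest_level_needed(50 * MB)    # ~50 MB region
level_for_256 = deepest_level_needed(200 * MB)   # ~200 MB region
```

Under these assumptions a 50 MB region settles at level 2, while a 200 MB region spills into level 3.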

In particular, converting a 10 GB world to custom-leveldb-regions-128 from any format is about 2.5x faster than converting it to custom-leveldb-regions-256 (90-100 mins vs 4 hours).

dktapps (Member, Author) commented Dec 28, 2024

For posterity: I hesitate to keep going with this change, because I think LevelDB is just fundamentally unsuited to the task of storing worlds, regardless of whether a single DB is used or multiple.

While Mojang's usage of it also couldn't be any more suboptimal (about the worst possible cache locality on account of the poor key structure), LevelDB is also just not suited to this job.

Compactions are always going to be costly for storing blob data (usually a few KB per key in Bedrock), because keys and values are stored together. Values have to be copied during sorting, which greatly increases the I/O cost of compactions. RocksDB BlobDB would be a much better solution for this.
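A back-of-the-envelope sketch of the cost difference, under loudly stated assumptions: every entry in a compaction is rewritten once, keys are 8 bytes, values are 4 KB, and a key-value-separated store rewrites only the key plus a fixed-size pointer to the blob. The 16-byte pointer is an arbitrary placeholder, not BlobDB's actual on-disk format, and `bytes_rewritten` is a hypothetical helper.

```python
def bytes_rewritten(num_entries: int, key_size: int, value_size: int,
                    separate_values: bool, pointer_size: int = 16) -> int:
    """Bytes copied when a compaction rewrites num_entries entries once.

    With values inline, each entry costs key + value.
    With values stored separately, each entry costs key + pointer (assumed size).
    """
    per_entry = key_size + (pointer_size if separate_values else value_size)
    return num_entries * per_entry


inline = bytes_rewritten(10_000, key_size=8, value_size=4096, separate_values=False)
separated = bytes_rewritten(10_000, key_size=8, value_size=4096, separate_values=True)
ratio = inline / separated
```

With these illustrative numbers, inline storage copies about 41 MB per pass versus about 0.24 MB when values are kept out of the sorted files, a ratio of 171.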

Splitting up into many DBs reduces the amount of live data at any given time, meaning that only live DBs will get compaction work done. However, since compaction is all about sorting data (which doesn't really benefit Bedrock anyway since iteration & range scans are not really required beyond the scope of a single chunk), I'm questioning whether it even makes sense to use a KV-store at all.

I'm also hesitant to once again introduce a custom world format which is only supported by PocketMine-MP. The onus would be on PMMP to maintain support for it forever just like PMAnvil, except this would be rather more complicated.

If we do decide to introduce a custom format into mainline, we really need to get this right.

dktapps added the Opinions Wanted label Dec 28, 2024