Implement Bulk Loading/Writing for Chunks #535
base: master
Conversation
…ment and use exception to know if file exist
…improved usability
… and adjust file truncation logic
… in LinearFile and tmp filenames on write
… management logic
Also, when flying at max speed (which you can set using Mod Menu), the chunks lag behind; it seems like it first tries to load the old ones. It would make more sense to just cancel the old chunk-loading "job" and load the new ones. 2025-03-01.16-20-58.mp4
Also, when quitting with CTRL+C, it crashes.
@Snowiiii that's a little hard, because you don't know which chunks from the batch are still needed:
for that, we would have to fetch all of the chunks the player can see but send only the needed ones. Otherwise, cancelling a job could cancel chunks that will still be in your field of view.
That issue is "known"; I will push a change for that now.
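A minimal sketch of the cancel-the-old-job idea discussed above, using only std (all names here are hypothetical; the actual server uses tokio tasks and its own chunk types): a shared flag lets a newer request stop an in-flight batch early instead of loading stale chunks.

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical chunk position type; the real code uses pumpkin_util's Vector2.
type ChunkPos = (i32, i32);

/// A handle that lets the caller cancel an in-flight chunk-loading job.
struct LoadJob {
    cancelled: Arc<AtomicBool>,
}

impl LoadJob {
    fn cancel(&self) {
        self.cancelled.store(true, Ordering::Relaxed);
    }
}

/// Load chunks, skipping the rest of the batch once the job is cancelled.
fn load_chunks(positions: &[ChunkPos], cancelled: &AtomicBool, out: &mut Vec<ChunkPos>) {
    for &pos in positions {
        if cancelled.load(Ordering::Relaxed) {
            break; // a newer job superseded this one; stop doing stale work
        }
        out.push(pos); // stand-in for the real disk read / generation
    }
}

fn main() {
    let job = LoadJob { cancelled: Arc::new(AtomicBool::new(false)) };
    let mut loaded = Vec::new();
    // Simulate the player moving away: cancel before this batch runs.
    job.cancel();
    load_chunks(&[(0, 0), (0, 1)], &job.cancelled, &mut loaded);
    assert!(loaded.is_empty()); // the cancelled job did no work
    println!("loaded {} chunks", loaded.len());
}
```

In the real async setting, the same role is usually played by a cancellation token checked between chunk loads, so a superseded job stops cheaply at the next chunk boundary.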
…ement during closure
And yes, I used a pre-generated world.
…mory leak in the level cache cleanup
Can someone maybe create a benchmark of how long it takes to load/save a chunk/chunk group on master? Then we can compare this branch with master.
More CPU usage is better (as long as it's not inefficient), because that means we are doing more work in parallel.
You need to create a new Level from the root folder for each test, or else it will cache. Also, why are you sorting stuff? Also, you uploaded the test folder; maybe add it to the gitignore?
Also, you added the temp dir... read the comment right above it.
This is because the return order from the fetch is random, so it's possible, if you only retrieve a part of all of the generated chunks, that the ones we use in the benchmark are grouped (for example, all of them being part of the same file). About the cache thing, that's the reason I use the… Anyway, I will fix the temp dirs and the test folder, thanks; this actually explains some things I was seeing in my experiments <3.
Clean memory only happens once, if I'm reading it correctly. Also, level initialization has no overhead.
Sorry, that's true, I misunderstood something in the previous code: the level initialization has almost zero overhead, and also doing the clean_memory is not enough to avoid some type of caching.
…ion avoiding caching
These are the results with the updated/fixed benchmark (and running the equivalent one on master).
Master bench code:
use std::{fs, num::NonZeroU8, path::PathBuf, sync::Arc};
use criterion::{BenchmarkId, Criterion, criterion_group, criterion_main};
use pumpkin_util::math::vector2::Vector2;
use pumpkin_world::{
GlobalProtoNoiseRouter, GlobalRandomConfig, NOISE_ROUTER_ASTS, bench_create_and_populate_noise,
chunk::ChunkData, cylindrical_chunk_iterator::Cylindrical, global_path, level::Level,
};
use tokio::sync::RwLock;
fn bench_populate_noise(c: &mut Criterion) {
let seed = 0;
let random_config = GlobalRandomConfig::new(seed);
let base_router =
GlobalProtoNoiseRouter::generate(NOISE_ROUTER_ASTS.overworld(), &random_config);
c.bench_function("overworld noise", |b| {
b.iter(|| bench_create_and_populate_noise(&base_router, &random_config));
});
}
async fn test_reads(root_dir: PathBuf, positions: Vec<Vector2<i32>>) {
let level = Level::from_root_folder(root_dir);
let (send, mut recv) = tokio::sync::mpsc::channel(positions.len());
let rt = tokio::runtime::Handle::current();
level.fetch_chunks(&positions, send, &rt);
while let Some(x) = recv.recv().await {
// Don't compile me away!
let _ = x;
}
}
async fn test_writes(root_dir: PathBuf, chunks: Vec<(Vector2<i32>, Arc<RwLock<ChunkData>>)>) {
let level = Level::from_root_folder(root_dir);
for item in chunks {
level.write_chunk(item).await;
}
}
// Depends on config options from `./config`
fn bench_chunk_io(c: &mut Criterion) {
// System temp dirs are in-memory, so we can't use temp_dir
let root_dir = global_path!("./bench_root_tmp");
let _ = fs::remove_dir_all(&root_dir); // delete if it exists
fs::create_dir(&root_dir).unwrap(); // create the directory
let async_handler = tokio::runtime::Builder::new_current_thread()
.build()
.unwrap();
println!("Initializing data...");
// Initial writes
let mut chunks = Vec::new();
let mut positions = Vec::new();
async_handler.block_on(async {
let (send, mut recv) = tokio::sync::mpsc::channel(10);
// Our data dir is empty, so we're generating new chunks here
let level = Level::from_root_folder(root_dir.clone());
tokio::spawn(async move {
let cylindrical = Cylindrical::new(Vector2::new(0, 0), NonZeroU8::new(32).unwrap());
let chunk_positions = cylindrical.all_chunks_within();
let rt = tokio::runtime::Handle::current();
level.fetch_chunks(&chunk_positions, send, &rt);
level.clean_chunks(&chunk_positions).await;
});
while let Some((chunk, _)) = recv.recv().await {
let pos = chunk.read().await.position;
chunks.push((pos, chunk));
positions.push(pos);
}
});
// Sort by distance from origin to ensure a fair selection
// when using a subset of the total chunks for the benchmarks
chunks.sort_unstable_by_key(|chunk| chunk.0.x * chunk.0.x + chunk.0.z * chunk.0.z);
positions.sort_unstable_by_key(|pos| pos.x * pos.x + pos.z * pos.z);
// These test worst case: no caching done by `Level`
// testing with 16, 64, 256 chunks
let mut write_group = c.benchmark_group("write_chunks");
for n_chunks in [16, 64, 256] {
let chunks = &chunks[..n_chunks];
println!("Testing with {} chunks", n_chunks);
write_group.bench_with_input(
BenchmarkId::from_parameter(n_chunks),
&chunks,
|b, chunks| {
b.to_async(&async_handler)
.iter(|| test_writes(root_dir.clone(), chunks.to_vec()))
},
);
}
write_group.finish();
// These test worst case: no caching done by `Level`
// testing with 16, 64, 256 chunks
let mut read_group = c.benchmark_group("read_chunks");
for n_chunks in [16, 64, 256] {
println!("Testing with {} chunks", n_chunks);
let positions = &positions[..n_chunks];
read_group.bench_with_input(
BenchmarkId::from_parameter(n_chunks),
&positions,
|b, positions| {
b.to_async(&async_handler)
.iter(|| test_reads(root_dir.clone(), positions.to_vec()))
},
);
}
read_group.finish();
fs::remove_dir_all(&root_dir).unwrap(); // cleanup
}
criterion_group!(benches, bench_populate_noise, bench_chunk_io);
criterion_main!(benches);
This PR implements the bulk API for chunk loading/writing, with some function changes that leverage the bulk operations (such as reducing allocations, locks, and more).
It also includes some Anvil refactoring and the creation of traits/an API that can be reused for the entity storage system and to implement different writers/loaders (such as a DB-cached one, one that uses cloud storage, or further optimizations).
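To illustrate the kind of pluggable trait this describes, here is a minimal synchronous sketch (the trait and type names are hypothetical simplifications; the PR's real interface is async and file-backed): a bulk interface lets each backend group chunks by file and touch every file once per batch.

```rust
use std::collections::HashMap;

// Hypothetical simplified types; the real PR works with Anvil/Linear
// region files and async I/O.
type ChunkPos = (i32, i32);
type ChunkData = Vec<u8>;

/// Sketch of a bulk chunk I/O trait: one call loads or saves a whole batch,
/// so the backend can group chunks by file and write each file only once.
trait ChunkIo {
    fn load_chunks(&self, positions: &[ChunkPos]) -> Vec<(ChunkPos, Option<ChunkData>)>;
    fn save_chunks(&mut self, chunks: &[(ChunkPos, ChunkData)]);
}

/// A trivial in-memory backend, standing in for the file-based managers.
struct MemoryStore {
    chunks: HashMap<ChunkPos, ChunkData>,
}

impl ChunkIo for MemoryStore {
    fn load_chunks(&self, positions: &[ChunkPos]) -> Vec<(ChunkPos, Option<ChunkData>)> {
        positions
            .iter()
            .map(|pos| (*pos, self.chunks.get(pos).cloned()))
            .collect()
    }

    fn save_chunks(&mut self, chunks: &[(ChunkPos, ChunkData)]) {
        for (pos, data) in chunks {
            self.chunks.insert(*pos, data.clone());
        }
    }
}

fn main() {
    let mut store = MemoryStore { chunks: HashMap::new() };
    store.save_chunks(&[((0, 0), vec![1, 2, 3])]);
    let loaded = store.load_chunks(&[(0, 0), (5, 5)]);
    assert_eq!(loaded[0].1.as_deref(), Some(&[1u8, 2, 3][..]));
    assert!(loaded[1].1.is_none()); // never generated: the "missing" case
}
```

Returning `Option` per position mirrors the idea that a bulk load must report loaded, missing, and (in the real API) errored chunks individually, since the batch is all-or-nothing only at the file level.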
Actual Improvements
- Don't write a chunk to disk when it was just loaded from disk.
- Lock the chunk cache entry while generating a new chunk (if more threads want that same chunk, they can wait without doing any work until the chunk is generated).
- Every time a bunch of chunks (of the same player) are deallocated, the file is written only once.
- Files that have ongoing write/read operations are kept in memory (to avoid disk reads and file locking).
- A DashMap is used as a file cache to avoid write collisions and file corruption (usually not an Anvil issue, but it was a problem with the Linear format, which enforces a full-file write).
- Chunk write operations are now done on a path.tmp file, so a write can no longer corrupt files (when the write finishes, the file is renamed, so the file is never swapped in before the full write completes).
- The file lock/cache now implements a clean_cache fn that ensures no memory leaks over the server's run.
Actual Changes
- LoadedData::Error((position, Err)): handles errors per chunk loaded in the bulk.
- LoadedData::Missing(position): handles missing chunks (not previously generated), because the order of the returned chunks is not enforced.
- LoadedData::Loaded(Data): handles normally returned chunks.
- ChunkFileManager: the actual ChunkIO implementation for Anvil and Linear.
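The path.tmp write scheme from the improvements list can be sketched with std alone: write the bytes to a sibling .tmp file, sync it, then rename it over the real path, so readers never observe a partially written file. The helper name below is hypothetical; this is only a sketch of the scheme, not the PR's code.

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::{Path, PathBuf};

/// Write `data` to `path` atomically: write a sibling `<path>.tmp` first and
/// rename it into place, so a crash mid-write never leaves a half-written
/// final file. (Hypothetical helper illustrating the tmp-file scheme.)
fn write_atomically(path: &Path, data: &[u8]) -> std::io::Result<()> {
    // Build "<path>.tmp" next to the target (same filesystem, so the
    // rename below replaces the target in a single step).
    let mut tmp_os = path.as_os_str().to_os_string();
    tmp_os.push(".tmp");
    let tmp_path = PathBuf::from(tmp_os);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(data)?;
    tmp.sync_all()?; // flush to disk before swapping the file into place
    drop(tmp);

    // The final path only ever holds a fully written file.
    fs::rename(&tmp_path, path)
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("bulk_write_demo");
    fs::create_dir_all(&dir)?;
    let target = dir.join("r.0.0.linear"); // example region-file name
    write_atomically(&target, b"chunk bytes")?;
    assert_eq!(fs::read(&target)?, b"chunk bytes");
    fs::remove_dir_all(&dir)?;
    Ok(())
}
```

This matters most for full-file formats like Linear: an interrupted in-place write would corrupt the whole file, while an interrupted tmp write leaves the previous version intact.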