Opening speed and initial caching #17
Initial performance results with the current b-tree caching scheme (on variable instantiation in pyfive):
[image: timing results]
(This code deliberately runs the tests from memory, not from disk, because file caching is hard to control for. Small differences in time will matter in practice.)
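One stdlib-only way to get comparable numbers is to separate the first (cold) call from subsequent warm calls, since the first call pays any one-off cost such as b-tree parsing. This is a sketch; the `bench` helper is hypothetical and not part of pyfive.

```python
import time

def bench(fn, repeats=5):
    """Call fn() `repeats` times and return (first, best_of_rest).

    The first call pays any one-off cost (e.g. building a b-tree
    cache); the best of the remaining calls shows warm behaviour.
    """
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return times[0], min(times[1:])
```

In practice `fn` would wrap opening the file from an in-memory buffer and instantiating a variable, so disk caching stays out of the measurement.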
Some more figures, now including S3 access. There we will also have to think a bit about the influence of caching to be sure of what we are seeing, but for now:
[image: timing results including S3 access]
Ah, yes, well that really wasn't fair, because pyfive came second and got the benefit of caching, so here's some fairer data which avoids reusing cached data:
[image: timing results with cold caches]
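To avoid that unfairness, each timed run can rebuild its reader from scratch so no run reuses caches built by an earlier one. A minimal sketch, assuming a `make_reader` factory (hypothetical, not a pyfive API) that returns a fresh zero-argument read callable:

```python
import time

def cold_runs(make_reader, n=3):
    """Time n independent runs, constructing the reader afresh each
    time so no run reuses caches built by an earlier one."""
    samples = []
    for _ in range(n):
        read = make_reader()  # fresh object: empty b-tree/chunk caches
        t0 = time.perf_counter()
        read()
        samples.append(time.perf_counter() - t0)
    return samples
```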
These results are somewhat perplexing, though. Looking at the S3 data (the first file is simple and small, the second complex and bigger), and recognising that this calculation requires all the data to move across home broadband, we see that the different lazy-loading strategies affect either the opening or the variable instantiation, but the scatter-gun lookup of information in the HDF5 file behaves strangely for the complex file. We should try a nicely packed version of it as well!
The expected advantage here is that the pyfive library is completely thread-safe, so we can do what we like in parallel with it. The next step is to see whether that is a real advantage or not.
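If the library really is thread-safe, variable reads can be dispatched concurrently. A sketch of the pattern (the `read_var` callable is a stand-in for whatever actually fetches a variable; nothing here is pyfive API):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_read(read_var, names, workers=4):
    """Read several variables concurrently.

    read_var(name) must be safe to call from multiple threads at
    once -- the property pyfive is claimed to provide.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(names, pool.map(read_var, names)))
```

Whether this wins in practice depends on whether the time goes to network/decode work that can overlap, or to the GIL-bound Python parsing.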
This supersedes #3 and is directly intended to address h5netcdf performance issues.
We want to make sure we don't spend too much time instantiating file instances, and that we get the timing right for reading properties, including b-trees. Will report information here.
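To see where instantiation time actually goes (property reads, b-tree walks, and so on), the stdlib profiler can wrap the call under test. A sketch; `profile_call` is a hypothetical helper, not part of pyfive:

```python
import cProfile
import io
import pstats

def profile_call(fn, top=10):
    """Profile fn() and return a report of the `top` entries by
    cumulative time -- useful for spotting time sunk into b-tree
    and property reads during file instantiation."""
    pr = cProfile.Profile()
    pr.enable()
    fn()
    pr.disable()
    buf = io.StringIO()
    pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(top)
    return buf.getvalue()
```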