Evaluate the cost of relocatable files in parallel filesystems #53810

Open
fatteneder opened this issue Mar 21, 2024 · 5 comments
Labels
filesystem Underlying file system and functions that use it

Comments

@fatteneder
Member

fatteneder commented Mar 21, 2024

With #49866 coming in v1.11 we changed the staleness check for cachefiles from using the mtime of the source files to computing their _crc32 hash. Because _crc32 needs to read the whole source file, it is likely more expensive to compute than querying the mtime.
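
To make the cost difference concrete, here is a minimal Python sketch of the two kinds of staleness check. It is only an illustration under stated assumptions: `zlib.crc32` stands in for Julia's internal `_crc32` routine, and the names `file_crc32`, `is_stale_mtime`, and `is_stale_crc` are hypothetical, not part of Julia's loading code.

```python
import os
import zlib

def file_crc32(path, chunk_size=1 << 20):
    """Checksum a file by reading it in chunks; the cost grows with file size."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

def is_stale_mtime(path, recorded_mtime):
    """Metadata-only check: one stat call, no file contents are read."""
    return os.stat(path).st_mtime != recorded_mtime

def is_stale_crc(path, recorded_crc):
    """Content check: has to open and read the whole file."""
    return file_crc32(path) != recorded_crc
```

The content check does not depend on timestamps, so it still passes after the files are copied to another location with new mtimes. The price is one open and one full read per source file.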

So far no one seems to have reported regressions in loading times related to this change. If you do notice any, please link to this issue!

However, some have pointed out that this change might noticeably affect loading times on parallel filesystems.
This issue tracks the status of that problem. I will update the list below regularly.


@fatteneder fatteneder added the filesystem Underlying file system and functions that use it label Mar 21, 2024
@sloede
Contributor

sloede commented Mar 23, 2024

Thank you very much for taking this up. One possible solution I see is to not cache source files at the time of loading, but at the time of modification for all non-dev'd packages. This would require a strong assumption that package files are not modified except through Pkg.jl operations. But then staleness could be verified by loading a single file with the precomputed crc32 hashes and comparing them against the pkgimage files' crc32s.
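
A rough Python sketch of this idea, assuming a JSON manifest written once at install time. The helpers `write_manifest` and `is_cache_fresh` and the manifest layout are hypothetical, and `zlib.crc32` again stands in for Julia's checksum routine.

```python
import json
import os
import zlib

def file_crc32(path):
    """Checksum of the whole file contents."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read())

def write_manifest(pkg_dir, manifest_path):
    """Record a checksum for every file under pkg_dir (run once, at install time).

    manifest_path should live outside pkg_dir so the manifest does not hash itself.
    """
    hashes = {}
    for root, _dirs, files in os.walk(pkg_dir):
        for name in files:
            full = os.path.join(root, name)
            hashes[os.path.relpath(full, pkg_dir)] = file_crc32(full)
    with open(manifest_path, "w") as f:
        json.dump(hashes, f)

def is_cache_fresh(manifest_path, cached_hashes):
    """Load-time check: read one manifest file instead of every source file."""
    with open(manifest_path) as f:
        return json.load(f) == cached_hashes
```

Here `cached_hashes` plays the role of the checksums stored in the pkgimage. The check is only as trustworthy as the assumption that nothing modifies the package files after the manifest is written.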

@fatteneder
Member Author

fatteneder commented Mar 23, 2024

I think your proposal is close to a custom content-addressable storage (CAS) system.

Regarding the restriction:
At least on my system all package code inside ~/.julia/packages/... is already read-only. However, I vaguely remember having seen an issue about this not being the case on all platforms, but I can't find it at the moment. Anyway, I think this restriction makes sense, because the source files are released code that can be frozen.

Regarding keeping hashes in a separate file:
This requires perfect synchronization with the sources, and is susceptible to cases where someone (for some reason) swaps out source files.
Another solution would work with a virtual filesystem that stores the hash alongside the inode data. This would solve the synchronization issue, but it also requires more work for support on all platforms.
And yet another solution could use the LLVM CAS library.

These thoughts raise the questions:

  • How foolproof does the solution have to be?
  • Is it enough to put out a disclaimer saying that altering the ~/.julia/packages folder is done at the user's own risk and may cause malfunctions?
  • How likely is it that people tinker with this data?

@Sbozzolo
Contributor

Sbozzolo commented Apr 2, 2024

I recently started looking at the cost of depot operations for CliMA on the distributed filesystem of our cluster. Given the large number of (small) files involved, such operations are already relatively expensive. Is there a small test I can run to benchmark the impact of this change?

@fatteneder
Member Author

> Is there a small test I can run to benchmark the impact of this change?

Not yet. We are still trying to schedule a meeting, but some people are on holiday at the moment.
If you would like to join, ping me on Slack.

Otherwise, I think we will provide some benchmark code + reference results here for posterity.
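
In the meantime, a rough way to get a feel for the numbers is to time a metadata-only pass against a full-read checksum pass over a directory tree. The Python sketch below is not the benchmark mentioned above. The function `time_checks` is hypothetical, `zlib.crc32` stands in for Julia's checksum routine, and the results will depend heavily on filesystem caching.

```python
import os
import time
import zlib

def time_checks(root):
    """Return (n_files, seconds_for_stat, seconds_for_crc32) for files under root."""
    paths = [
        os.path.join(d, name)
        for d, _dirs, files in os.walk(root)
        for name in files
        if os.path.isfile(os.path.join(d, name))  # skip broken symlinks
    ]

    # Pass 1: metadata only, as an mtime-based check would do.
    t0 = time.perf_counter()
    for p in paths:
        os.stat(p)
    t_stat = time.perf_counter() - t0

    # Pass 2: open and read every file, as a checksum-based check would do.
    t0 = time.perf_counter()
    for p in paths:
        with open(p, "rb") as f:
            zlib.crc32(f.read())
    t_crc = time.perf_counter() - t0

    return len(paths), t_stat, t_crc
```

For example, `time_checks(os.path.expanduser("~/.julia/packages"))` would cover a depot's package sources. The directory walk already warms the metadata cache, so the first pass is probably optimistic compared with a cold start on a parallel filesystem.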

@Sbozzolo
Contributor

Sbozzolo commented Apr 3, 2024

> Otherwise, I think we will provide some benchmark code + reference results here for posterity.

Thanks! Looking forward to it!

3 participants