Evaluate the cost of relocatable files in parallel filesystems #53810

With #49866 coming in v1.11, we changed the staleness check for cache files from using the `mtime` of the source files to computing their `_crc32c` hash. Because `_crc32c` needs to read the whole source file, it is likely more expensive to compute than querying the `mtime`.
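
To make the difference concrete, here is a minimal sketch of the two kinds of check. It uses the public `crc32c` from the `CRC32c` standard library as a stand-in for the internal `_crc32c`. The helper names `stale_by_mtime` and `stale_by_hash` are hypothetical and are not the functions used in `loading.jl`.

```julia
using CRC32c  # standard library; exports the public `crc32c`

# Hypothetical helpers for illustration only, not the code in Base.
# The mtime check needs a single stat call per file.
stale_by_mtime(path, recorded_mtime) = mtime(path) != recorded_mtime
# The hash check has to read the entire file content.
stale_by_hash(path, recorded_hash) = open(crc32c, path) != recorded_hash

# Create a 1 MB stand-in for a source file.
path, io = mktemp()
write(io, rand(UInt8, 10^6))
close(io)

# What a cache file would record at precompilation time.
recorded_mtime = mtime(path)
recorded_hash = open(crc32c, path)

# Call each check once so that compilation is not part of the timing.
stale_by_mtime(path, recorded_mtime)
stale_by_hash(path, recorded_hash)

@time stale_by_mtime(path, recorded_mtime)
@time stale_by_hash(path, recorded_hash)

# A copy at a different location has the same content, so the
# hash-based check still considers it fresh.
copy_path = cp(path, tempname())
@show stale_by_hash(copy_path, recorded_hash)
```

The hash depends only on the file content and not on filesystem metadata, which is what allows the check to survive moving the files. The timings here come from a local temporary directory and a warm page cache, so they say little about a parallel filesystem.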

So far no one seems to have reported regressions in loading times related to this change. If you do notice any, please link to this issue!

However, some people have pointed out that this change might noticeably affect loading times on parallel filesystems.
This issue tracks the status of that question. I will update the list below regularly.

---

- Related to this is a report from @sloede in the initial relocation issue, https://github.com/JuliaLang/julia/issues/47943#issuecomment-1399214984. The Trixi.jl folks already have a workflow set up that we can use for a benchmark. We are currently arranging a meeting to discuss how to evaluate this.

- Assuming the impact is not negligible, there is at least one way forward: @vchuravy suggested counteracting it by implementing a [content-addressable storage (CAS) system](https://en.wikipedia.org/wiki/Content_storage_management).

- In https://hackmd.io/@Je8OcLYBQr2ociLAtslIug/BJvv7G9pa I prepared a draft of a blog post that we want to publish, explaining how to use this relocation feature. Eventually, this post should be shared on Discourse, ideally together with an answer to whether relocation has an impact on parallel filesystems. Any feedback on the draft is very much appreciated!

- Tangentially related: https://github.com/JuliaLang/julia/issues/50166. Sysimages are on the order of 200 MB, at least for v1.8, v1.9, and v1.10. Note that the problem with parallel filesystems seems to be not the size of a single file but the sheer number of files.
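
As a possible starting point for such an evaluation, the sketch below compares the two checks on many small files and on one large file of the same total size. All names (`make_files`, `check_mtimes`, `check_hashes`) and the file counts and sizes are hypothetical choices for illustration. `crc32c` from the `CRC32c` standard library stands in for the internal `_crc32c`.

```julia
using CRC32c  # standard library; exports the public `crc32c`

# Hypothetical helper: write `n` files of `nbytes` random bytes into `dir`
# and return their paths.
function make_files(dir, prefix, n, nbytes)
    return map(1:n) do i
        path = joinpath(dir, "$(prefix)_$i.jl")
        write(path, rand(UInt8, nbytes))
        path
    end
end

check_mtimes(paths) = map(mtime, paths)                 # one stat per file
check_hashes(paths) = map(p -> open(crc32c, p), paths)  # reads every file fully

# Assumption: replace `mktempdir()` with a directory on the parallel
# filesystem under test; the default temporary directory is usually local.
dir = mktempdir()
many_small = make_files(dir, "small", 1_000, 1_000)   # 1000 files of 1 kB each
one_large = make_files(dir, "large", 1, 1_000_000)    # one file of 1 MB

for (label, paths) in (("1000 x 1 kB", many_small), ("1 x 1 MB", one_large))
    # Warm-up run: excludes compilation, but also fills the page cache,
    # so the numbers below are warm-cache timings.
    check_mtimes(paths)
    check_hashes(paths)
    t_mtime = @elapsed check_mtimes(paths)
    t_hash = @elapsed check_hashes(paths)
    println(label, ": mtime ", t_mtime, " s, crc32c ", t_hash, " s")
end
```

Both cases hash the same number of bytes, so any gap between them would come from the per-file overhead of opening and reading many files. A realistic measurement would also need cold caches and the actual set of source files of a package such as Trixi.jl.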
