
Enable compilation and caching of GRF file tables #365

Merged: 5 commits merged into main from 153-grf-metadata-caching on Feb 6, 2024

Conversation

rdw-software (Member) commented:

Before this change - no caching, so the TOC is fully decoded on every load:

  • About 2.1 seconds to load pay_dun00 (best case)
  • About 3.6 seconds to load schg_dun01 (worst case)
  • About 2.5 seconds to load prontera (average case)

After this change - on the first load (cache miss):

  • About 2.5 seconds to load pay_dun00 (best case)
  • About 3.9 seconds to load schg_dun01 (worst case)
  • About 2.8 seconds to load prontera (average case)

After this change - every load beyond the first is a cache hit:

  • About 1.1 seconds to load pay_dun00 (best case)
  • About 2.5 seconds to load schg_dun01 (worst case)
  • About 1.4 seconds to load prontera (average case)

Limitations:

  • Cache hit/miss detection is a simple file system lookup (keyed on the GRF file name), so name clashes are possible (see the sketch after this list)
  • The eviction policy is based on the cache entry's mtime, which is problematic (but probably fine, for now)
  • Probably doesn't work with Arcturus/iRO alpha PAKs or similar archives, though these aren't supported at this time anyway
  • The binary format is designed to be 100% backwards-compatible, at the cost of some memory and CPU time
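
For illustration, here is a minimal Python sketch of the name-based cache lookup described in the first limitation. All identifiers and the cache location are hypothetical, not taken from this PR:

```python
import os

CACHE_DIR = os.path.expanduser("~/.cache/grf-toc")  # hypothetical location

def cache_path_for(grf_path: str) -> str:
    # Key the cache entry on the archive's base name only,
    # e.g. "/games/ro/data.grf" -> "~/.cache/grf-toc/data.grf.toc"
    base_name = os.path.basename(grf_path)
    return os.path.join(CACHE_DIR, base_name + ".toc")

def is_cache_hit(grf_path: str) -> bool:
    # A plain file system lookup; nothing about the archive's contents is
    # hashed, which is why two different GRFs that share a file name would
    # clash in the cache.
    return os.path.exists(cache_path_for(grf_path))
```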

Resolves #153.

This isn't the most useful right now, but I can imagine wanting to add more analysis steps later on. At that point it should likely be moved to the FileAnalyzer module as well.

Right now it's sufficient to determine the path lengths, since that allowed tinkering with some optimizations that trade off space usage against decoding time: compiling the entries as fixed-size blocks can reduce the decoding time by 30 ms (from 70 ms), but disk usage increases by a lot, so I'll start with a different approach.
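
To make that trade-off concrete, the sketch below contrasts the two entry encodings in Python; the field layout and the MAX_PATH value are assumptions for illustration, not the PR's actual binary format:

```python
import struct

MAX_PATH = 256  # hypothetical fixed block size for the path field

def encode_variable(path: bytes, size: int, offset: int) -> bytes:
    # Length-prefixed path: compact on disk, but decoding has to read the
    # length first and then slice, which costs time for every entry
    return struct.pack("<H", len(path)) + path + struct.pack("<II", size, offset)

def encode_fixed(path: bytes, size: int, offset: int) -> bytes:
    # Fixed-size block: decodes with a single unpack per entry, but pads
    # every path to MAX_PATH bytes regardless of its actual length
    return struct.pack(f"<{MAX_PATH}sII", path, size, offset)
```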
@rdw-software force-pushed the 153-grf-metadata-caching branch from 651442d to 143c9f0 on February 6, 2024 at 13:50
The path normalization takes a lot of time, with no reasonable way to avoid it. So dumping the decoded form in a binary format seems like an easy way to cut 1 second or more from the loading time.
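
For context, this is the kind of per-entry work being cached; the exact rules below (CP949 decoding, slash conversion, lowercasing) are assumptions based on common GRF handling, not necessarily the project's code:

```python
def normalize_path(raw: bytes) -> str:
    # GRF archives traditionally store paths with backslashes in a Korean
    # legacy encoding; repeating this for every entry on every load adds up.
    text = raw.split(b"\x00", 1)[0].decode("cp949", errors="replace")
    return text.replace("\\", "/").lower()
```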
This can drastically reduce the loading times if the TOC has been compiled before.
All fetch requests should go through the unified resources API, so at the end of the day this should speed up the one major bottleneck that exists: decoding the table of contents.
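
Putting the pieces together, the load flow behind the resources API might look like the following sketch. Here decode_toc_from_archive is a placeholder for the existing decoder, and pickle stands in for the PR's custom binary format:

```python
import os
import pickle

def load_table_of_contents(grf_path: str):
    compiled = cache_path_for(grf_path)          # from the earlier sketch
    if is_cache_hit(grf_path):
        with open(compiled, "rb") as f:
            return pickle.load(f)                # fast path: binary load
    toc = decode_toc_from_archive(grf_path)      # slow path: full decode
    os.makedirs(os.path.dirname(compiled), exist_ok=True)
    with open(compiled, "wb") as f:
        pickle.dump(toc, f)                      # populate the cache
    return toc
```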
@rdw-software force-pushed the 153-grf-metadata-caching branch from 143c9f0 to aedadc4 on February 6, 2024 at 13:56
@rdw-software merged commit b7a3263 into main on Feb 6, 2024
6 checks passed
@rdw-software deleted the 153-grf-metadata-caching branch on February 6, 2024 at 14:12
Closes: Add caching of the GRF archive metadata to reduce startup times (#153)