-
Notifications
You must be signed in to change notification settings - Fork 149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Add extensive_hints
feature to prevent performance regression for the common use-case
#1503
Conversation
load_program
feature to prevent performance regression for the broader use caseextensive_hints
feature to prevent performance regression for the common use-case
extensive_hints
feature to prevent performance regression for the common use-caseextensive_hints
feature to prevent performance regression for the common use-case
Benchmark Results for unmodified programs 🚀
|
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1503 +/- ##
=======================================
Coverage 96.82% 96.82%
=======================================
Files 96 96
Lines 39684 39744 +60
=======================================
+ Hits 38423 38482 +59
- Misses 1261 1262 +1 ☔ View full report in Codecov by Sentry. |
I suggest we first try to see if we can alleviate the hit with a faster hash. It may not be as fast as the array approach, but it will be significantly easier to maintain than having two separate cases and could come quite close. We can also make the input to the hash much smaller by simply converting the impl Hash for Relocatable {
fn hash<H: Hasher>(&self, state: &mut H) {
(self.segment_index.unsigned_abs() as u64 << 48 | self.offset as u64).hash(state);
}
} This halves the input bytes to the hash for nearly zero effort, which is typically about linear in CPU time. I used |
Just tried the suggestions in my previous comment on the draft PR #1505. The speedup is quite similar with ahash. I can clean that one up so we don't add the extra feature flag, it should Just Work (TM) for both cases. |
Gates behaviour added by #1491 under
extensive_hints
feature to prevent performance regressions on most use cases