Fix command graph generation bugs around reductions #223
Check-perf-impact results: (b003273516680ef3e6ca0110b3678f5e) ❓ No new benchmark data submitted. ❓
Good stuff!
clang-tidy made some suggestions
LGTM
0b1bf48 to 4c61835
4c61835 to 2cfd1c2
Check-perf-impact results: (dee217934841bf19e612d83adf4e7dfb)
Relative execution time per category: (mean of relative medians)
2cfd1c2 to 2f6687a
Check-perf-impact results: (d21ecac39af892ab1c227e6d0ae10ebf)
Relative execution time per category: (mean of relative medians)
I re-ran the benchmarks because there seemed to be significant jitter in the system benchmarks, but it appears that "benchmark independent task pattern with 100 tasks" is indeed slowing down, even though the change should not affect code without reductions.
@PeterTh discovered that the results of our multi-threaded benchmarks, especially the system benchmarks, are not as stable and reliable as we thought, and that our benchmarking setup needs some work. Barring some extremely obscure cause in the instruction cache, OS scheduling, or similar, I'm going to trust the command-graph benchmarks, which measure this change in isolation and show no change in performance.
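For readers unfamiliar with the "mean of relative medians" figure that the perf check reports per category, here is a minimal sketch of one plausible reading of it. The exact definition used by Check-perf-impact is not stated in this thread, so everything here is an assumption: the names `benchmark_runs`, `median` and `mean_of_relative_medians` are hypothetical, and "relative" is taken to mean the candidate median divided by the baseline median.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Median of a set of timing samples. The vector is taken by value because
// std::nth_element reorders its input.
double median(std::vector<double> samples) {
    if(samples.empty()) throw std::invalid_argument("median of empty sample set");
    const std::size_t mid = samples.size() / 2;
    std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
    const double upper = samples[mid];
    if(samples.size() % 2 == 1) return upper;
    // Even sample count: average the two middle elements. After nth_element,
    // everything before `mid` is <= upper, so the lower middle is the max of that range.
    const double lower = *std::max_element(samples.begin(), samples.begin() + mid);
    return (lower + upper) / 2.0;
}

// Hypothetical container: timings of one benchmark before and after a change.
struct benchmark_runs {
    std::vector<double> baseline_times;
    std::vector<double> candidate_times;
};

// ASSUMPTION: per benchmark, relative median = median(candidate) / median(baseline);
// the category value is the arithmetic mean of those ratios.
// A result above 1.0 would then indicate a slowdown.
double mean_of_relative_medians(const std::vector<benchmark_runs>& category) {
    if(category.empty()) throw std::invalid_argument("empty benchmark category");
    double sum = 0.0;
    for(const auto& runs : category) {
        sum += median(runs.candidate_times) / median(runs.baseline_times);
    }
    return sum / static_cast<double>(category.size());
}
```

Using medians makes each per-benchmark ratio robust against a few outlier runs, but it does not remove systematic jitter between two complete benchmark sessions, which is the kind of instability described above.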
Implementing IDAG reductions uncovered two bugs around reductions in distributed command graph generation:
I've added unit tests for both cases.