This pull request includes two simple compiler optimizations:
- `-O3 -ffast-math` in `CMakeLists.txt`
- `sqr(x) {return x * x}` function from `lib/utils.cpp`
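
For readers unfamiliar with why a helper this small matters: a function defined only in a `.cpp` file usually cannot be inlined into callers in other translation units (absent link-time optimization), so each call pays function-call overhead. The sketch below shows one common way to make such a helper inlinable, by defining it `inline` in a header. It is an illustration only, not necessarily the exact change in this PR; the `double` parameter type and the header placement are assumptions.

```cpp
// Hypothetical header-style definition (assumed type: double).
// Marking the function inline and placing it in a header lets every
// translation unit that includes it see the body and inline the multiply.
inline double sqr(double x) {
    return x * x;
}
```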
Testing
I've also added a build and benchmark script, so we can run w/:
Results
So using both optimizations reduces runtime of this example by ~87%.
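
Expressed as a speedup factor (taking the ~87% figure above at face value), that is roughly:

$$\text{speedup} = \frac{1}{1 - 0.87} \approx 7.7\times$$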
Notes
This is a PR to `master`, but the numbers and experiments above are based on comparison to commit `c07eeae9394ab30ca8d984b2ec2e40ab4c2d2e08`, per @manujinda's recommendation. I have not tested on `master`, but it should be easy for you to copy/paste these changes to whichever branch is currently under development.
Possible Future Work
After these optimizations, profiling via `callgrind` reveals that the majority of runtime is spent computing calls to `exp` in `KDE::pdf`. To get additional speedups, we'd need to make `exp` run faster. This may be possible, depending on how much precision is required, by e.g. using approximate `exp` functions like the ones described here.