-
Notifications
You must be signed in to change notification settings - Fork 255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use approximate entropy for WriterCounter #3106
base: master
Are you sure you want to change the base?
Conversation
This improves performance at a small cost to efficiency
@maj160 what improvements did you see in your benchmarking? Mine is only showing 1-2% speedup for this change. |
Codecov ReportBase: 86.80% // Head: 86.83% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #3106 +/- ##
==========================================
+ Coverage 86.80% 86.83% +0.03%
==========================================
Files 83 83
Lines 33330 33331 +1
==========================================
+ Hits 28931 28943 +12
+ Misses 4399 4388 -11
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
src/ec.rs
Outdated
* print([x for x in (stats())]) | ||
*/ | ||
// Units of 1/128 of a bit | ||
const ENTROPY_LOOKUP: [u16; 513] = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit like a subtle performance bug that has popped up before.
const ENTROPY_LOOKUP: [u16; 513] = [ | |
const ENTROPY_LOOKUP: &[u16; 513] = &[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow I had no idea that this could lead to performance issues. I'm really curious if there's any Github issue on Rustc talking about this.
Co-authored-by: David Michael Barr <[email protected]>
Original commit author here: The lookup table values can probably be improved - I think the correct value is log2 of the probability at higher levels, but at lower levels the EC_MIN_PROB factor comes into play and complicates things. The values I've put in this change are the best compromise I came up with while experimenting, but I wouldn't say they're quite "ready" yet. In the longer term it feels like this kind of optimisation is probably necessary to streamline RDO (e.g. using a fast approximate bit counter in some kind of pruning stage) Thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks,
I think we could move this under preset >8 or something like that over having this approximation for all the preset. I believe for Speed0 or placebo settings, we should avoid all possible approximations. But I am open to suggestions.
And the AWCY link shared above suggests there is a hit in performance across all the resolution in Obj1-Fast and there is no speedups, is that latest and updated one?
AWCY is generally not trustworthy for its performance calculations, there's too much noise on the servers. My local benchmarks with hyperfine showed a 1-2% overall improvement at speed 2. I can do another run with it at a higher speed like speed 8, it's possible the improvement is larger there. |
I was pointing towards bdrate, as there was loss in bdrate. So i am not
sure about that.
Just to be clear, this will give us approx 0.5-1.8% bdrate loss with approx
1-2% speedup?
|
Yes. As you mentioned, because of that I'd be hesitant about blanket applying it to all speed levels. |
This improves performance at a small cost to efficiency