Draft on complete next-gen CDEF implementation (early WIP) #3016
Hello again. It has been a while since I've made any relevant PRs to rav1e, so here goes.
Ever since a fateful meeting we had back in 2021, I have known that rav1e lacks a higher-quality full CDEF implementation, and I've always wanted rav1e to get one.
Following this issue made me think of it again today, which is why I started the work on it:
#2759
My plan for the full CDEF implementation follows several steps:
1. Full CDEF strength selection implementation
First, add the full CDEF implementation based on distortion optimization, with the full set of filter strengths available (0-15 for primary, 0-4 for secondary, of which AV1 can only signal 0, 1, 2 and 4).
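To make step 1 concrete, here is a minimal sketch of an exhaustive strength search. It is not the code in this PR: `full_cdef_search` and the `dist` closure are hypothetical names, and `dist` stands in for whatever CDEF distortion measurement the encoder provides for a given (primary, secondary) pair.

```rust
/// Secondary strengths that AV1 can signal (3 is not representable).
const SEC_STRENGTHS: [u8; 4] = [0, 1, 2, 4];

/// Hypothetical exhaustive search: tries every (primary, secondary) pair
/// and returns the one with the lowest distortion.
/// `dist` is an assumed callback, not an existing rav1e function.
fn full_cdef_search<F: Fn(u8, u8) -> u64>(dist: F) -> (u8, u8) {
    let mut best = (0u8, 0u8);
    let mut best_dist = dist(0, 0);
    for pri in 0..=15u8 {
        for &sec in SEC_STRENGTHS.iter() {
            let d = dist(pri, sec);
            // Strict comparison keeps the weakest filter on ties.
            if d < best_dist {
                best_dist = d;
                best = (pri, sec);
            }
        }
    }
    best
}
```

This evaluates 16 x 4 = 64 candidates per filter block, which is why the pruning in step 2 matters.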
2. Implement curve-based speed pruning
As quality increases (quantizer decreases), the CDEF strength search space shrinks following a power curve function.
As speed increases (s0 all the way to s10), the CDEF strength search space shrinks more quickly, and at the highest speeds it is restricted entirely from the start.
This would mainly be for the psychovisual tune. The PSNR tune would only get static search spaces based on speed features.
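As a rough illustration of the pruning idea for the psychovisual tune, the sketch below maps a quantizer index and speed preset to the number of primary strengths to search. Everything in it is an assumption made for illustration: the name `pruned_primary_count`, the exponent formula, and the speed 9 cutoff are placeholders, not tuned values.

```rust
/// Hypothetical pruning curve: how many primary strengths (1..=16) to search.
/// `qidx` is the quantizer index (0 = highest quality, 255 = lowest),
/// `speed` is the speed preset (0..=10).
fn pruned_primary_count(qidx: u8, speed: u8) -> usize {
    const FULL: f64 = 16.0;
    // Assumption: the fastest presets skip the search and keep one candidate.
    if speed >= 9 {
        return 1;
    }
    // Normalized quantizer: 0.0 at the highest quality, 1.0 at the lowest.
    let q = qidx as f64 / 255.0;
    // Assumption: a larger exponent at higher speeds prunes sooner.
    let exponent = 1.0 + speed as f64 * 0.5;
    let count = (FULL * q.powf(exponent)).ceil() as usize;
    count.clamp(1, 16)
}
```

The PSNR tune would not use a curve like this; per the plan above it would pick a fixed search space per speed level.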
3. Implement an entirely different, superior CDEF_dist metric (long-term)
The current CDEF_dist metric, while still considerably better than what aomenc and SVT-AV1 use (MSE as the dist metric), isn't the best that is currently available. My future plan is to add a subset of ssimulacra2 as a superior but slower dist metric for the psychovisual tune, especially as it would penalize some classic encoder faults like excessive blurring and detail wiping. The repository can be found here:
https://github.com/cloudinary/ssimulacra2
Plumbing work for that:
Implementing a YUV > XYB and XYB > YUV crate for internal color conversion (could perhaps be used to replace YCbCr internally in rav1e, but that's a completely different scope...)
Implementing everything ssimulacra2 related in Rust(biggest difficulty).
Adding some SIMD to make it as abominably fast as possible.
4. Making CDEF faster
Simple as that: lower complexity, more SIMD for more architectures, etc.
That'll be all from me today.