Draft on complete next-gen CDEF implementation (early WIP) #3016

Open
wants to merge 2 commits into master

BlueSwordM
Contributor

Hello again. It has been a while since I've made any relevant PRs to rav1e, so here goes.

Ever since a fateful meeting we had back in 2021, I've known that rav1e lacks a higher-quality, full CDEF implementation, and I've wanted it to get one ever since.

Following this issue made me think of it again today, which is why I started the work on it:
#2759

My plan for the full CDEF implementation has several steps:

1. Full CDEF strength selection implementation

First, add the full CDEF implementation based on distortion optimization, with the full set of filter strengths available (0-15 for primary; 0, 1, 2, or 4 for secondary, which are the values the AV1 bitstream can signal).
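
As a rough illustration of what an exhaustive strength search could look like, here is a minimal sketch. It is not rav1e's code: `full_strength_search` and the `dist` callback are hypothetical names, and `dist` stands in for "filter the block with these strengths and measure the distortion against the source". It assumes the four secondary strengths that AV1 can signal (0, 1, 2, 4).

```rust
/// Secondary strengths that AV1 can signal (the 2-bit value 3 maps to 4).
const SEC_STRENGTHS: [u8; 4] = [0, 1, 2, 4];

/// Try every (primary, secondary) strength pair and return the pair with the
/// lowest distortion, together with that distortion.
///
/// `dist` is a hypothetical callback: it is assumed to apply CDEF with the
/// given strengths and return the resulting distortion versus the source.
fn full_strength_search<F>(mut dist: F) -> (u8, u8, u64)
where
    F: FnMut(u8, u8) -> u64,
{
    let mut best = (0u8, 0u8, u64::MAX);
    for pri in 0..16u8 {
        for &sec in &SEC_STRENGTHS {
            let d = dist(pri, sec);
            // Strict comparison: on ties the weaker (earlier) filter wins.
            if d < best.2 {
                best = (pri, sec, d);
            }
        }
    }
    best
}
```

This evaluates 16 x 4 = 64 candidates per call, which is why the later steps talk about pruning the search space.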

2. Implement curve-based speed pruning

As quality increases (the quantizer decreases), the CDEF strength search space shrinks following a power-curve function.
As speed increases (from s0 all the way to s10), the search space shrinks more quickly, and at the highest speeds it is restricted to a fixed set from the start.

This would mainly be for the psychovisual tune. The PSNR tune would only get static search spaces based on speed features.
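
To make the pruning idea concrete, here is one possible shape for such a curve. Everything in it is an assumption for illustration: the function name `primary_candidates`, the exponent formula, and the speed cutoff of 9 are made up, not taken from this PR. It only covers the psychovisual-tune behaviour described above; a PSNR tune would use a fixed table per speed instead.

```rust
/// Number of primary-strength candidates to search (hypothetical heuristic).
///
/// `qidx` is the base quantizer index (0 = highest quality, 255 = lowest).
/// `speed` is the speed preset (0 = slowest, 10 = fastest).
fn primary_candidates(qidx: u8, speed: u8) -> usize {
    const FULL: f64 = 16.0; // primary strengths 0..=15

    // Assumed cutoff: the fastest presets get a single fixed candidate.
    if speed >= 9 {
        return 1;
    }

    // Assumed exponent: grows with speed, so the space shrinks faster
    // at higher speeds for the same quantizer.
    let exponent = 0.5 + 0.25 * f64::from(speed);

    // Normalized quantizer in [0, 1]; lower values mean higher quality.
    let q = f64::from(qidx) / 255.0;

    // Power curve: fewer candidates as quality increases.
    let n = (FULL * q.powf(exponent)).ceil() as usize;
    n.clamp(1, 16)
}
```

With this shape, the lowest-quality end keeps the full 16 candidates at slow speeds, while high-quality encodes and fast presets search only a handful.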

3. Implement an entirely different, superior CDEF_dist metric (long-term)

The current CDEF_dist metric, while still considerably better than what aomenc and SVT-AV1 use (MSE as the dist metric), isn't the best that is currently available. My future plan is to add a subset of ssimulacra2 as a superior but slower dist metric for the psychovisual tune, especially as it would penalize some classic encoder faults like excessive blurring and detail wiping. The repository can be found here:
https://github.com/cloudinary/ssimulacra2

Groundwork needed for that:

  1. Implementing a YUV > XYB and XYB > YUV crate for internal color conversion (it could perhaps replace YCbCr internally in rav1e, but that is a completely different scope...).

  2. Implementing everything ssimulacra2-related in Rust (the biggest difficulty).

  3. Adding some SIMD to make it as fast as possible.

4. Making CDEF faster

Simple as that: lower complexity, more SIMD for more architectures, etc.

That'll be all from me today.

@BlueSwordM BlueSwordM changed the title Preliminary work on complete next-gen CDEF implementation (early WIP) Draft on complete next-gen CDEF implementation (early WIP) Sep 8, 2022
@tmatth
Member

tmatth commented Sep 9, 2022

Also resurfacing this idea you brought up:

Furthermore, since CDEF can actually hurt fidelity when a lot of noise is present, a simple noise estimation algorithm could be used to disable CDEF filtering once enough noise reaches the threshold (also based on quantizer somewhat).

which definitely seems achievable if we can find a decent heuristic for "noise in source + quantizer will make CDEF harmful"
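
One candidate for the "simple noise estimation algorithm" is Immerkaer's fast noise estimator, which averages the absolute response of a 3x3 kernel that cancels flat and linearly varying image content. The sketch below is only an illustration: the thread does not name an estimator, and the gate `cdef_allowed` with its threshold numbers is a made-up placeholder for the "noise in source + quantizer" heuristic, not a tuned value.

```rust
/// Estimate the noise standard deviation of an 8-bit luma plane using
/// Immerkaer's method. Returns 0.0 for planes smaller than 3x3.
fn estimate_noise_sigma(plane: &[u8], width: usize, height: usize) -> f64 {
    assert_eq!(plane.len(), width * height);
    if width < 3 || height < 3 {
        return 0.0;
    }
    // Difference of two Laplacians: zero response on flat and linear content.
    const K: [[i32; 3]; 3] = [[1, -2, 1], [-2, 4, -2], [1, -2, 1]];
    let mut sum = 0u64;
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            let mut acc = 0i32;
            for (ky, row) in K.iter().enumerate() {
                for (kx, &k) in row.iter().enumerate() {
                    let px = plane[(y + ky - 1) * width + (x + kx - 1)];
                    acc += k * i32::from(px);
                }
            }
            sum += u64::from(acc.unsigned_abs());
        }
    }
    let n = ((width - 2) * (height - 2)) as f64;
    (std::f64::consts::PI / 2.0).sqrt() * sum as f64 / (6.0 * n)
}

/// Hypothetical gate: allow CDEF only while the estimated noise stays below
/// a threshold that loosens as the quantizer index rises.
/// The constants 2.0 and 6.0 are placeholders, not tuned values.
fn cdef_allowed(sigma: f64, qidx: u8) -> bool {
    let threshold = 2.0 + 6.0 * f64::from(qidx) / 255.0;
    sigma < threshold
}
```

Note that this estimator also responds to fine texture, so a real heuristic would probably need to separate texture from noise before trusting the number.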

@BlueSwordM
Contributor Author

Yes, and you just gave me an idea as well for further speedups and threading benefits.

If we could do noise estimation at a transparent tile level (no actual tiles being made, just for the analysis), we could get great threading on that end, since the noise estimation could easily be run in parallel. It would also allow finer CDEF control than the frame-level method.

Now, that would require my implementation to go through tiles, so let's leave that for the end :)
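
For illustration, a per-virtual-tile analysis pass could look roughly like the sketch below. It uses only the standard library (`std::thread::scope`) to stay self-contained; rav1e itself would more likely go through its existing thread pool. The names `region_noise` and `noise_per_virtual_tile` are hypothetical, and `region_noise` is a deliberately crude stand-in (mean absolute horizontal difference), not a proposed estimator.

```rust
use std::thread;

/// Stand-in noise measure for one region: mean absolute difference between
/// horizontally adjacent pixels. Any per-region estimator could replace it.
fn region_noise(plane: &[u8], stride: usize, x0: usize, y0: usize, w: usize, h: usize) -> f64 {
    if w < 2 || h == 0 {
        return 0.0;
    }
    let mut sum = 0u64;
    for y in y0..y0 + h {
        let start = y * stride + x0;
        for pair in plane[start..start + w].windows(2) {
            sum += u64::from(pair[0].abs_diff(pair[1]));
        }
    }
    sum as f64 / ((w - 1) * h) as f64
}

/// Split the plane into `tile` x `tile` virtual tiles (analysis only, no
/// bitstream tiles) and estimate noise for each, one thread per tile row.
/// Returns the per-tile values in row-major order.
fn noise_per_virtual_tile(plane: &[u8], width: usize, height: usize, tile: usize) -> Vec<f64> {
    assert!(tile > 0 && plane.len() == width * height);
    let cols = (width + tile - 1) / tile;
    let rows = (height + tile - 1) / tile;
    thread::scope(|s| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = (0..rows)
            .map(|ty| {
                s.spawn(move || {
                    (0..cols)
                        .map(|tx| {
                            let (x0, y0) = (tx * tile, ty * tile);
                            let w = tile.min(width - x0);
                            let h = tile.min(height - y0);
                            region_noise(plane, width, x0, y0, w, h)
                        })
                        .collect::<Vec<f64>>()
                })
            })
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect::<Vec<f64>>()
    })
}
```

The per-tile values could then drive a per-region CDEF on/off decision instead of a single frame-level one.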

@barrbrain barrbrain added the WorkInProgress Incomplete patchset label Dec 19, 2023