[fb] optimize dithering for WHITE and BLACK #73

Merged on Feb 1, 2021 (1 commit)

mrichards42
Collaborator

I've noticed that some of the remarkable_puzzle animations are super slow, so I've been doing some profiling. draw_rect and whichever dithering function I use together take up about 90% of the time according to gprof, so I'm focusing on those :)

One pretty simple optimization is skipping dithering for WHITE and BLACK colors -- the dithering matrices are normalized so that they add or subtract less than 50% of the difference between two colors (so, e.g. in 2-color dithering 0.0 is shifted to between -0.49 and 0.49, which rounds back to 0.0 in all cases).
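The rounding argument above can be checked with a small sketch. The 2x2 offset table and the function names below are hypothetical stand-ins (a textbook Bayer matrix normalized to stay inside (-0.5, 0.5)), not the values or signatures rmkit uses:

```cpp
// Hypothetical 2x2 Bayer offsets, normalized so every entry lies strictly
// inside (-0.5, 0.5). Illustrative values only, not taken from rmkit.
static const float kBayer2[2][2] = {
    {-0.375f,  0.125f},
    { 0.375f, -0.125f},
};

// Quantize a gray level in [0, 1] to 0 (black) or 1 (white) using
// ordered dithering: shift by the per-pixel offset, then round.
inline int dither_2color(int x, int y, float gray) {
    float shifted = gray + kBayer2[y & 1][x & 1];
    return shifted >= 0.5f ? 1 : 0;
}

// True if every position in the matrix maps `gray` to the same level,
// i.e. dithering cannot change this input.
inline bool is_stable(float gray) {
    int first = dither_2color(0, 0, gray);
    for (int y = 0; y < 2; y++)
        for (int x = 0; x < 2; x++)
            if (dither_2color(x, y, gray) != first) return false;
    return true;
}
```

With these offsets, 0.0 and 1.0 are stable at every pixel position, while a mid-gray such as 0.5 is not, which is why only the two extreme colors can safely skip the dithering step.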

This change gives me between a 2x and 5x speedup in dithering, depending on how much gray is being drawn:

A sample of lightup, which uses a lot of gray:
# before
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 45.86      1.22     1.22 24576951     0.00     0.00  framebuffer::DITHER::BAYER_2(int, int, unsigned short)
 39.47      2.27     1.05  1253836     0.00     0.00  framebuffer::FB::draw_rect(int, int, int, int, int, int, float)

# after
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 54.08      1.06     1.06  1259815     0.00     0.00  framebuffer::FB::draw_rect(int, int, int, int, int, int, float)
 31.12      1.67     0.61 27939892     0.00     0.00  framebuffer::DITHER::BAYER_2(int, int, unsigned short)
A sample of untangle, which is almost all black and white:
# before
  %   cumulative   self               self     total           
 time   seconds   seconds     calls  ms/call  ms/call  name    
 61.84      7.34     7.34 165550334     0.00     0.00  framebuffer::DITHER::BAYER_2(int, int, unsigned short)
 35.89     11.60     4.26   2498395     0.00     0.00  framebuffer::FB::draw_rect(int, int, int, int, int, int, float)

# after
  %   cumulative   self               self     total           
 time   seconds   seconds     calls  ms/call  ms/call  name    
 78.94      5.51     5.51   3671557     0.00     0.00  framebuffer::FB::draw_rect(int, int, int, int, int, int, float)
 16.48      6.66     1.15 240161310     0.00     0.00  framebuffer::DITHER::BAYER_2(int, int, unsigned short)

I'd also be happy to add these as separate dithering modes (e.g. BAYER_BW_2 or something), in case we want to let users opt in to this behavior. If you were rendering a grayscale bitmap, for instance, this is probably unhelpful, and just adds 2 extra comparisons per pixel.

These colors should never change due to dithering, and this skips the
costly conversion from remarkable_color to float and back.
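
A minimal sketch of what such an early return might look like. This is not the code from the commit: `color_t`, the RGB565 encoding of black and white, `to_gray`, and both function names are assumptions made for illustration.

```cpp
#include <cstdint>

// Stand-in for remarkable_color; a 16-bit RGB565 value is an assumption.
using color_t = std::uint16_t;
constexpr color_t kBlack = 0x0000;  // assumed encoding
constexpr color_t kWhite = 0xFFFF;  // assumed encoding

// Hypothetical conversion from RGB565 to a gray level in [0, 1].
inline float to_gray(color_t c) {
    float r = ((c >> 11) & 0x1F) / 31.0f;
    float g = ((c >> 5) & 0x3F) / 63.0f;
    float b = (c & 0x1F) / 31.0f;
    return 0.299f * r + 0.587f * g + 0.114f * b;
}

// Full path: convert to float, add the per-pixel offset, round to 2 colors.
inline color_t dither_full(int x, int y, color_t c) {
    static const float offsets[2][2] = {{-0.375f, 0.125f}, {0.375f, -0.125f}};
    float shifted = to_gray(c) + offsets[y & 1][x & 1];
    return shifted >= 0.5f ? kWhite : kBlack;
}

// Fast path: two integer comparisons let pure black and pure white
// bypass the float round trip; everything else takes the full path.
inline color_t dither_fast(int x, int y, color_t c) {
    if (c == kWhite || c == kBlack) return c;
    return dither_full(x, y, c);
}
```

Under these assumptions the fast path returns the same result as the full path for black and white at every pixel position, so the shortcut changes the cost but not the output.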
@raisjn
Member

raisjn commented Feb 1, 2021

this seems reasonable, thanks!

regarding animations and potential slow down, also see ddvk/remarkable2-framebuffer#42. if the rm2fb queue is filled, it will cause delay in painting until the queue settles (this is contrary to how rm1 worked, i believe).

if you turn on DEBUG in the rm2fb client/server code, you can see the update requests and draw timings - an example output can be seen here: ddvk/remarkable2-framebuffer#38 (comment). this will let you get a sense of how much repainting a particular region costs with a particular waveform.

raisjn merged commit 1ac0cfd into rmkit-dev:master on Feb 1, 2021
@raisjn
Member

raisjn commented Feb 1, 2021

i just played with the release build with these changes, the performance is good! i had been using the debug build previously, so was very happy to see release build perf - i think it feels pretty snappy for an e-reader

(i also think the rm2fb queue getting filled is not the issue after doing some small testing)

@mrichards42
Collaborator Author

Awesome! Yeah, it's pretty decent with a release build. Agreed that I don't think it's an issue w/ the rm2fb queue -- although that might have been part of the trouble I had w/ the dithering demo.

@raisjn
Member

raisjn commented Feb 3, 2021

i'm curious if #78 speeds up drawing at all for you

  • uses draw_rect_fast which doesn't call update_dirty (which does a bunch of extra work for each pixel)
  • uses likely/unlikely
  • split rectangle drawing into separate cases: one case for fill, one for non-fill. non-fill is now O(N+M) instead of O(N*M)
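
The fill vs. non-fill split and the likely/unlikely hint can be sketched roughly as follows. This is an illustration only, not the code from #78: `Canvas`, `fill_rect`, `outline_rect`, and the `LIKELY` macro are hypothetical names, and the `writes` counter exists only to make the difference in work visible.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Branch hint; __builtin_expect is a GCC/Clang builtin, with a no-op fallback.
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

// Hypothetical canvas that counts pixel writes.
struct Canvas {
    int width, height;
    std::vector<std::uint16_t> pixels;
    long writes = 0;

    Canvas(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * h, 0xFFFF) {}

    void set(int x, int y, std::uint16_t c) {
        // In-bounds writes are the common case, so hint the branch.
        if (LIKELY(x >= 0 && x < width && y >= 0 && y < height)) {
            pixels[static_cast<std::size_t>(y) * width + x] = c;
            writes++;
        }
    }
};

// Filled rectangle: touches every pixel, O(w*h).
inline void fill_rect(Canvas& c, int x, int y, int w, int h, std::uint16_t color) {
    for (int j = 0; j < h; j++)
        for (int i = 0; i < w; i++)
            c.set(x + i, y + j, color);
}

// Outline only: two rows plus two columns, O(w+h).
inline void outline_rect(Canvas& c, int x, int y, int w, int h, std::uint16_t color) {
    if (w <= 0 || h <= 0) return;
    for (int i = 0; i < w; i++) {
        c.set(x + i, y, color);                      // top edge
        if (h > 1) c.set(x + i, y + h - 1, color);   // bottom edge
    }
    for (int j = 1; j < h - 1; j++) {
        c.set(x, y + j, color);                      // left edge
        if (w > 1) c.set(x + w - 1, y + j, color);   // right edge
    }
}
```

For a 20x30 rectangle, the fill case writes 600 pixels while the outline case writes 96, which is the gap the O(N+M) vs. O(N*M) distinction describes.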

i am unsure of how you did your profiling cases above - is it just a single render? (or X renders?) of a game scene? or does it involve manual interaction with the game?
