SIMD-ifying the transform submodule - checklist #2476

Open
4 of 14 tasks
MyreMylar opened this issue Sep 30, 2023 · 1 comment
Labels
SIMD transform pygame.transform

Comments

@MyreMylar
Member

MyreMylar commented Sep 30, 2023

We have pretty much reached peak current-gen SIMD-ifying of all the blitting in pygame-ce now.

The next obvious place to take these efforts is the transform submodule, as it also tends to apply pixel-by-pixel changes to large blocks of pixels. SIMD generally speeds up this kind of work, and pretty much every platform we are on has SSE2-level SIMD capability (or the NEON equivalent on Arm). That often lets us do math operations on at least one entire pixel (4 channels) at a time rather than one channel at a time, and sometimes on multiple pixels at once.
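
To make the "multiple pixels at the same time" point concrete, here is a minimal sketch rather than pygame-ce code. The helper name `brighten_pixels` and the `HAVE_SSE2` macro are made up for illustration, and it assumes 32-bit pixels with four 8-bit channels. One 128-bit SSE2 register covers 4 pixels (16 channels), while the scalar tail touches one channel at a time:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: add `amount` to every 8-bit channel, clamping at 255.
 * `pixels` holds `count` 32-bit pixels (4 channels each). */
static void
brighten_pixels(uint32_t *pixels, size_t count, uint8_t amount)
{
    size_t i = 0;
#if HAVE_SSE2
    const __m128i add = _mm_set1_epi8((char)amount);
    /* One 128-bit register holds 4 pixels = 16 channels. */
    for (; i + 4 <= count; i += 4) {
        __m128i px = _mm_loadu_si128((const __m128i *)(pixels + i));
        px = _mm_adds_epu8(px, add); /* 16 saturating adds in one go */
        _mm_storeu_si128((__m128i *)(pixels + i), px);
    }
#endif
    /* Scalar tail (and the full fallback without SSE2): per channel. */
    for (; i < count; i++) {
        uint8_t ch[4];
        memcpy(ch, &pixels[i], 4);
        for (int c = 0; c < 4; c++) {
            unsigned sum = (unsigned)ch[c] + amount;
            ch[c] = (uint8_t)(sum > 255 ? 255 : sum);
        }
        memcpy(&pixels[i], ch, 4);
    }
}
```

An AVX2 version would follow the same shape with 256-bit registers, covering 8 pixels per iteration.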

There are two current-gen tiers of SIMD we are pursuing:

  • Base level - SSE2/NEON - Available almost everywhere pygame-ce is deployed (100% on the Steam hardware survey).
  • Enhanced - AVX2 - 91.24% overall availability on the Steam hardware survey (mostly Windows (92%) & Linux (93%), only 31% on Mac).

Basic support for adding SIMD versions of transform functions has already been added.
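
As a rough illustration of how the two tiers could be chosen at runtime, here is a sketch that is not necessarily how pygame-ce does it. `simd_tier`, `pick_simd_tier` and `detect_simd_tier` are hypothetical names, and the detection relies on the GCC/Clang `__builtin_cpu_supports` builtin on x86, with a compile-time NEON check elsewhere:

```c
/* Hypothetical tiers matching the list above, plus a scalar fallback. */
typedef enum { TIER_SCALAR, TIER_SSE2_OR_NEON, TIER_AVX2 } simd_tier;

/* Prefer the widest tier that the CPU reports as available. */
static simd_tier
pick_simd_tier(int has_sse2_or_neon, int has_avx2)
{
    if (has_avx2)
        return TIER_AVX2;
    if (has_sse2_or_neon)
        return TIER_SSE2_OR_NEON;
    return TIER_SCALAR;
}

/* Assumption: GCC or Clang on x86; other toolchains need their own check. */
static simd_tier
detect_simd_tier(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return pick_simd_tier(__builtin_cpu_supports("sse2"),
                          __builtin_cpu_supports("avx2"));
#elif defined(__ARM_NEON)
    return pick_simd_tier(1, 0);
#else
    return pick_simd_tier(0, 0);
#endif
}
```

A transform would then call `detect_simd_tier()` once and branch to its AVX2, SSE2/NEON or scalar implementation.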

These are the current possibly SIMD-able transforms available in the sub-module as a checklist:

  • pygame.transform.rotate | rotate an image
  • pygame.transform.rotozoom | filtered scale and rotation
  • pygame.transform.scale2x | specialized image doubler (Is this still relevant? - needs performance metrics, looks like vendored scalar code)
  • pygame.transform.smoothscale | scale a surface to an arbitrary size smoothly - SSE2: Add SSE2 intrinsics smoothscale backend #2473
  • pygame.transform.smoothscale_by | resize to new resolution, using scalar(s) - SSE2: Add SSE2 intrinsics smoothscale backend #2473
  • pygame.transform.chop | gets a copy of an image with an interior area removed
  • pygame.transform.laplacian | find edges in a surface
  • pygame.transform.box_blur | blur a surface using box blur
  • pygame.transform.gaussian_blur | blur a surface using gaussian blur
  • pygame.transform.average_surfaces | find the average surface from many surfaces.
  • pygame.transform.average_color | finds the average color of a surface
  • pygame.transform.invert | inverts the RGB elements of a surface - SSE2 & AVX2: Add SIMD versions of the invert transform #2534
  • pygame.transform.grayscale | grayscale a surface - SSE2 & AVX2: Add SIMD versions of the greyscale transform (attempt #2) #2432
  • pygame.transform.threshold | finds which, and how many pixels in a surface are within a threshold of a 'search_color' or a

These functions are probably not SIMD-able (please tell me if you think I'm wrong):

  • pygame.transform.flip | flip vertically and horizontally - no math, just mem copying.
  • pygame.transform.scale | resize to new resolution - directly uses a single SDL function - SDL_SoftStretch() to do the scale.
  • pygame.transform.scale_by | resize to new resolution, using scalar(s) - directly uses a single SDL function - SDL_SoftStretch() to do the scale.

I suspect invert would be the easiest one to SIMD first, as it is just an alpha mask plus a single subtract-from-255 operation across the RGB channels, which should scale up very smoothly to multiple pixels at a time.
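
A minimal SSE2 sketch of that idea follows. It is an illustration only, not the implementation from any PR. `invert_pixels` and `HAVE_SSE2` are hypothetical names, and it assumes 32-bit pixels whose channels are whole bytes, with `amask` marking the alpha byte:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: dst = inverted RGB of src, alpha copied unchanged.
 * Assumes 8-bit channels, so ~amask is 0xFF in each RGB byte. */
static void
invert_pixels(const uint32_t *src, uint32_t *dst, size_t count, uint32_t amask)
{
    const uint32_t rgbmask = ~amask;
    size_t i = 0;
#if HAVE_SSE2
    const __m128i mm_rgb = _mm_set1_epi32((int)rgbmask);
    const __m128i mm_alpha = _mm_set1_epi32((int)amask);
    for (; i + 4 <= count; i += 4) { /* 4 pixels per iteration */
        __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i alpha = _mm_and_si128(px, mm_alpha); /* keep alpha aside */
        /* 255 - channel for RGB bytes; alpha byte saturates to 0. */
        __m128i inv = _mm_subs_epu8(mm_rgb, px);
        _mm_storeu_si128((__m128i *)(dst + i), _mm_or_si128(inv, alpha));
    }
#endif
    /* Scalar tail: no borrow between bytes since each RGB byte is <= 0xFF. */
    for (; i < count; i++) {
        dst[i] = (rgbmask - (src[i] & rgbmask)) | (src[i] & amask);
    }
}
```

For example, with `amask = 0xFF000000` the pixel `0x80102030` becomes `0x80EFDFCF`.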

Good references for invert would be the new grayscale, whose general structure has been optimised for a classic surface transform: in most cases all the pixels in the surface being transformed are contiguous in memory, and the output goes to a new surface that is also contiguous in memory. That is unlike a blit, which most often changes discontinuous rows in the middle of a surface by blitting a contiguous chunk of pixels on top of them. Then see blit_blend_rgb_sub_avx2() for the alpha masking and the actual subtraction operation.
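
The contiguous-versus-rows distinction can be sketched roughly like this. `surface32`, `for_each_run` and `clear_run` are hypothetical stand-ins, not real pygame-ce or SDL types. The assumption is a 32-bit surface where `pitch` is the number of bytes per row including any padding:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of a 32-bit surface (not an SDL/pygame type). */
typedef struct {
    uint8_t *pixels;
    int width, height;
    int pitch; /* bytes per row, including padding */
} surface32;

typedef void (*run_fn)(uint32_t *run, size_t count);

/* Example per-run operation: zero every pixel in the run. */
static void
clear_run(uint32_t *run, size_t count)
{
    memset(run, 0, count * sizeof *run);
}

/* Calls fn on the longest contiguous runs and returns how many runs it
 * needed: 1 for a tightly packed surface, one per row otherwise. */
static size_t
for_each_run(surface32 *s, run_fn fn)
{
    if (s->pitch == s->width * 4) {
        /* No row padding: treat the whole surface as a single run. */
        fn((uint32_t *)s->pixels, (size_t)s->width * (size_t)s->height);
        return 1;
    }
    for (int y = 0; y < s->height; y++) {
        fn((uint32_t *)(s->pixels + (size_t)y * (size_t)s->pitch),
           (size_t)s->width);
    }
    return (size_t)s->height;
}
```

The single-run case is where a SIMD loop pays off most, since there is only one leftover tail to handle for the whole surface instead of one per row.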

@MyreMylar MyreMylar added SIMD transform pygame.transform labels Sep 30, 2023
@MyreMylar MyreMylar changed the title SIMD-ifying the transform submodule SIMD-ifying the transform submodule - checklist Sep 30, 2023
@MightyJosip
Contributor

These functions are probably not SIMD-able (please tell me if you think I'm wrong):
pygame.transform.flip | flip vertically and horizontally - no math, just mem copying.

This is kinda similar to what I did when I tried to SIMD the draw module. Replacing memcpy with _mm256_storeu_si256 gave me a solid performance boost. Now I am not sure if flip is the same thing, but it could have potential.
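
For what it's worth, a horizontal flip of one row could look something like the sketch below. This is not the draw-module change mentioned above and is untested for performance. `flip_row_horizontal` and `HAVE_SSE2` are hypothetical names, it uses SSE2 rather than AVX2 so it fits the base tier, and it assumes 32-bit pixels with `src` and `dst` not overlapping:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* Hypothetical helper: write the pixels of `src` into `dst` in reverse
 * order. `src` and `dst` must not overlap. */
static void
flip_row_horizontal(const uint32_t *src, uint32_t *dst, size_t width)
{
    size_t i = 0;
#if HAVE_SSE2
    for (; i + 4 <= width; i += 4) {
        __m128i px = _mm_loadu_si128((const __m128i *)(src + i));
        /* Reverse the order of the four 32-bit pixels in the register. */
        px = _mm_shuffle_epi32(px, _MM_SHUFFLE(0, 1, 2, 3));
        /* Store them mirrored at the far end of the destination row. */
        _mm_storeu_si128((__m128i *)(dst + width - i - 4), px);
    }
#endif
    /* Scalar tail (and the full fallback without SSE2). */
    for (; i < width; i++) {
        dst[width - 1 - i] = src[i];
    }
}
```

A vertical flip would not need the shuffle at all, only whole-row copies, which is closer to the memcpy replacement described above.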

@MyreMylar MyreMylar self-assigned this Oct 7, 2023
@MyreMylar MyreMylar removed their assignment Nov 12, 2023