Optimize pg_Two(Ints/Floats)FromObj #3214

Starbuck5 · 2024-11-04T06:02:21Z

This PR takes advantage of several avenues to increase the performance of pg_TwoIntsFromObj, pg_TwoFloatsFromObj, pg_IntFromObj, and pg_FloatFromObj in base.c

Approach

Taking advantage of sequence knowledge better with PySequence_ITEM:
On main, the code for pg_TwoIntsFromObj, pg_TwoFloatsFromObj doesn't take advantage of the fact that it knows obj is a sequence, because it passes obj into generic methods (like pg_IntFromObjIndex) that have to check again that obj is a sequence. When we know that obj is a sequence we can use a faster way of getting elements out of it, PySequence_ITEM.
Faster number parsing with PyFloat_AS_DOUBLE:
If we know something is a Python float (a double), we can get the underlying double easily and with no error checking needed this way. It can then be cast to in C to whatever we want. This is the fastest way of getting a number that I tested. This is a more performant strategy than using PyFloat_AsDouble for this.
Don't do a redundant length check:
On main, if a tuple is passed in the code will use PyTuple_Size to figure out if it needs to go down the recursive codepath, then PySequence_Length to see whether it belongs at all. Instead we can use PySequence_Size (same as Length, just seems to be the preferred alias so I changed it) once at the beginning to inform those 2 other checks.

Direct Testing

I saw an average 34% improvement for pg_TwoIntsFromObj, 40% improvement for pg_TwoIntsFromObj, 17% improvement for pg_IntFromObj, and 42% improvement for pg_FloatFromObj

To test this I turned surface.get_at() into a loop that would call whatever I wanted to test with the get_at position argument 10 million times. This gave me a reliable way to test the performance of these helper functions directly and consistently.

Test script

import time

import pygame
pygame.init()

s = pygame.Surface((20,20))

def run_test(test_name, test_obj):
    start = time.time()
    s.get_at(test_obj)
    print(test_name + ":\t", time.time() - start)

run_test("int tuple", (2,15))
run_test("int list", [2,15])
run_test("float tuple", (2.0,15.0))
run_test("float list", [2.0,15.0])
run_test("vector2", pygame.Vector2(2,15))
run_test("mixed tuple", (2.0,15))
run_test("mixed list", [2.0,15])
run_test("int tuple x1", ((2,15),))
run_test("int list x1", ([2,15],))
run_test("float tuple x1", ((2.0,15.0),))
run_test("float list x1", ([2.0,15.0],))
run_test("mixed tuple x1", ((2.0,15),))
run_test("mixed list x1", ([2.0,15],))

#For pg_IntFromObj and pg_FloatFromObj
#run_test("int", 33)
#run_test("float", 33.2)

Here is my data:
https://docs.google.com/spreadsheets/d/1WBCVvzkL9HAZJ7Yo1N86-tAFhAP0d2J72Wcp8mveCl4/edit?usp=sharing

Indirect Testing
This speedup is significant enough to be seen in actual API that uses these functions internally. Here are some random functions I tested.

import time
import pygame
pygame.init()
iterations = 10000000

def run_test(test_name, test_obj, test_arg):
    start = time.time()
    for _ in range(iterations):
        test_obj(test_arg)
    print(test_name + ":\t", time.time() - start)

r = pygame.Rect(1,2,3,4)
fr = pygame.FRect(1,2,3,4)
run_test("r_move_ip", r.move_ip, (2,3))
run_test("fr_move_ip", fr.move_ip, (2.0,3.2))
run_test("r_collidep", r.collidepoint, (2,3))

# Before seconds -> After seconds  (Percent improvement)
# r_move_ip 0.63 -> 0.51 (19%)
# fr_move_ip 0.45 -> 0.38 (16%)
# r_collidep 0.63 -> 0.52 (17%)

This is a micro optimization, but I think it is a worthwhile thing to really scrutinize, as these functions are used all over the place. (Channeling my inner @itzpr3d4t0r on this PR)

pg_IntFromObjIndex doesn't take advantage of the fact that pg_TwoIntsFromObj already knows that obj is a sequence. Since obj is known to be a sequence, and negative indices are not needed, PySequence_ITEM can be used instead of PySequence_GetItem for better performance.

itzpr3d4t0r · 2024-11-04T09:07:45Z

You didn't full channel though xD, ig this could've been expanded to be more in line with pg_TwoDoublesFromObj with the whole sequencefast treatment.

Starbuck5 · 2024-11-05T02:41:16Z

Is sequencefast actually worth it for a two element sequence?

itzpr3d4t0r · 2024-11-06T08:06:30Z

I've seen 5-10% improvements. Fastcall methods that don't do much benefit the most out of the bunch. This might even speed up fblits or stuff that takes lots of positions since one could also pass only tuples as positions.

Starbuck5 · 2024-11-08T08:38:26Z

Alright you've pushed me to be better, so I've turned this simple speedup PR into a better (but way more complicated) speedup PR. I got a lot of inspiration from pg_TwoDoublesFromObj, but I haven't copied that strategy exactly here for now, and I don't plan to in this PR.

I've updated the PR description with my findings.

Starbuck5 added the Performance Related to the speed or resource usage of the project label Nov 4, 2024

Starbuck5 requested a review from a team as a code owner November 4, 2024 06:02

Optimize twoints/twofloats: add comments

4de5466

Starbuck5 marked this pull request as draft November 6, 2024 08:50

Starbuck5 force-pushed the optimize-pg-two-things-from-obj branch from a4d97a0 to 5cd46ea Compare November 8, 2024 08:03

Another pg_Int/Float helper optimization approach.

480a472

Starbuck5 force-pushed the optimize-pg-two-things-from-obj branch from 2f5e2dd to 480a472 Compare November 8, 2024 08:03

Starbuck5 marked this pull request as ready for review November 8, 2024 08:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize pg_Two(Ints/Floats)FromObj #3214

Optimize pg_Two(Ints/Floats)FromObj #3214

Starbuck5 commented Nov 4, 2024 •

edited

Loading

itzpr3d4t0r commented Nov 4, 2024

Starbuck5 commented Nov 5, 2024

itzpr3d4t0r commented Nov 6, 2024

Starbuck5 commented Nov 8, 2024 •

edited

Loading

Optimize pg_Two(Ints/Floats)FromObj #3214

Are you sure you want to change the base?

Optimize pg_Two(Ints/Floats)FromObj #3214

Conversation

Starbuck5 commented Nov 4, 2024 • edited Loading

itzpr3d4t0r commented Nov 4, 2024

Starbuck5 commented Nov 5, 2024

itzpr3d4t0r commented Nov 6, 2024

Starbuck5 commented Nov 8, 2024 • edited Loading

Starbuck5 commented Nov 4, 2024 •

edited

Loading

Starbuck5 commented Nov 8, 2024 •

edited

Loading