Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds OpenMP to qsort, should also improve test speed a bit #179

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

sterrettm2
Copy link
Contributor

This adds OpenMP acceleration to quicksort, in addition to making some changes to the testing code. On my current testing system, this gives up to a 3x speedup, though I think more may be achieved on a stronger system.

Benchmarks:

10m
Comparing simdsort/random_10m (from /home/sterrettm/simd-sort/x86-simd-sort/.bench/main/builddir/benchexe) to simdsort/random_10m (from /home/sterrettm/simd-sort/x86-simd-sort/.bench/qsort-openmp/builddir/benchexe)
Benchmark                                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------------ 50815267
[simdsort/random_10m vs. simdsort/random_10m]/uint64_t_mean                  -0.6462         -0.6461     161374745      57099765     161274843      57081772
[simdsort/random_10m vs. simdsort/random_10m]/int64_t_mean                   -0.6722         -0.6720     161381007      52894903     161242920      52883775
[simdsort/random_10m vs. simdsort/random_10m]/uint32_t_mean                  -0.6683         -0.6684      76346854      25321800      76335248      25316127
[simdsort/random_10m vs. simdsort/random_10m]/int32_t_mean                   -0.6461         -0.6462      74568936      26387521      74555377      26374643
[simdsort/random_10m vs. simdsort/random_10m]/uint16_t_mean                  +0.0286         +0.0286     306422934     315196752     306379370     315156800
[simdsort/random_10m vs. simdsort/random_10m]/int16_t_mean                   +0.0159         +0.0159     308471111     313372843     308432095     313323090
[simdsort/random_10m vs. simdsort/random_10m]/float_mean                     -0.6593         -0.6593      75784101      25821957      75777886      25815941
[simdsort/random_10m vs. simdsort/random_10m]/double_mean                    -0.6613         -0.6613     140408876      47560519     140340074      47530458
OVERALL_GEOMEAN                                                              -0.5520         -0.5520             0             0             0             0
1m
Comparing simdsort/random_1m (from /home/sterrettm/simd-sort/x86-simd-sort/.bench/main/builddir/benchexe) to simdsort/random_1m (from /home/sterrettm/simd-sort/x86-simd-sort/.bench/qsort-openmp/builddir/benchexe)
Benchmark                                                                     Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------
[simdsort/random_1m vs. simdsort/random_1m]/uint64_t_mean                  -0.5563         -0.5564      13892838       6163743      13893768       6162738
[simdsort/random_1m vs. simdsort/random_1m]/int64_t_mean                   -0.5439         -0.5441      13943919       6359981      13944683       6358016
[simdsort/random_1m vs. simdsort/random_1m]/uint32_t_mean                  -0.4661         -0.4662       6244794       3334286       6244370       3333435
[simdsort/random_1m vs. simdsort/random_1m]/int32_t_mean                   -0.3862         -0.3861       6172364       3788625       6173740       3789904
[simdsort/random_1m vs. simdsort/random_1m]/uint16_t_mean                  -0.0138         -0.0138      31545637      31108834      31548000      31111137
[simdsort/random_1m vs. simdsort/random_1m]/int16_t_mean                   -0.0076         -0.0076      30964472      30728775      30967064      30730877467       3261958
[simdsort/random_1m vs. simdsort/random_1m]/float_mean                     -0.4568         -0.4566       6418410       3486598       6419908       3488443
[simdsort/random_1m vs. simdsort/random_1m]/double_mean                    -0.5486         -0.5488      11469483       5177166      11469799       5174866
OVERALL_GEOMEAN                                                            -0.4088         -0.4089             0             0             0             0

And to show that there does not seem to be a regression with small sizes due to the SIMD logic:

128
Benchmark                                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------------
[simdsort/random_128 vs. simdsort/random_128]/uint64_t_mean                  -0.0442         -0.0512           866           828           878           833
[simdsort/random_128 vs. simdsort/random_128]/int64_t_mean                   -0.0006         -0.0028           821           821           828           826
[simdsort/random_128 vs. simdsort/random_128]/uint32_t_mean                  +0.0152         +0.0150           583           591           587           596
[simdsort/random_128 vs. simdsort/random_128]/int32_t_mean                   +0.0210         +0.0150           582           594           588           597
[simdsort/random_128 vs. simdsort/random_128]/uint16_t_mean                  -0.0034         +0.0008          1933          1926          1939          1941
[simdsort/random_128 vs. simdsort/random_128]/int16_t_mean                   +0.0061         +0.0104          1889          1901          1896          1916
[simdsort/random_128 vs. simdsort/random_128]/float_mean                     +0.0012         +0.0022           599           600           605           606
[simdsort/random_128 vs. simdsort/random_128]/double_mean                    +0.0079         +0.0086           707           712           711           717
OVERALL_GEOMEAN                                                              +0.0003         -0.0003             0             0             0             0

In addition, this adds larger tests for quicksort to test the OpenMP logic. Out of concern for the runtime of the test suite, I modified both these new tests and the older kv tests to only use the larger sizes for the main sort, the only one that uses OpenMP and thus needs them. I believe this should result in a noticeable net reduction in the runtime of the test suite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant