Skip to content

Potential 1.5x performance improvement for Random.Shuffle() #82838

Closed as not planned
@wainwrightmark

Description

@wainwrightmark

Description

Hi. This is not a performance problem per se but a potential for quite a big improvement.

I noticed that a Shuffle() method has been added to the Random class.

I recently did some performance optimizations in the rust rand crate (library) which made it run about 1.5x times faster . I've had a little go and gotten similar results in C# (see benchmarks below).

How it works

Shuffling an array of length n essentially involves making n random swaps between elements.
The way I've made it faster is to, instead of generating a new 32 bit random number for each swap, group the swaps together into the largest groups possible and only generate one random number per group. (I'm simplifying - counting calls to Random.Next(n) here which themselves may call _prng.Sample() more than once)

For an array of length 12, instead of generating 12 different random numbers, you generate one random number in the range [0,479001600) and use div and mod methods to work out which swaps to do. For longer arrays you use more groups and they get a bit smaller but the end result is a lot fewer calls to the comparatively slow Next(n) function.

Data

These are my benchmark results, testing the existing "Old" implementation vs my proposed "New" implementation for int arrays of length 10, 100, and 1000

Method Mean Error StdDev Median
Shuffle_10_Old 101.64 ns 0.469 ns 0.439 ns 101.73 ns
Shuffle_10_New 57.38 ns 0.663 ns 0.620 ns 57.24 ns
Shuffle_100_Old 906.13 ns 9.934 ns 9.292 ns 910.72 ns
Shuffle_100_New 594.07 ns 11.824 ns 12.142 ns 587.11 ns
Shuffle_1000_Old 9,189.39 ns 84.756 ns 79.280 ns 9,196.53 ns
Shuffle_1000_New 6,733.79 ns 132.529 ns 172.325 ns 6,844.89 ns

The new versions are all about 1.5x as fast as their equivalents.

Configuration

dotnet 8.0.100-preview.1.23115.2
Windows 11 Pro 22000.1574
x64
12th Gen Intel(R) Core(TM) i7-12700H 2.30 GHz

Breaking Changes

This change would not alter the API in any way but would be a value breaking change - the results of shuffling with the same seed would be different (but still random!) and the prng would almost always be advanced fewer steps. For this reason, if this is considered worth doing, I suggest doing it sooner rather than later to avoid annoying people who are reliant on value stability.

Pull Request

I'm happy to tidy up my code and make a pull request for this if it's considered worthwhile.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions