Skip to content

Like RyuJIT does before running, the DivideSharp optimizes an integer division by "mostly constant" values.

License

Notifications You must be signed in to change notification settings

MineCake147E/DivideSharp

Repository files navigation

DivideSharp Logo

DivideSharp - .NET Standard Integer Division Library

Like RyuJIT does before running, the DivideSharp optimizes an integer division by "mostly constant" values.

Currently supported features

Principles of operation

DivideSharp first computes the magic parameters for division and then converts a single division instruction, such as idiv or div, into an equivalent code that has no such slow division instructions and whose entire division code is faster than a single division instruction.

The Mechanism for unsigned integer division

The divisor is a power of two

When the divisor is a power of two like 0b1000_0000u(=128u), the division can be simplified to a right-shift operation like:

static uint D128(uint value) => value >> 7;

The above code is then compiled as follows(using SharpLab):

D128(UInt32)
    L0000: mov eax, ecx //The first parameter `value` is in `ecx` because of `static`.
    L0002: shr eax, 7   //Shift the eax 7 bits right
    L0005: ret          //return eax;

The divisor is not a power of two

The division of unsigned integers can theoretically be transformed into a set of multiplications and shifts, as shown in the following equation:
\lfloor \frac{value}{divisor} \rfloor = \lfloor \frac{value \times magic}{2^n}\rfloor
The magic is selected by this algorithm.
Furthermore, since division by 2^n can be converted to a shift operation, this formula can be rewritten into the following C# code:

public static uint D239Custom(uint value)
{
    //In C#, the `mul` instruction used by real-world compilers such as RyuJIT and Clang cannot be specified directly, so the 64-bit unsigned multiplication `imul` is used instead.
    ulong v = value * 0x891ac73b;
    v >>= 39;
    return (uint)v;
}

The above code is then compiled as follows:

D239Custom(UInt32)
    L0000: imul eax, ecx, 0x891ac73b
    L0006: mov eax, eax
    L0008: shr rax, 0x27
    L000c: ret

The divisor is not a power-of-two number complex cases

However, codes like D239Custom may return inaccurate results depending on the denominator (e.g. 231).
RyuJIT and DivideSharp uses the following expression instead.
edx \leftarrow \lfloor \frac{value \times magic}{2^{32}}\rfloor (store the value of \lfloor \frac{value \times magic}{2^{32}}\rfloor into edx)
then calculate
\lfloor \frac{value}{divisor} \rfloor = \lfloor \frac{\lfloor edx + \lfloor\frac{value - edx}{2}\rfloor}{2^{n - 32}}\rfloor
(The variable "edx" is named after the AMD64 32-bit register edx.)
And the aforementioned equation can be organized as follows:
\begin{align*} \lfloor\frac{value - \lfloor \frac{value \times magic}{2^{32}}\rfloor}{2}\rfloor &= \lfloor\frac{value - value \times \frac{magic}{2^{32}}}{2}\rfloor \ &= \lfloor\frac{value \times \frac{2^{32}}{2^{32}} - value \times \frac{magic}{2^{32}}}{2}\rfloor \ &= \lfloor\frac{value \times (\frac{2^{32} - magic}{2^{32}})}{2}\rfloor \ &= \lfloor value \times \frac{(\frac{2^{32} - magic}{2^{32}})}{2}\rfloor \ &= \lfloor value \times (\frac{2^{32} - magic}{2^{33}})\rfloor \ \end{align*}
\begin{align*} \lfloor \frac{\lfloor \lfloor \frac{value \times magic}{2^{32}}\rfloor + \lfloor\frac{value - \lfloor \frac{value \times magic}{2^{32}}\rfloor}{2}\rfloor}{2^{n - 32}}\rfloor &= \lfloor \frac{ \frac{value \times magic}{2^{32}} + value \times \frac{2^{32} - magic}{2^{33}}}{2^{n - 32}}\rfloor \ &= \lfloor \frac{ value \times \frac{magic}{2^{32}} + value \times \frac{2^{32} - magic}{2^{33}}}{2^{n - 32}}\rfloor \ &= \lfloor \frac{value \times (\frac{2^{32} - magic + 2magic}{2^{33}})}{2^{n - 32}}\rfloor \ &= \lfloor \frac{value \times (\frac{2^{32} + magic}{2^{33}})}{2^{n - 32}}\rfloor \ &= \lfloor \frac{value \times (2^{32} + magic)}{2^{n + 1}}\rfloor \ \end{align*}
Since 2^{32} does not fit into the 32-bit unsigned integer variable magic, term 2^{32} + magic adds 2^{32} to the numerator stored in magic.

Now, when the divisor is 231, we can rewrite this expression in the following C# code:

public static uint D231Custom(uint value)
{
    ulong v = (ulong)value * 0x1bb4a405;
    v >>= 32;
    value -= (uint)v;
    value >>= 1;
    var q = value + (uint)v;
    q >>= 7;
    return q;
}

The above code is then compiled as follows:

D231Custom(UInt32)
    L0000: mov eax, ecx
    L0002: imul rax, 0x1bb4a405
    L0009: shr rax, 0x20
    L000d: sub ecx, eax
    L000f: shr ecx, 1
    L0011: add eax, ecx
    L0013: shr eax, 7
    L0016: ret

Corner Cases

If the denominator is greater than 2147483648(0x8000_0000u), as in 2147483649, the numerator cannot be more than twice its denominator.
For this reason, it is more efficient to use the if statement instead of dividing by the method described above. In C# it looks like the following:

//The Unsafe class belongs to System.Runtime.CompilerServices
public static uint D2147483649(uint value) => value >= 2147483649 ? 1u : 0u;
public static uint D2147483649Ex(uint value)    //Equivalent to value >= 2147483649 ? 1u : 0u
{
    bool f = value >= 2147483649;       //`cmp ecx, 0x80000001` and `setae al`
    return Unsafe.As<bool, byte>(ref f); //moxzx eax, al
}

The above code is then compiled as follows:

D2147483649(UInt32)     //Ternary operator
    L0000: cmp ecx, 0x80000001
    L0006: jae short L000b
    L0008: xor eax, eax
    L000a: ret
    L000b: mov eax, 1
    L0010: ret

D2147483649Ex(UInt32)   //Unsafe.As<bool, byte>
    L0000: cmp ecx, 0x80000001
    L0006: setae al
    L0009: movzx eax, al
    L000c: ret

Runtime Optimizations

The UInt32Divisor generalizes the four cases mentioned above with the following code:

public uint Divide(uint value)
{
    uint strategy = (uint)Strategy;
    uint divisor = Divisor;
    if (strategy == (uint)UInt32DivisorStrategy.Branch)
    {
        bool v = value >= divisor;
        return Unsafe.As<bool, byte>(ref v);
    }
    ulong rax = value;
    uint eax;
    ulong multiplier = Multiplier;
    int shift = Shift;
    if ((strategy & 0b10u) > 0)
    {
        rax *= multiplier;
        if ((strategy & 0b01u) > 0)
        {
            eax = (uint)(rax >> 32);
            value -= eax;
            value >>= 1;
            eax += value;
            rax = eax;
        }
    }
    eax = (uint)(rax >> shift);
    return eax;
}

Usage

UInt32Divisor

Details

Initialization

var uInt32Divisor = new UInt32Divisor(19);

Methods

Division
var quotient = uInt32Divisor.Divide(39); //2
Modulus
var remainder = uInt32Divisor.Modulo(39); //1
DivRem
  • Unlike Math.DivRem, the out parameter is quotient, not remainder.
var remainder = uInt32Divisor.DivRem(39, out var quotient);
//remainder: 1
//quotient: 2
Floor
  • Calculates the largest multiple of divisor less than or equal to the specified value.
var rounded = uInt32Divisor.Floor(39); //38
FloorRem
var remainder = uInt32Divisor.Floor(39, out var rounded);
//remainder: 1
//rounded: 38

Operator Overloads

Division
var quotient = 38 / uInt32Divisor;  //2
Modulus(Modulo)
var remainder = 39 % uInt32Divisor; //1

Int32Divisor

Details

Initialization

var dn19 = new Int32Divisor(-19);   //Divisor -19
var dp19 = new Int32Divisor(19);    //Divisor 19

Methods

Division
var quotient = dn19.Divide(39); //-2
var quotient2 = dn19.Divide(-39); //2
Modulus(Modulo)
var remainder = dn19.Modulo(39); //1
var remainder2 = dn19.Modulo(-39); //-1
DivRem
  • Unlike Math.DivRem, the out parameter is quotient, not remainder.
var remainder = dn19.DivRem(39, out var quotient);
//remainder: 1
//quotient: -2
AbsFloor
  • Divides the value with divisor, truncates the quotient towards zero, and multiplies the quotient with divisor.
    • Equivalent to (int)(value / divisor) _ divisor.
    • NOT Equivalent to (int)Math.Floor(value / (double)divisor) _ divisor which internally truncates the value towards negative infinity.
var rounded = dn19.Floor(39); //It returns 38 unlike the (int)Math.Floor(39 / -19.0) * -19 returns 57
var rounded2 = dn19.Floor(-39); //-38
var rounded3 = dp19.Floor(-39); //It returns -38 unlike the (int)Math.Floor(-39 / 19.0) * 19 returns -57
var rounded4 = dp19.Floor(39); //38
AbsFloorRem
  • Calculates rounded = value / divisor * divisor, remainder = value - rounded
var remainder = dn19.Floor(39, out var rounded);
//remainder: 1
//rounded: 38

Operator Overloads

Division
var quotient = 38 / dn19;  //-2
Modulus(Modulo)
var remainder = 39 % dn19; //1

About

Like RyuJIT does before running, the DivideSharp optimizes an integer division by "mostly constant" values.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages