Like RyuJIT does before running, the DivideSharp optimizes an integer division by "mostly constant" values.
DivideSharp first computes the magic parameters for division and then converts a single division instruction, such as idiv
or div
, into an equivalent code that has no such slow division instructions and whose entire division code is faster than a single division instruction.
When the divisor is a power of two like 0b1000_0000u(=128u), the division can be simplified to a right-shift operation like:
static uint D128(uint value) => value >> 7;
The above code is then compiled as follows(using SharpLab):
D128(UInt32)
L0000: mov eax, ecx //The first parameter `value` is in `ecx` because of `static`.
L0002: shr eax, 7 //Shift the eax 7 bits right
L0005: ret //return eax;
The division of unsigned integers can theoretically be transformed into a set of multiplications and shifts, as shown in the following equation:
The magic is selected by this algorithm.
Furthermore, since division by can be converted to a shift operation, this formula can be rewritten into the following C# code:
public static uint D239Custom(uint value)
{
//In C#, the `mul` instruction used by real-world compilers such as RyuJIT and Clang cannot be specified directly, so the 64-bit unsigned multiplication `imul` is used instead.
ulong v = value * 0x891ac73b;
v >>= 39;
return (uint)v;
}
The above code is then compiled as follows:
D239Custom(UInt32)
L0000: imul eax, ecx, 0x891ac73b
L0006: mov eax, eax
L0008: shr rax, 0x27
L000c: ret
However, codes like D239Custom may return inaccurate results depending on the denominator (e.g. 231).
RyuJIT and DivideSharp uses the following expression instead.
(store the value of into edx
)
then calculate
(The variable "edx" is named after the AMD64 32-bit register edx.)
And the aforementioned equation can be organized as follows:
Since does not fit into the 32-bit unsigned integer variable magic
, term adds to the numerator stored in magic
.
Now, when the divisor is 231, we can rewrite this expression in the following C# code:
public static uint D231Custom(uint value)
{
ulong v = (ulong)value * 0x1bb4a405;
v >>= 32;
value -= (uint)v;
value >>= 1;
var q = value + (uint)v;
q >>= 7;
return q;
}
The above code is then compiled as follows:
D231Custom(UInt32)
L0000: mov eax, ecx
L0002: imul rax, 0x1bb4a405
L0009: shr rax, 0x20
L000d: sub ecx, eax
L000f: shr ecx, 1
L0011: add eax, ecx
L0013: shr eax, 7
L0016: ret
If the denominator is greater than 2147483648(0x8000_0000u), as in 2147483649, the numerator cannot be more than twice its denominator.
For this reason, it is more efficient to use the if statement instead of dividing by the method described above.
In C# it looks like the following:
//The Unsafe class belongs to System.Runtime.CompilerServices
public static uint D2147483649(uint value) => value >= 2147483649 ? 1u : 0u;
public static uint D2147483649Ex(uint value) //Equivalent to value >= 2147483649 ? 1u : 0u
{
bool f = value >= 2147483649; //`cmp ecx, 0x80000001` and `setae al`
return Unsafe.As<bool, byte>(ref f); //moxzx eax, al
}
The above code is then compiled as follows:
D2147483649(UInt32) //Ternary operator
L0000: cmp ecx, 0x80000001
L0006: jae short L000b
L0008: xor eax, eax
L000a: ret
L000b: mov eax, 1
L0010: ret
D2147483649Ex(UInt32) //Unsafe.As<bool, byte>
L0000: cmp ecx, 0x80000001
L0006: setae al
L0009: movzx eax, al
L000c: ret
The UInt32Divisor generalizes the four cases mentioned above with the following code:
public uint Divide(uint value)
{
uint strategy = (uint)Strategy;
uint divisor = Divisor;
if (strategy == (uint)UInt32DivisorStrategy.Branch)
{
bool v = value >= divisor;
return Unsafe.As<bool, byte>(ref v);
}
ulong rax = value;
uint eax;
ulong multiplier = Multiplier;
int shift = Shift;
if ((strategy & 0b10u) > 0)
{
rax *= multiplier;
if ((strategy & 0b01u) > 0)
{
eax = (uint)(rax >> 32);
value -= eax;
value >>= 1;
eax += value;
rax = eax;
}
}
eax = (uint)(rax >> shift);
return eax;
}
Details
var uInt32Divisor = new UInt32Divisor(19);
var quotient = uInt32Divisor.Divide(39); //2
var remainder = uInt32Divisor.Modulo(39); //1
- Unlike
Math.DivRem
, theout
parameter isquotient
, notremainder
.
var remainder = uInt32Divisor.DivRem(39, out var quotient);
//remainder: 1
//quotient: 2
- Calculates the largest multiple of divisor less than or equal to the specified value.
var rounded = uInt32Divisor.Floor(39); //38
var remainder = uInt32Divisor.Floor(39, out var rounded);
//remainder: 1
//rounded: 38
var quotient = 38 / uInt32Divisor; //2
var remainder = 39 % uInt32Divisor; //1
Details
var dn19 = new Int32Divisor(-19); //Divisor -19
var dp19 = new Int32Divisor(19); //Divisor 19
var quotient = dn19.Divide(39); //-2
var quotient2 = dn19.Divide(-39); //2
var remainder = dn19.Modulo(39); //1
var remainder2 = dn19.Modulo(-39); //-1
- Unlike
Math.DivRem
, theout
parameter isquotient
, notremainder
.
var remainder = dn19.DivRem(39, out var quotient);
//remainder: 1
//quotient: -2
- Divides the value with divisor, truncates the quotient towards zero, and multiplies the quotient with divisor.
- Equivalent to (int)(value / divisor) _ divisor.
- NOT Equivalent to
(int)Math.Floor(value / (double)divisor) _ divisor
which internally truncates the value towards negative infinity.
var rounded = dn19.Floor(39); //It returns 38 unlike the (int)Math.Floor(39 / -19.0) * -19 returns 57
var rounded2 = dn19.Floor(-39); //-38
var rounded3 = dp19.Floor(-39); //It returns -38 unlike the (int)Math.Floor(-39 / 19.0) * 19 returns -57
var rounded4 = dp19.Floor(39); //38
- Calculates
rounded = value / divisor * divisor, remainder = value - rounded
var remainder = dn19.Floor(39, out var rounded);
//remainder: 1
//rounded: 38
var quotient = 38 / dn19; //-2
var remainder = 39 % dn19; //1