Optimise decimal casting for infallible conversions #7021
Which issue does this PR close?
Rationale for this change
As mentioned in the issue, there are certain conversions for which we don't need to check precision/overflow, so we can improve performance by using the infallible unary kernel.
Explanation of the optimisation:
Case 1, increasing scale:
Increasing the scale will result in multiplication by powers of 10, thus being equivalent to a "shift left" of the digits.
Every number will thus "gain" additional digits. The operation is safe when the output type gives enough precision (i.e. digits) to contain the original digits, as well as the digits gained.
This can be boiled down to the condition
input_prec + (output_scale - input_scale) <= output_prec
Example: consider a conversion from Decimal(5, 0) to Decimal(8, 3) with the number 12345 * 10^0 = 12345000 * 10^(-3).
If we start with any number xxxxx, then an increase of scale by 3 has the following effect on the representation: [xxxxx] -> [xxxxx000].
So for this to work, the output type needs to have at least 8 digits of precision.
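To make the increasing-scale condition concrete, here is a minimal, self-contained Rust sketch; the function name is_infallible_scale_increase is purely illustrative and is not the helper used in this PR:

```rust
/// Returns true when increasing the scale from (input_prec, input_scale) to
/// (output_prec, output_scale) can never overflow, i.e. the cast is infallible.
/// Precision and scale follow arrow-rs conventions (u8 precision, i8 scale).
fn is_infallible_scale_increase(
    input_prec: u8,
    input_scale: i8,
    output_prec: u8,
    output_scale: i8,
) -> bool {
    debug_assert!(output_scale >= input_scale);
    // Every value gains (output_scale - input_scale) digits, so the output precision
    // must cover the original digits plus the gained ones.
    input_prec as i16 + (output_scale as i16 - input_scale as i16) <= output_prec as i16
}

fn main() {
    // Decimal(5, 0) -> Decimal(8, 3): 5 + 3 <= 8, the cast can never overflow.
    assert!(is_infallible_scale_increase(5, 0, 8, 3));
    // Decimal(5, 0) -> Decimal(7, 3): 5 + 3 > 7, the fallible path is still required.
    assert!(!is_infallible_scale_increase(5, 0, 7, 3));
}
```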
Case 2, decreasing scale:
Decreasing the scale will result in division by powers of 10, thus "shifting right" the digits and adding a rounding term. We usually need less precision to represent the result. By shifting right we lose
input_scale - output_scale
digits, but the rounding term can require one additional digit, so we can boil this down to the condition
input_prec - (input_scale - output_scale) < output_prec
Example: consider a conversion from Decimal(5, 0) to Decimal(3, -3) with the number 99900 * 10^0 = 99.9 * 10^(3), which is then rounded to 100 * 10^3 = 100000.
If we start with any number xxxxx, then a decrease of scale by 3 has the following effect on the representation: [xxxxx] -> [xx] (+ 1 possibly).
In the example the "+ 1" adds an additional digit, so for the conversion to work, the output type needs to have at least 3 digits of precision. A conversion to Decimal(2, -3) would not be possible.
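The decreasing-scale condition can be sketched the same way; again, is_infallible_scale_decrease is an illustrative name rather than the PR's actual code:

```rust
/// Returns true when decreasing the scale is infallible: the value loses
/// (input_scale - output_scale) digits and rounding can add back at most one,
/// so the strict inequality guarantees the result always fits.
fn is_infallible_scale_decrease(
    input_prec: u8,
    input_scale: i8,
    output_prec: u8,
    output_scale: i8,
) -> bool {
    debug_assert!(output_scale <= input_scale);
    (input_prec as i16 - (input_scale as i16 - output_scale as i16)) < output_prec as i16
}

fn main() {
    // Decimal(5, 0) -> Decimal(3, -3): 5 - 3 = 2 < 3, so even rounding up
    // (e.g. 99900 -> 100 * 10^3) still fits and the cast cannot overflow.
    assert!(is_infallible_scale_decrease(5, 0, 3, -3));
    // Decimal(5, 0) -> Decimal(2, -3): 2 is not < 2, so overflow checks are still required.
    assert!(!is_infallible_scale_decrease(5, 0, 2, -3));
}
```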
Performance impact
The only cases affected are casts between Decimal128 types. For increasing the scale I measured a considerable improvement of around 80%; for decreasing the scale the improvement was around 25%.
What changes are included in this PR?
I've added a new specialization for handling "safe" casts between the same decimal type, both when increasing and when reducing the scale.
A new benchmark for reducing the scale.
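For reference, a hedged usage sketch against the arrow crate's public cast kernel (exact imports and constructors may vary between arrow versions); the observable result of the cast is the same before and after this PR, only the internal fast path changes:

```rust
use arrow::array::{Array, Decimal128Array};
use arrow::compute::cast;
use arrow::datatypes::DataType;

fn main() -> Result<(), arrow::error::ArrowError> {
    // 12345 stored as Decimal128(5, 0)
    let input = Decimal128Array::from(vec![12345_i128]).with_precision_and_scale(5, 0)?;

    // Casting to Decimal128(8, 3) only rescales; with this PR the precision/overflow
    // check can be skipped because 5 + (3 - 0) <= 8 makes the conversion infallible.
    let output = cast(&input, &DataType::Decimal128(8, 3))?;
    let output = output
        .as_any()
        .downcast_ref::<Decimal128Array>()
        .expect("decimal result");

    assert_eq!(output.value(0), 12_345_000); // 12345 * 10^3
    Ok(())
}
```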
Are there any user-facing changes?
No, the behavior of the casts should stay the same.