Ryu-based to_string function #627

Open
St-Maxwell opened this issue Feb 8, 2022 · 2 comments
Labels
topic: IO (Common input/output related features), topic: strings (String processing)

Comments


St-Maxwell commented Feb 8, 2022

While discussing the disp function, I came up with the idea of implementing an efficient function that converts floating-point numbers to decimal strings without using internal I/O. Recently I have tried to work it out.

The original Ryu generates the shortest precision-preserving string for a floating-point number, and Ryu printf provides formatting of floating-point numbers.

Based on the C and Scala implementations of Ryu, I have written a Fortran version.

I think we can replace the current implementation in stdlib with Ryu-based code, so I would like to briefly describe the API of ryu_fortran.

Currently, ryu_fortran provides four routines: f2shortest, d2shortest, d2fixed and d2exp.

use ryu, only: f2shortest, d2shortest, d2fixed, d2exp
use iso_fortran_env, only: real32, real64

write (*, "(A)") f2shortest(3.14159_real32)
write (*, "(A)") d2shortest(2.718281828_real64)
write (*, "(A)") d2fixed(1.2345678987654321_real64, 10)
write (*, "(A)") d2exp(299792458._real64, 5)

! 3.14159
! 2.718281828
! 1.2345678988
! 2.99792E+08

Interface

! Interface bodies do not inherit names from the host scope, so each one
! imports the kinds it needs (real32, real64, int32 from iso_fortran_env).
interface
    function f2shortest(f) result(str)
        import :: real32
        real(kind=real32), intent(in) :: f
        character(len=:), allocatable :: str
    end function
    function d2shortest(d) result(str)
        import :: real64
        real(kind=real64), intent(in) :: d
        character(len=:), allocatable :: str
    end function
    function d2fixed(d, precision_) result(str)
        import :: real64, int32
        real(kind=real64), intent(in) :: d
        integer(kind=int32), intent(in) :: precision_
        character(len=:), allocatable :: str
    end function
    function d2exp(d, precision_) result(str)
        import :: real64, int32
        real(kind=real64), intent(in) :: d
        integer(kind=int32), intent(in) :: precision_
        character(len=:), allocatable :: str
    end function
end interface

f2shortest and d2shortest produce the shortest precision-preserving decimal strings for floating-point numbers. That is, converting the string back to a floating-point value should yield the same binary representation as the original number. These two routines are suitable when no format is specified. Note: f2shortest and d2shortest always print at least two digits. For example, the C version of Ryu produces "1" for 1._real32, while f2shortest produces "1.0".
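The round-trip property can be checked independently of Ryu. The following is only a minimal sketch: round_trips is a hypothetical helper (not part of ryu_fortran or stdlib), and it uses an internal read purely for verification. It parses a candidate string and compares bit patterns with the original value.

```fortran
program round_trip_demo
    use iso_fortran_env, only: real64, int64
    implicit none
    real(real64) :: x

    x = 0.1_real64 + 0.2_real64

    ! 17 significant digits are always enough to identify a real64 value.
    print "(A, L2)", "0.30000000000000004 round-trips: ", &
        round_trips("0.30000000000000004", x)
    ! A shorter string may denote a neighbouring binary value instead.
    print "(A, L2)", "0.3 round-trips: ", round_trips("0.3", x)

contains

    !> Hypothetical helper: .true. if parsing str reproduces the bits of x.
    logical function round_trips(str, x)
        character(len=*), intent(in) :: str
        real(real64), intent(in) :: x
        real(real64) :: y
        integer :: ios

        read (str, *, iostat=ios) y
        round_trips = .false.
        ! Compare bit patterns so that -0.0 and NaN payloads are distinguished.
        if (ios == 0) round_trips = transfer(y, 1_int64) == transfer(x, 1_int64)
    end function round_trips

end program round_trip_demo
```

A shortest-output routine such as d2shortest is expected to return the shortest string for which this check still holds.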

d2fixed and d2exp format real64 floating-point numbers. The main difference between them and Fortran edit descriptors is that they never produce "*****".

With these routines, I wrote a simple prototype of to_string for floating-point numbers in app/main.f90. For stdlib, I think f2shortest and d2shortest can be called when the format argument is not present. But when a format is specified, there might be disagreement over whether we follow the Fortran convention or not. The advantages of Ryu formatting are that it is fast and that it never produces "*****". But it can not control the width of formatted values, which is sometimes required.
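For readers unfamiliar with the "*****" behavior, here is a small self-contained illustration using only standard internal I/O (no Ryu involved). When a value does not fit the field width of an edit descriptor, the whole field is filled with asterisks.

```fortran
program overflow_demo
    use iso_fortran_env, only: real64
    implicit none
    character(len=8) :: buf

    ! Fits: 4 integer digits + "." + 2 decimals = 7 characters, width is 8.
    write (buf, "(F8.2)") 1234.5_real64
    print "(3A)", "[", buf, "]"

    ! Does not fit: 7 integer digits + "." + 2 decimals = 10 characters,
    ! so the field is filled with asterisks.
    write (buf, "(F8.2)") 1234567.891_real64
    print "(3A)", "[", buf, "]"
end program overflow_demo
```

A routine that returns an allocatable string sized to the result, as d2fixed does, has no fixed field to overflow.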

I hope in this issue we can discuss the above points and reach certain agreements.

P.S. Benchmark results (edited on 2022.2.9)

Benchmark for f2shortest
f2shortest Time (us): 0.2019531   Std Dev:  0.3438
internal IO Time (us): 1.6445312   Std Dev:  0.3450

Benchmark for d2shortest
d2shortest Time (us): 0.2128906   Std Dev:  0.3496
internal IO Time (us): 2.1968750   Std Dev:  0.4361

Benchmark for d2exp
d2exp Time (us): 0.2976563   Std Dev:  0.3794
internal IO Time (us): 2.0078125   Std Dev:  0.4105

Benchmark for d2fixed
d2fixed Time (us): 0.8589844   Std Dev:  0.9782
internal IO Time (us): 4.4464844   Std Dev:  4.2765

awvwgk added the "topic: IO" and "topic: strings" labels on Feb 8, 2022


ivan-pi commented Feb 8, 2022

That's impressive! It makes me wonder what the algorithmic bottleneck is in Fortran compiler implementations. Does the benchmark print a single value or an entire array of values?

Is there an explanation for the relatively "large" standard deviation of the measurements?

When I think of to_string, I don't see a reason to match the Fortran behavior of printing ****. If anything, a string buffer should be truncated at the end.

But it can not control the width of formatted values, which is sometimes required.

Can you elaborate more on which types of format specifiers don't work with Ryu?


St-Maxwell commented Feb 9, 2022

@ivan-pi Thanks for your questions.

d2fixed and d2exp in fact correspond to the f and es edit descriptors in Fortran. d2exp generates a significand whose absolute value is greater than or equal to 1 and less than 10. For e and en, the number of digits in the significand may differ from es even when the same fractional length is specified. For example, the value 1.234e23 formatted with e15.4, en15.4 and es15.4 gives 0.1234E+24, 123.4000E+21 and 1.2340E+23, respectively. Therefore, e and en would have to be implemented separately if required, since the number of digits and the rounding differ.

The benchmark looks like this:
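The three descriptors can be compared directly with standard internal I/O. This short program is just an illustration of the example above and does not use ryu_fortran.

```fortran
program edit_descriptor_demo
    use iso_fortran_env, only: real64
    implicit none
    real(real64), parameter :: x = 1.234e23_real64
    character(len=15) :: buf

    ! e: significand in [0.1, 1)
    write (buf, "(E15.4)") x
    print "(2A)", "e15.4:  ", trim(adjustl(buf))

    ! en: exponent is a multiple of 3, significand in [1, 1000)
    write (buf, "(EN15.4)") x
    print "(2A)", "en15.4: ", trim(adjustl(buf))

    ! es: significand in [1, 10), the form d2exp corresponds to
    write (buf, "(ES15.4)") x
    print "(2A)", "es15.4: ", trim(adjustl(buf))
end program edit_descriptor_demo
```

With the same ".4" fractional length, the three forms carry 4, 7 and 5 significant digits for this value, which is why each needs its own rounding logic.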

do i = 1, num_samples
    int64_num = random_int64()
    d = transfer(int64_num, d) ! reinterpret the random bits as a real64 (transfer needs a mold)
    call cpu_time(t1)
    do j = 1, num_iter
        buffer = d2shortest(d)
    end do
    call cpu_time(t2)
    delta1(i) = (t2 - t1) * 1000000 / num_iter ! convert to us
end do

It should be noted that the earlier benchmark results were not correct. I have updated the results posted above. In addition, I have increased num_iter from 5000 to 20000, and the standard deviation is now smaller. The standard deviation in the d2fixed case is large because the lengths of the resulting strings vary much more.

I also noticed that if we comment out the re-allocation str = str(1:index) before returning, the execution time of our routines drops by about 0.1 µs. This indicates a potential point for optimization.
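For anyone who wants to reproduce the internal I/O baseline without ryu_fortran, a self-contained variant of the loop might look like the sketch below. It is an assumption-laden stand-in, not the benchmark used above: it draws values with random_number (uniform in [0, 1)) instead of random bit patterns, picks num_samples arbitrarily, and uses a fixed ES format for the internal write.

```fortran
program bench_internal_io
    use iso_fortran_env, only: real64
    implicit none
    integer, parameter :: num_samples = 100   ! assumed value, not from the issue
    integer, parameter :: num_iter = 20000
    real(real64) :: d, t1, t2, mean, std
    real(real64) :: delta(num_samples)
    character(len=32) :: buffer
    integer :: i, j

    do i = 1, num_samples
        call random_number(d)
        call cpu_time(t1)
        do j = 1, num_iter
            write (buffer, "(ES24.16E3)") d   ! internal I/O baseline
        end do
        call cpu_time(t2)
        delta(i) = (t2 - t1) * 1.0e6_real64 / num_iter   ! microseconds per call
    end do

    ! Sample mean and sample standard deviation over all samples.
    mean = sum(delta) / num_samples
    std = sqrt(sum((delta - mean)**2) / (num_samples - 1))
    print "(A, F12.7, A, F10.4)", "internal IO Time (us): ", mean, "   Std Dev: ", std
end program bench_internal_io
```

Because cpu_time has limited resolution, a larger num_iter averages over more calls per sample, which is consistent with the smaller standard deviation reported after raising it.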

! str = str(1:index) is removed
Benchmark for f2shortest
f2shortest Time (us): 0.1019531   Std Dev:  0.2655
internal IO Time (us): 1.6605469   Std Dev:  0.3412

Benchmark for d2shortest
d2shortest Time (us): 0.1289062   Std Dev:  0.2900
internal IO Time (us): 2.1851562   Std Dev:  0.4480

Benchmark for d2exp
d2exp Time (us): 0.2476562   Std Dev:  0.3644
internal IO Time (us): 2.0011719   Std Dev:  0.4128

Benchmark for d2fixed
d2fixed Time (us): 0.8046875   Std Dev:  0.9733
internal IO Time (us): 4.4523437   Std Dev:  4.2824
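
One way the final re-allocation might be avoided is to fill a fixed-size local buffer and allocate the result only once, at its final length. The sketch below contrasts the two patterns with hypothetical function names and a placeholder payload. It is not taken from ryu_fortran, and whether it is actually faster would need to be measured.

```fortran
module trim_sketch
    implicit none
    integer, parameter :: max_len = 32   ! assumed upper bound on the output length
contains

    !> Pattern described above: allocate the maximum length, fill, then shrink.
    function fill_then_shrink(n) result(str)
        integer, intent(in) :: n          ! number of characters produced, n <= max_len
        character(len=:), allocatable :: str

        allocate (character(len=max_len) :: str)
        str(1:n) = repeat("9", n)         ! placeholder for the generated digits
        str = str(1:n)                    ! re-allocation on assignment
    end function fill_then_shrink

    !> Alternative: fill a local buffer, allocate the result once.
    function buffer_then_copy(n) result(str)
        integer, intent(in) :: n          ! number of characters produced, n <= max_len
        character(len=:), allocatable :: str
        character(len=max_len) :: buf

        buf(1:n) = repeat("9", n)         ! placeholder for the generated digits
        str = buf(1:n)                    ! single allocation at the final length
    end function buffer_then_copy

end module trim_sketch

program trim_demo
    use trim_sketch, only: fill_then_shrink, buffer_then_copy
    implicit none

    print "(A)", fill_then_shrink(5)
    print "(A)", buffer_then_copy(5)
end program trim_demo
```

Both functions return the same string; only the number of allocations of the result differs.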
