Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib_math] add is_close routines. #488

Merged
merged 18 commits into from
Dec 18, 2021
Merged

[stdlib_math] add is_close routines. #488

merged 18 commits into from
Dec 18, 2021

Conversation

zoziha
Copy link
Contributor

@zoziha zoziha commented Aug 19, 2021

  • Add is_close for real/complex value tests and checks
  • The default value of rel_tol/abs_tol
  • add all_close. (gcc 9.0 not support select rank)(use stdlib_stats solution)
  • equal_nan in is_close/all_close(need to discuss) [stdlib_math] add is_close routines. #488 (comment)
  • fypp style tests.
  • sqrt(epsilon(..))

Decsription

Determines whether the values of a and b are close.

When we are testing, is_close and all_close are very useful.

APIs are:

!! abs(a - b) <= max(rtol*max(abs(a), abs(b)), atol)
bool = [[stdlib_math(module): is_close(interface)]] (a, b [, rel_tol=sqrt(epsilon(..)), abs_tol=0.0, equal_nan=.false.]) 
bool = [[stdlib_math(module):all_close(interface)]] (a, b [, rel_tol=sqrt(epsilon(..)), abs_tol=0.0, equal_nan=.false.]) 

Prior Arts

doc/specs/stdlib_logic.md Outdated Show resolved Hide resolved
@milancurcic
Copy link
Member

Thanks, @zoziha. There's already PR #353 by @arjenmarkus which we reviewed a little bit, and discussed on a monthly call, but there didn't seem to be much enthusiasm from others. Perhaps we can revisit this now and get more feedback to see if people would like to see something like this in stdlib.

Overall, I think it's in scope and useful. I like having a function like is_close(), and I like tolerant operators in #353.

It'd be great if we can try and follow the Workflow, I think we'd more easily narrow down to a) whether a function/module is in scope; b) whether there is interest; c) what kind of API would people like to see.

@arjenmarkus
Copy link
Member

arjenmarkus commented Aug 19, 2021 via email

@ivan-pi
Copy link
Member

ivan-pi commented Aug 19, 2021

There are two more related issues:

We should consider some of the other name suggestions too. Personally I am not convinced this needs a new module. Could the stdlib_math module be home for this?

@milancurcic
Copy link
Member

I also like the math module better for this.

@zoziha zoziha changed the title Add stdlib_logic module and is_close routines. Add stdlib_math module and is_close routines. Aug 20, 2021
@zoziha zoziha changed the title Add stdlib_math module and is_close routines. [stdlib_math] add is_close routines. Aug 20, 2021
@awvwgk awvwgk added the reviewers needed This patch requires extra eyes label Aug 21, 2021
@zoziha
Copy link
Contributor Author

zoziha commented Aug 26, 2021

I re-improved is_close to support real/complex types, and put is_close in stdlib_math.
I found that I have no suitable way to implement the numpy.allclose function, unless we have the new feature of Fortran202X: dimension(..). But we can achieve the same effect as all_close like this:

bool = all(is_close(a, b [, rtol, atol]))

In addition, when I was compiling the is_close branch, I found an avoidable type conversion warning in arange, and I corrected it.

@bilderbuchi
Copy link

bilderbuchi commented Sep 2, 2021

While you are following numpy.isclose, I will caution the following notes in the docs, which was not included above:

Warning The default atol is not appropriate for comparing numbers that are much smaller than one (see Notes).

For finite values, isclose uses the following equation to test whether two floating point values are equivalent.
absolute(a - b) <= (atol + rtol * absolute(b))
Unlike the built-in math.isclose, the above equation is not symmetric in a and b – it assumes b is the reference value – so that isclose(a, b) might be different from isclose(b, a). Furthermore, the default value of atol is not zero, and is used to determine what small values should be considered close to zero. The default value is appropriate for expected values of order unity: if the expected values are significantly smaller than one, it can result in false positives. atol should be carefully selected for the use case at hand. A zero value for atol will result in False if either a or b is zero.

Mainly for the reasons emphasized above, I personally tend to prefer the standard lib math.isclose() implementation:

math.isclose(a, b, *, rel_tol=1e-09, abs_tol=0.0)

Return True if the values a and b are close to each other and False otherwise.

Whether or not two values are considered close is determined according to given absolute and relative tolerances.

rel_tol is the relative tolerance – it is the maximum allowed difference between a and b, relative to the larger absolute value of a or b. For example, to set a tolerance of 5%, pass rel_tol=0.05. The default tolerance is 1e-09, which assures that the two values are the same within about 9 decimal digits. rel_tol must be greater than zero.

abs_tol is the minimum absolute tolerance – useful for comparisons near zero. abs_tol must be at least zero.

If no errors occur, the result will be: abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol).

The IEEE 754 special values of NaN, inf, and -inf will be handled according to IEEE rules. Specifically, NaN is not considered close to any other value, including NaN. inf and -inf are only considered close to themselves.

For completeness' sake, I say this mainly as a "simple" user of isclose (e.g. when writing tests), not as a numerical algorithms practitioner deeply versed in the intricacies of floating point math. 😸

@awvwgk awvwgk added the topic: mathematics linear algebra, sparse matrices, special functions, FFT, random numbers, statistics, ... label Sep 18, 2021
@zoziha
Copy link
Contributor Author

zoziha commented Sep 27, 2021

Thank you for your suggestion @bilderbuchi ! python.math.isclose gave us a lot of inspiration, maybe without rigorous thinking, is the following implementation feasible?:

!! abs(a - b) <= rtol*max(abs(a), abs(b)) + atol
bool = [[stdlib_math(module):is_close(interface)]] (a, b [, rtol=1.0e-5, atol=0.0]) 

The value of rtol can be negotiated, and the value of atol, we default to 0.0?
My perceptual understanding is that absolute error and relative error should be added together.

@bilderbuchi
Copy link

bilderbuchi commented Sep 28, 2021

I would not have expected the errors to be added together, but as mentioned I'm not a floating-point specialist.

If there are no good reasons to do otherwise, I think there is value in following already established popular approaches, if only to make it easier for developers to be familiar with a function's behaviour (i.e. you can say in the docs "behaves like Julia and Python stdlib").

In this case, Julia's Base.isapprox linked to in #145 and Python's math.isclose do the same thing, i.e.

norm(x-y) <= max(atol, rtol*max(norm(x), norm(y)))

It probably makes sense to just follow that, if you agree with my assesssment of numpy.isclose's drawbacks (see also the PEP485 rationale on this).

Indeed, atol default should be 0.0, and rtol is probably a matter of taste.

@zoziha
Copy link
Contributor Author

zoziha commented Sep 29, 2021

Maybe the API still needs discussion, but I just tried to update stdlib_math : is_close according to @bilderbuchi 's suggestion: now the is_close function is consistent with the python stdlib math.isclose.

@zoziha zoziha marked this pull request as draft September 29, 2021 15:41
@bilderbuchi
Copy link

bilderbuchi commented Oct 1, 2021

@zoziha do you think it in-scope/a reasonable effort to also add an array form of is_close to elementwise-compare two arrays and return a boolean array, leveraging the new is_close? In the same vein, all_close could then easily be built on top of that to see if all elements are close.
This can be quite useful when testing the results of numerical routines against a reference. I noticed that that is something that @awvwgk was interested in (#494 (comment)).

@bilderbuchi
Copy link

Also, I don't know where the stdlib lies on the spectrum between "complete features, seldomly changed" and "frequent iteration", but considering the API of is_close, it might be useful to add the following switch (from numpy.isclose):

equal_nan (bool, default false): Whether to compare NaN’s as equal. If True, NaN’s in a will be considered equal to NaN’s in b in the output array.

If that is not available, checking two arrays with missing values (nan) for closeness will never succeeed because two nans are never considered close.

@zoziha
Copy link
Contributor Author

zoziha commented Oct 1, 2021

@bilderbuchi In fact, the Fortran language has limited support for arrays of uncertain rank.
I think Fortran should enhance this kind of grammar, and the relevant proposal is here: j3-fortran/fortran_proposals#157
At present, we can implement all_close in the following three ways.

  1. use all(is_close(a, b)).
  2. Similar to stdlib_stats, this will cause the compiled binary file to be a bit large, see [STDLIB_STATS] need to upgrade stdlib_stats codes about compilation efficiency #438 .
interface all_close
    logical function all_close_1(a, b, rel_tol, abs_tol)
        real(rk), intent(in) :: a(:), b(:), rel_tol, abs_tol
    end function all_close_1
    logical function all_close_2(a, b, rel_tol, abs_tol)
        real(rk), intent(in) :: a(:, :), b(:, :), rel_tol, abs_tol
    end function all_close_2
    ...
end interface all_close
  1. Implement the select rank structure with limited support:
logical function all_close(a, b, rel_tol, abs_tol) result(close)

real(rk), intent(in) :: a(..), b(..), rel_tol, abs_tol

select rank(a)
rank(1)
    select rank(b)
    rank(1)
        close = all(is_close(a, b, rel_tol, abs_tol))

    rank default
        error stop "The ranks of a and b are not equal"

    end select

rank(2)
    select rank(b)
    rank(2)
        close = all(is_close(a, b, rel_tol, abs_tol))

    rank default
        error stop "The ranks of `a` and `b` are not equal"

    end select

...

rank default
    error stop "The rank of `a` is too large to be supported"
end select

end function all_close

@zoziha
Copy link
Contributor Author

zoziha commented Oct 2, 2021

Because all_close is based on is_close, it is actually equivalent to all(is_close()), so I implemented all_close using the select rank structure.
As I mentioned in #438 , because Fortran's syntax is not flexible enough for arrays of uncertain dimensions, the stdlib_stats solution will cause the binary files compiled by the stdlib_stats package to be larger.

But it seems that gcc 9.0 does not support the select rank structure!

   14 |         select rank(a)
      |        1
Error: Unclassifiable statement at (1)
/home/runner/work/stdlib/stdlib/build/src/stdlib_math_all_close.f90:22:15:

@zoziha zoziha marked this pull request as ready for review October 2, 2021 12:51
@Romendakil
Copy link

Because all_close is based on is_close, it is actually equivalent to all(is_close()), so I implemented all_close using the select rank structure. As I mentioned in #438 , because Fortran's syntax is not flexible enough for arrays of uncertain dimensions, the stdlib_stats solution will cause the binary files compiled by the stdlib_stats package to be larger.

But it seems that gcc 9.0 does not support the select rank structure!

   14 |         select rank(a)
      |        1
Error: Unclassifiable statement at (1)
/home/runner/work/stdlib/stdlib/build/src/stdlib_math_all_close.f90:22:15:

Yes, according to their feature page, https://gcc.gnu.org/wiki/Fortran2018Status, select rank has been implemented for v10.0 in October 2019.

@zoziha
Copy link
Contributor Author

zoziha commented Oct 3, 2021

Because gcc9.0 used by stdlib does not support select rank, I can only use the stdlib_stats solution.
In the future, if stdlib is no longer compatible with gcc9.0, we can use the select rank solution again, I believe its experience is better.

@awvwgk
Copy link
Member

awvwgk commented Oct 3, 2021

Because gcc9.0 used by stdlib does not support select rank, I can only use the stdlib_stats solution.
In the future, if stdlib is no longer compatible with gcc9.0, we can use the select rank solution again, I believe its experience is better.

So far we follow GCC's recommendation of supported releases. Once GCC 12 is available, GCC 9 will most likely get dropped and we can upgrade our requirements for stdlib.

@zoziha zoziha marked this pull request as draft November 9, 2021 14:41
@zoziha zoziha marked this pull request as ready for review November 10, 2021 03:57
@zoziha
Copy link
Contributor Author

zoziha commented Nov 10, 2021

I added the equal_nan argument to is_close and all_close to control whether NANs need to be equal, and updated the corresponding test programs.

src/stdlib_math_is_close.fypp Outdated Show resolved Hide resolved
Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @zoziha . Here are some comments and suggestions.

doc/specs/stdlib_math.md Outdated Show resolved Hide resolved
doc/specs/stdlib_math.md Outdated Show resolved Hide resolved
doc/specs/stdlib_math.md Show resolved Hide resolved
doc/specs/stdlib_math.md Show resolved Hide resolved
doc/specs/stdlib_math.md Show resolved Hide resolved
doc/specs/stdlib_math.md Outdated Show resolved Hide resolved
doc/specs/stdlib_math.md Show resolved Hide resolved
doc/specs/stdlib_math.md Outdated Show resolved Hide resolved
doc/specs/stdlib_math.md Outdated Show resolved Hide resolved
src/stdlib_math_is_close.fypp Outdated Show resolved Hide resolved
@jvdp1 jvdp1 added the waiting for OP This patch requires action from the OP label Nov 27, 2021
@zoziha
Copy link
Contributor Author

zoziha commented Dec 8, 2021

Three changes:

  • Use test-drive as the test dependency for **_close test routines;
  • rel_tol argument default value: 1.0e-9 -> sqrt(epsilon(..));
  • Other updates based on recommendations.

@awvwgk awvwgk removed the waiting for OP This patch requires action from the OP label Dec 8, 2021
Copy link
Member

@awvwgk awvwgk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only have a small edit. I'll commit it and merge this PR.

Thank you @zoziha


contains

#! Determines whether the values of `a` and `b` are close.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#! Determines whether the values of `a` and `b` are close.
! Determines whether the values of `a` and `b` are close.

@jvdp1 jvdp1 merged commit 2601bf1 into fortran-lang:master Dec 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reviewers needed This patch requires extra eyes topic: mathematics linear algebra, sparse matrices, special functions, FFT, random numbers, statistics, ...
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants