Tools for testing identities #1671
This would be a great feature. I've thought about doing this in the past but have always been too lazy.
I don't see how this could work in general, because how would we know what arguments were legal and what shape they should be?
It'd be easy to set it up the same way we set up expect_ad so that you supply arguments. It could be done with parameter packs for generality:
template <typename F, typename G, typename... Ts>
void expect_identity(const ad_tolerances& tols, const F& lh, const G& rh, Ts... args);
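For concreteness, here is a minimal editorial sketch (not existing Stan Math test code) of how the all-var case of such a helper could look with parameter packs, assuming C++17 for std::apply and fold expressions; the bare rel_tol / EXPECT_NEAR handling is a crude stand-in for whatever ad_tolerances and expect_near_rel would provide:

#include <stan/math.hpp>
#include <gtest/gtest.h>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <vector>

// Evaluate f on var copies of args; return the value and the gradient
// with respect to each argument.
template <typename F, typename... Ts>
void eval_value_and_gradient(const F& f, double& val,
                             std::vector<double>& grad, Ts... args) {
  auto xs = std::make_tuple(stan::math::var(args)...);
  stan::math::var y = std::apply(f, xs);
  y.grad();
  val = y.val();
  grad.clear();
  std::apply([&grad](auto&... x) { (grad.push_back(x.adj()), ...); }, xs);
  stan::math::recover_memory();
}

// Both sides of the identity must agree on the value and on every partial
// derivative, up to a relative tolerance (a real version needs care near 0).
template <typename F, typename G, typename... Ts>
void expect_identity(double rel_tol, const F& lh, const G& rh, Ts... args) {
  double lh_val, rh_val;
  std::vector<double> lh_grad, rh_grad;
  eval_value_and_gradient(lh, lh_val, lh_grad, args...);
  eval_value_and_gradient(rh, rh_val, rh_grad, args...);
  EXPECT_NEAR(lh_val, rh_val, rel_tol * std::fabs(lh_val));
  ASSERT_EQ(lh_grad.size(), rh_grad.size());
  for (std::size_t i = 0; i < lh_grad.size(); ++i) {
    EXPECT_NEAR(lh_grad[i], rh_grad[i], rel_tol * std::fabs(lh_grad[i]));
  }
}

With the lbeta lambdas from the example quoted below, this would be invoked as something like expect_identity(1e-8, lh, rh, 1.3, 2.7).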
I didn't understand the comment about just testing with all var. I'd like to test all levels of autodiff.
On Feb 4, 2020, at 11:17 AM, Martin Modrák ***@***.***> wrote:
Description
Using known identities is a powerful way to test the precision of our functions. Currently, using identities for derivatives is a bit tedious - you need to explicitly compute the derivative with respect to each variable on both the left- and right-hand sides of the identity and compare them. The proposal is to create a function that does this for you.
Example
The goal is to be able to write something like:
auto lh = [](const var& a, const var& b) {
  return stan::math::lbeta(a, b);
};
// Alternatively: auto lh = static_cast<var(*)(const var&, const var&)>(lbeta);
auto rh = [](const var& a, const var& b) {
  return stan::math::log_sum_exp(lbeta(a + 1, b), lbeta(a, b + 1));
};
expect_identity("successors", lh, rh, tolerance);
I would not expect expect_identity to check this for all possible instantiations. I believe there should be a separate test that all instantiations are numerically equal, and after that we should just test with all var to make things easier.
Also note that while I could test that lh - rh == 0 (with zero gradients all over), this is potentially problematic because the test tolerance would not reflect the magnitude of the actual values.
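(Editorial illustration of that point, not the actual expect_near_rel implementation: a relative comparison scales the allowed difference by the magnitude of the values, with an absolute fallback near zero, so the tolerance tracks the size of lh and rh.)

#include <algorithm>
#include <cassert>
#include <cmath>

// Sketch of a relative comparison: the allowed difference scales with the
// magnitude of the operands; near zero it falls back to an absolute check.
bool near_rel(double a, double b, double tol = 1e-8) {
  double scale = std::max(std::fabs(a), std::fabs(b));
  if (scale < tol) {
    return std::fabs(a - b) <= tol;  // both essentially zero
  }
  return std::fabs(a - b) <= tol * scale;
}

int main() {
  // A fixed absolute tolerance of 1e-8 on lh - rh would reject this pair,
  // even though the two values agree to ten significant digits.
  assert(near_rel(1.0e6, 1.0e6 + 1.0e-4));
  // The same fixed tolerance would accept this pair, which differs by 2x.
  assert(!near_rel(1.0e-6, 2.0e-6));
  return 0;
}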
Expected output
It tests that the value and all derivatives are close up to a relative tolerance, using expect_near_rel.
Current Version:
v3.1.0
@bob-carpenter you are correct, the arguments have to be provided; I updated the description.

Sorry for being unclear, there are IMHO two things to consider:

1. Limiting to reverse mode, should we run the same test for lbeta(double, double), lbeta(var, double), lbeta(double, var), and lbeta(var, var)? I think this is not very useful. Instead of having multiple testing procedures (test_ad, this one), each going through complex templating to invoke all the variants, I think we should instead separately test that all those variants give equal results (this test also IMHO can and should be really strict, e.g. use EXPECT_FLOAT_EQ; a sketch of such a test follows after this exchange) and then perform all the tests for identities / finite diff only on lbeta(var, var). Similarly for forward/mixed mode.

2. Should we run the same test for lbeta(var, var), lbeta(fvar, fvar), and lbeta(fvar<var>, fvar<var>)? This one I am less sure about, but I feel slightly opposed to it. AFAIK forward and mixed mode are less developed, and I suspect that strict tests the reverse mode would pass could be problematic for forward/mixed mode. So in the current state it would IMHO make sense to have separate expect_identity_rev, expect_identity_fwd, and expect_identity_mix functions, invoked in separate tests whose stringency is tailored to the individual autodiff levels.

Not sure if this is the best place for this discussion - should I move it to Discourse?

I wrote a prototype implementation for the […]

My preference would be to do both 1 and 2, because the implementations of all these cases can vary. If everything were just templated prim code, then we should be able to get away with only testing primitive inputs.

Either way. It's focused enough on this issue that I'm OK keeping the discussion here.
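(Editorial illustration, not code from the thread: a minimal sketch of the per-instantiation consistency test proposed in point 1 above, using lbeta; the test name and inputs are made up, and EXPECT_DOUBLE_EQ is the stricter comparison discussed later in the thread.)

#include <stan/math.hpp>
#include <gtest/gtest.h>

// Each var-containing instantiation must reproduce the double-only value
// exactly (up to EXPECT_DOUBLE_EQ); a fuller version would also compare the
// gradients of the var-containing signatures against each other.
TEST(MathRevScalFun, lbeta_instantiations_agree) {
  using stan::math::var;

  double a = 0.7;
  double b = 3.1;
  double dd = stan::math::lbeta(a, b);

  var a_vv = a, b_vv = b;
  EXPECT_DOUBLE_EQ(dd, stan::math::lbeta(a_vv, b_vv).val());

  var a_vd = a;
  EXPECT_DOUBLE_EQ(dd, stan::math::lbeta(a_vd, b).val());

  var b_dv = b;
  EXPECT_DOUBLE_EQ(dd, stan::math::lbeta(a, b_dv).val());

  stan::math::recover_memory();
}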
Sorry for not being clear, what I meant is that I think we should separate individual testing concerns (functional identities vs. consistency between instantiations). So a typical test would look like: […]

Limiting ourselves to just reverse mode for now, I am a bit less sure whether it makes sense to have […]. Hope I am making sense.
If it's only the complexity of templates that's leading to a proposal to test only […]. The problem with just doing […]. @syclik, do you have an opinion on this?
It's that the added complexity buys us almost nothing. In all codepaths I have seen in rev, the results and respective gradients of foo(var, var), foo(double, var) and foo(var, double) should be bitwise equal, or at worst should satisfy EXPECT_DOUBLE_EQ (which means at most 4 ULP difference).

If you believe this assumption is wrong, then my argument is invalid. But if it is correct, then after we have tested that the instantiations are equal, it follows that any reasonable test satisfied by one of them is also satisfied by the others.

Also, I don't think @syclik was tagged correctly in the previous post, so retagging.
On Feb 5, 2020, at 4:35 PM, Martin Modrák ***@***.***> wrote:
> It's that the added complexity buys us almost nothing. In all codepaths I have seen in rev, the results and respective gradients of foo(var, var), foo(double, var) and foo(var, double) should be bitwise equal, or at worst should satisfy EXPECT_DOUBLE_EQ (which means at most 4 ULP difference).
> If you believe this assumption is wrong, then my argument is invalid.
For many functions, foo(var, var) will have a completely separate implementation from foo(double, var). For example, addition and multiplication use two completely separate vari implementations for the (var, var) signature vs. (var, double) or (double, var). Because division can't exploit symmetry, there are three separate implementations.
If all that is available is a templated prim implementation, we should be able to get by with only checking values at the primitive level and that everything else compiles.
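To make the point above concrete, here is a simplified editorial illustration (written in the style of the rev-mode operators, not the actual Stan Math source) of why separate signatures need not agree bit for bit: each signature gets its own vari class with its own stored operands and its own chain() arithmetic.

#include <stan/math/rev/core.hpp>

namespace demo {
using stan::math::var;
using stan::math::vari;

// (var, var): both operands are tracked; chain() updates both adjoints.
struct multiply_vv_vari : public vari {
  vari* a_;
  vari* b_;
  multiply_vv_vari(vari* a, vari* b) : vari(a->val_ * b->val_), a_(a), b_(b) {}
  void chain() {
    a_->adj_ += b_->val_ * adj_;
    b_->adj_ += a_->val_ * adj_;
  }
};

// (var, double): a completely separate class with its own arithmetic.
struct multiply_vd_vari : public vari {
  vari* a_;
  double b_;
  multiply_vd_vari(vari* a, double b) : vari(a->val_ * b), a_(a), b_(b) {}
  void chain() { a_->adj_ += b_ * adj_; }
};

inline var multiply(const var& a, const var& b) {
  return var(new multiply_vv_vari(a.vi_, b.vi_));
}
inline var multiply(const var& a, double b) {
  return var(new multiply_vd_vari(a.vi_, b));
}
}  // namespace demo

Nothing forces the two chain() bodies, or, for more involved functions, the two value computations, to produce identical floating-point results.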
I think I see where we are missing each other, so at the risk of being a bit annoying, I'll try to rephrase once more - by "foo(var, var) and foo(double, var) should be bitwise equal" I do not mean that this is guaranteed by the implementation. I mean "we want them to be bitwise equal" AND "a test that fails if they are not bitwise equal would not be overly strict".

In that sense the current way of testing in expect_ad is very weak because it tests with relative tolerances 1e-4 and larger against a finite diff value. So currently, if say the foo(var, double) gradient is 10.001, the first element of the foo(var, var) gradient is 9.999, and the finite diff is 10, all tests pass. This to me seems undesirable.

We should IMHO test stronger than that (either bitwise equal or EXPECT_DOUBLE_EQ for corresponding values across all instantiations); different instantiations should not be allowed to have non-negligible effects on the computation, and if they do, we should IMHO change the way they are implemented.

If such tests are (can be) satisfied, there is IMHO little need to have other tests try all instantiations.
On second thought I am obviously making this discussion too abstract. It will probably serve all of us better if I polish my thoughts, draw a diagram or two and put it in a longer Discourse post. Thanks for your time with this so far.
On Feb 10, 2020, at 3:36 PM, Martin Modrák ***@***.***> wrote:

> I think I see where we are missing each other, so at the risk of being a bit annoying, I'll try to rephrase once more - by "foo(var, var) and foo(double, var) should be bitwise equal" I do not mean that this is guaranteed by the implementation. I mean "we want them to be bitwise equal" AND "a test that fails if they are not bitwise equal would not be overly strict".

That's right. We can't assume they'll be bitwise equivalent in either value or in derivative.

> In that sense the current way of testing in expect_ad is very weak because it tests with relative tolerances 1e-4 and larger against a finite diff value. So currently, if say the foo(var, double) gradient is 10.001, the first element of the foo(var, var) gradient is 9.999, and the finite diff is 10, all tests pass. This to me seems undesirable.

I agree that these tests are weak, but then it seems very unlikely that these tests are going to pass when the Hessian fails. As far as I know, we don't have a single Hessian specialization for fvar<var>.

One way to make these Hessian tests tighter would be to finite diff gradients rather than doubly finite differencing.
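(Editorial sketch of that suggestion; the helper name, functor requirements, and step size are made up, but stan::math::gradient is the existing reverse-mode gradient functional.)

#include <stan/math.hpp>
#include <Eigen/Dense>

// Estimate column i of the Hessian of f by central differences of the
// reverse-mode gradient: only one level of finite-differencing error enters,
// because the gradients themselves are exact up to rounding.
template <typename F>
Eigen::VectorXd fd_gradient_hessian_col(const F& f, const Eigen::VectorXd& x,
                                        int i, double h = 1e-5) {
  double fx;  // value output required by the gradient functional; unused here
  Eigen::VectorXd grad_plus;
  Eigen::VectorXd grad_minus;

  Eigen::VectorXd x_plus = x;
  Eigen::VectorXd x_minus = x;
  x_plus(i) += h;
  x_minus(i) -= h;

  stan::math::gradient(f, x_plus, fx, grad_plus);
  stan::math::gradient(f, x_minus, fx, grad_minus);

  return (grad_plus - grad_minus) / (2 * h);
}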
> We should IMHO test stronger than that (either bitwise equal or EXPECT_DOUBLE_EQ for corresponding values across all instantiations); different instantiations should not be allowed to have non-negligible effects on the computation, and if they do, we should IMHO change the way they are implemented.

I don't understand what you're suggesting here. If you're suggesting that our foo(var, double) and foo(double, double) and foo(var, var) produce bitwise equivalent results, that seems way too strict given that floating point is floating point, not proper real-valued arithmetic.

> If such tests are (can be) satisfied, there is IMHO little need to have other tests try all instantiations.

What's the scope of "such tests" here? Can you list the tests you think are sufficient?