Error handling: Panics vs Result #41

boxtown · 2017-03-20T21:18:18Z

Currently the responsibility for guarding against exceptional cases (e.g. input not in valid domain, mathematically invalid operations etc) is passed to the user. We panic when an operation does not make mathematical sense (e.g. calculating the cumulative distribution function for discrete distributions at a negative input) which forces users to double check to make sure their inputs are valid. While this results in technically correct and predictable behavior from the API, I'm not sure if it's ergonomic or idiomatic and have been mulling over possibly introducing a Result based API either replacing or in addition to the stricter panic based API. This however warrants some discussion and I would love feedback from the community

The text was updated successfully, but these errors were encountered:

cbreeden · 2017-04-30T04:23:46Z

I think consistency is huge here. I did a cursory glance at other crates and noticed the following:

Quick Survey

STD

f64/f32 operations: No result or panics. NaN, Inf, and -Inf are used appropriately.
Arithmetic operations: panic! in debug on overflow.
Result is used for parsing.
Option is used for TryFrom conversion of numeric types (which can fail, otherwise From).

Rand crate

Distributions panic! for invalid input parameters (negative std, min > max, etc).

Num/Num-Traits

Ratios with zero denominator panic!s
Option is used for fallible type conversions (ie, Float -> BigInt if Float is not finite)
FromPrimitive and ToPrimitive traits use Option

It seems to me that a takeaway could be:

Option should be used for fallible conversions.
Result should be used for parsing from binary or utf8.
panic! for invalid inputs for mathematical operations, including with the initialization of such objects (like Ratio or distributions).
Use NaN, Inf and -Inf for floats as appropriate.
Provide "checked" variations when possible. These methods use an Option. (I suppose under the assumption that only one thing could go wrong; or it's usually assumed that the exact reason for failure isn't that important).

boxtown · 2017-05-01T23:45:22Z

@cbreeden Thanks for the feedback and yes, I agree consistency is key which I feel like overall the API is pretty consistent with the interface it provides. My main concern is whether or not the API as it stands is ergonomic for users. I've taken a page from the rand crate's book in panic!-ing on invalid mathematical inputs which I think is more idiomatic than generating NaNs, I could be wrong however as I have yet to really dogfood my own library. It would be interesting to see or hear if a Result-based API in addition to or in place of the panic-based one garners any interest. It would be a pretty big undertaking but I think one that should be hammered out way before a 1.0-release

cbreeden · 2017-05-02T01:54:41Z

Yeah ergonomics is good question. Some complaints that I've heard (about rust in general) is how difficult it is to write expressive numerical code. When I was using statrs, I was using it to simulate some markov chains. Personally, I had no interest in checking for the validity of my inputs for my distributions (and would rather let it panic, though I do understand why someone would disagree for their use case).

Honestly, I think the default API should panic and having a new_checked(...) API which could use a Results would be ideal. But that's just me talking about distributions, perhaps it's worth discussing other API's as well; like if the beta function should return a Result? Personally, I would probably suggest against this, and would rather see a beta and beta_checked variant.

Provided that the naming scheme is hashed out now, it may not be necessary to do any work before a 1.0 release if the idea is to provide _checked variants on respective operations. Adding new variants wouldn't be backwards incompatible.

boxtown · 2017-05-10T22:48:57Z

@cbreeden I agree and personally am leaning towards a _checked API in addition to the current one as well. Hopefully it's something I can start looking into after 0.7.0 is released

michiel-de-muynck · 2017-06-01T02:12:45Z

We panic when an operation does not make mathematical sense (e.g. calculating the cumulative distribution function for discrete distributions at a negative input)

I agree that it's best to panic if an operation does not make mathematical sense. A beta-distribution is only defined for α > 0 and β > 0, so it's ok to panic if the user supplies α = -4.

However, the cumulative distribution function at the point x is defined (for all distributions, discrete or not) as the probability that a sample is less than (or equal to) x. This makes sense for all x, for all distributions. The cdf of a distribution that can only take positive values is 0 at all negative input values, and that's what statrs should return too. Similarly, the cdf of a Bernoulli distribution at any x > 1 is simply 1, and that's also what statrs should return. The function cdf(x) should never panic for any x and any distribution.

Similarly, the probability density function is also defined for any x and any distribution. If a distribution cannot produce the value x (or get arbitrarily close to it), then the probability density at x is 0. So pdf(x) should also never panic.

Even at infinity, the pdf and cdf are well-defined.: pdf(-∞) = 0, pdf(∞) = 0, cdf(-∞) = 0, cdf(∞) = 1 for all distributions.

The only exception is NaN. I think it's ok to panic when the input is NaN. However, standard mathematical functions like ln(x) don't panic on NaN either, instead they simply return NaN. So it might be better to follow that convention. Perhaps even best of all might be to do a bit of both: panic if the user supplies NaN for a parameter like α or β for the beta distribution, but return NaN if the user calculates cdf(NaN). That way is actually the easiest to implement, as you don't have to do anything. If there are any bounds checks, NaN will automatically fail those checks, but if there aren't and the function is composed of basic mathematical functions, then those will return NaN and so will you.

If you agree with these changes, I can implement them and submit a pull request.

boxtown · 2017-06-01T02:47:27Z

@migi Semantically you are correct and as long as we're consistent I have no issue accepting the changes.

See GitHub issue statrs-dev#41. Don't panic when calculating a pdf, cdf or pmf whenever the input is a value that the distribution cannot attain. The pdf, pmf and cdf are still defined at those inputs and we should return the correct values. This commit also adds a few tests to each distribution to test that pdf(-inf)=0, pdf(inf)=0, cdf(-inf)=0 and cdf(inf)=1. It also uses simple numerical integration to test for continuous distributions that the integral of the pdf is approximately equal to the cdf, and for discrete distributions that the sum of the pmf is equal to the cdf.

boxtown · 2018-01-09T01:36:56Z

This has been resolved by #55 and #56

boxtown added discussion help wanted labels Mar 20, 2017

boxtown added this to the Future milestone Mar 20, 2017

boxtown mentioned this issue Jun 1, 2017

Implement _checked interface #55

Closed

michiel-de-muynck mentioned this issue Jun 2, 2017

Keep calm and carry on #56

Merged

boxtown closed this as completed Jan 9, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error handling: Panics vs Result #41

Error handling: Panics vs Result #41

boxtown commented Mar 20, 2017

cbreeden commented Apr 30, 2017

boxtown commented May 1, 2017

cbreeden commented May 2, 2017

boxtown commented May 10, 2017

michiel-de-muynck commented Jun 1, 2017 •

edited

Loading

boxtown commented Jun 1, 2017

boxtown commented Jan 9, 2018

Error handling: Panics vs Result #41

Error handling: Panics vs Result #41

Comments

boxtown commented Mar 20, 2017

cbreeden commented Apr 30, 2017

Quick Survey

STD

Rand crate

Num/Num-Traits

boxtown commented May 1, 2017

cbreeden commented May 2, 2017

boxtown commented May 10, 2017

michiel-de-muynck commented Jun 1, 2017 • edited Loading

boxtown commented Jun 1, 2017

boxtown commented Jan 9, 2018

michiel-de-muynck commented Jun 1, 2017 •

edited

Loading