Binv remodel #1486

Closed
wants to merge 6 commits into from
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -21,6 +21,7 @@ You may also find the [Upgrade Guide](https://rust-random.github.io/book/update.
 - Rename `rand::distributions` to `rand::distr` (#1470)
 - The `serde1` feature has been renamed `serde` (#1477)
 - Mark `WeightError`, `PoissonError`, `BinomialError` as `#[non_exhaustive]` (#1480).
+- Refactor inverse `Binomial` algorithm to permit n > i32::MAX (#1486).

 ## [0.9.0-alpha.1] - 2024-03-18
 - Add the `Slice::num_choices` method to the Slice distribution (#1402)
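Note on the changelog entry: `f64::powi` takes an `i32` exponent, which is where the n <= i32::MAX restriction comes from. The following is a minimal sketch, not taken from the PR, comparing `powi` with the `exp(n * ln q)` form of q^n. The helper names `pow_via_powi` and `pow_via_exp_ln` are hypothetical, and the sketch assumes 0 < q < 1.

```rust
/// Hypothetical helper: q^n via `powi`, only defined when n fits in an i32.
fn pow_via_powi(q: f64, n: u64) -> Option<f64> {
    i32::try_from(n).ok().map(|n| q.powi(n))
}

/// Hypothetical helper: q^n via exp(n * ln q), usable for any u64 n (q > 0).
fn pow_via_exp_ln(q: f64, n: u64) -> f64 {
    (q.ln() * (n as f64)).exp()
}

fn main() {
    // For moderate n the two forms agree to within rounding error.
    let q = 0.999_f64;
    let a = pow_via_powi(q, 1_000).unwrap();
    let b = pow_via_exp_ln(q, 1_000);
    assert!((a - b).abs() < 1e-9);

    // Beyond i32::MAX only the exp/ln form is available.
    let big_n = i32::MAX as u64 + 1;
    assert!(pow_via_powi(q, big_n).is_none());
    println!("q^n for n > i32::MAX: {}", pow_via_exp_ln(1.0 - 1e-10, big_n));
}
```

The sketch only shows the range restriction; it says nothing about the relative speed of the two forms, which the PR treats as an empirical question.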
57 changes: 52 additions & 5 deletions rand_distr/src/binomial.rs
@@ -123,6 +123,7 @@ impl Distribution<u64> for Binomial {

         let result;
         let q = 1. - p;
+        let np = (self.n as f64) * p;

         // For small n * min(p, 1 - p), the BINV algorithm based on the inverse
         // transformation of the binomial distribution is efficient. Otherwise,
@@ -136,19 +137,65 @@ impl Distribution<u64> for Binomial {
         // Ranlib uses 30, and GSL uses 14.
         const BINV_THRESHOLD: f64 = 10.;

+        // This threshold is when powi outperforms the .exp() .ln() method.
+        // However it's constrained by i32::MAX from powi and performs worse above this threshold.
+        // This value can likely be more finely optimized, but should be done across multiple hardware and in a more controlled setting.
+        // It's also such an edge case that very few people are likely to benefit from it.
+        const SMALL_NP_THRESHOLD: f64 = 1e-10;
+
         // Same value as in GSL.
         // It is possible for BINV to get stuck, so we break if x > BINV_MAX_X and try again.
         // It would be safer to set BINV_MAX_X to self.n, but it is extremely unlikely to be relevant.
         // When n*p < 10, so is n*p*q which is the variance, so a result > 110 would be 100 / sqrt(10) = 31 standard deviations away.
         const BINV_MAX_X: u64 = 110;

-        if (self.n as f64) * p < BINV_THRESHOLD && self.n <= (i32::MAX as u64) {
+        let mut r: f64;
+        if self.n == 1 {
+            // Use the BINV algorithm for special case n = 1 (simplify r calculations).
+            let s: f64 = p / q;
+
+            result = 'outer: loop {
+                r = q;
+                let mut u: f64 = rng.random();
+                let mut x = 0;
+
+                while u > r {
+                    u -= r;
+                    x += 1;
+                    if x > BINV_MAX_X {
+                        continue 'outer;
+                    }
+                    r *= (((2 - x) as f64) * s) / (x as f64);
+                }
+                break x;
+            }
+        } else if np < SMALL_NP_THRESHOLD && self.n <= (i32::MAX as u64) {
+            // For very small n*p the powi is superior.
+            // Use the BINV algorithm.
+            let s: f64 = p / q;
+
+            result = 'outer: loop {
+                r = q.powi(self.n as i32);
+                let mut u: f64 = rng.random();
+                let mut x = 0;
+
+                while u > r {
+                    u -= r;
+                    x += 1;
+                    if x > BINV_MAX_X {
+                        continue 'outer;
+                    }
+                    r *= (((self.n - x + 1) as f64) * s) / (x as f64);
+                }
+                break x;
+            }
+        } else if np < BINV_THRESHOLD {
+            // For everything else r = (q.ln() * (self.n as f64)).exp() is superior.
             // Use the BINV algorithm.
-            let s = p / q;
-            let a = ((self.n + 1) as f64) * s;
+            let s: f64 = p / q;

             result = 'outer: loop {
-                let mut r = q.powi(self.n as i32);
+                r = (q.ln() * (self.n as f64)).exp();
                 let mut u: f64 = rng.random();
                 let mut x = 0;

@@ -158,7 +205,7 @@ impl Distribution<u64> for Binomial {
                     if x > BINV_MAX_X {
                         continue 'outer;
                     }
-                    r *= a / (x as f64) - s;
+                    r *= (((self.n - x + 1) as f64) * s) / (x as f64);
                 }
                 break x;
             }
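Note on the inversion loop: it can be exercised without an RNG by passing the uniform draw in directly. The following is a minimal sketch under stated assumptions, not the code from this PR. `binv_from_uniform` is a hypothetical helper that is not part of `rand_distr`, it assumes 0 < p < 1, and it returns `None` where the diff would retry with a fresh draw. It also stops at `x > n`, a guard the diff does not have, so that `n - x + 1` cannot underflow in this standalone form.

```rust
/// Hypothetical helper, not part of `rand_distr`: BINV inversion driven by a
/// caller-supplied uniform draw `u` in [0, 1).
fn binv_from_uniform(n: u64, p: f64, mut u: f64, max_x: u64) -> Option<u64> {
    let q = 1.0 - p;
    let s = p / q;
    // P(X = 0) = q^n, computed via exp/ln so that n may exceed i32::MAX.
    let mut r = (q.ln() * (n as f64)).exp();
    let mut x: u64 = 0;
    while u > r {
        u -= r;
        x += 1;
        // Assumption of this sketch: also stop at x > n (not in the diff).
        if x > max_x.min(n) {
            return None;
        }
        // P(X = x) = P(X = x - 1) * (n - x + 1) / x * (p / q)
        r *= (((n - x + 1) as f64) * s) / (x as f64);
    }
    Some(x)
}

fn main() {
    // n = 2, p = 0.5: P(0) = 0.25, P(1) = 0.5, P(2) = 0.25.
    assert_eq!(binv_from_uniform(2, 0.5, 0.10, 110), Some(0));
    assert_eq!(binv_from_uniform(2, 0.5, 0.50, 110), Some(1));
    assert_eq!(binv_from_uniform(2, 0.5, 0.90, 110), Some(2));

    // n above i32::MAX with n * p = 5, which `powi` could not handle.
    assert_eq!(binv_from_uniform(5_000_000_000, 1e-9, 0.5, 110), Some(5));

    // The old and new recurrence factors agree: a / x - s == (n - x + 1) * s / x
    // with a = (n + 1) * s.
    let (n, x, s) = (20u64, 7u64, 0.25f64);
    let old = ((n + 1) as f64) * s / (x as f64) - s;
    let new = (((n - x + 1) as f64) * s) / (x as f64);
    assert!((old - new).abs() < 1e-12);
}
```

The last check shows that the change to the `r *= ...` line is an algebraic rewrite of the same factor. In this sketch the only behavioural change is how q^n is obtained.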