Binomial distribution #66
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,4 +19,5 @@ pom.xml | |
!template/pom.xml | ||
pom.xml.asc | ||
node_modules | ||
**.shadow-cljs | ||
**.shadow-cljs | ||
.#* |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -97,6 +97,46 @@ | |
{:pre [(<= 0 p 1)]} | ||
(Math/log (if v p (- 1.0 p)))) | ||
|
||
(defn log-fact | ||
"Returns the natural logarithm of `x` factorial." | ||
[x] | ||
{:pre [(>= x 0)]} | ||
(log-gamma-fn (inc x))) | ||
|
||
(defn log-bico | ||
Review comment: let's make this private, or add
Reply: I would actually make this a local fn with
Reply: totally. i made |
||
"Returns the natural logarithm of the binomial coefficient, `n` choose `k`." | ||
[n k] | ||
{:pre [(integer? n) | ||
(integer? k) | ||
(>= k 0) | ||
(>= n k)]} | ||
(if (or (zero? k) (= k n)) | ||
0 ;; log 1 | ||
(- (log-fact n) (log-fact k) (log-fact (- n k))))) | ||
|
||
(defn binomial | ||
"Returns the log-likelihood of a [Binomial | ||
distribution](https://en.wikipedia.org/wiki/Binomial_distribution) | ||
parameterized by `n` (number of trials) and `p` (probability of success in | ||
each trial) at the value `v` (number of successes)." | ||
[n p v] | ||
{:pre [(integer? n) | ||
Review comment: I think in ClojureScript,
Reply: yeah not sure, but it doesn't look like cljs checks for in emmy i don't see an https://github.com/mentat-collective/emmy/blob/main/src/emmy/value.cljc BUT if i understand the compatibility layer between clj and cljs, if we use integer? in clojure code and then compile it to cljs, the cljs version of integer? should be used when the code runs in a js environment. if that's the case, then it's plausible that this is the right predicate to use? |
||
(integer? v) | ||
(>= v 0) | ||
(>= n v) | ||
(<= 0 p 1)]} | ||
(cond | ||
Review comment: you might use
Reply: oh nice, done. |
||
(= p 0) (if (= v 0) | ||
0.0 ;; log(1) | ||
##-Inf) ;; log(0) | ||
(= p 1) (if (= v n) | ||
0.0 ;; log(1) | ||
##-Inf) ;; log(0) | ||
:else | ||
(+ (log-bico n v) | ||
(* v (Math/log p)) | ||
(* (- n v) (Math/log (- 1 p)))))) | ||
|
||
(defn cauchy | ||
"Returns the log-likelihood of a [Cauchy | ||
distribution](https://en.wikipedia.org/wiki/Cauchy_distribution) parameterized | ||
|
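For reference, the `:else` branch of `binomial` in the diff above appears to implement the standard binomial log-probability mass, with the log binomial coefficient expressed through the log-gamma function as in `log-fact` and `log-bico`. The notation below is a sketch that reuses the code's own names ($n$ trials, success probability $p$, $v$ successes); it is not taken from the PR itself.

$$
\log \Pr(V = v \mid n, p) = \log \binom{n}{v} + v \log p + (n - v) \log(1 - p),
\qquad
\log \binom{n}{k} = \ln \Gamma(n + 1) - \ln \Gamma(k + 1) - \ln \Gamma(n - k + 1)
$$

$$
p = 0:\ \log \Pr(V = v) = \begin{cases} 0 & v = 0 \\ -\infty & v > 0 \end{cases}
\qquad
p = 1:\ \log \Pr(V = v) = \begin{cases} 0 & v = n \\ -\infty & v < n \end{cases}
$$

The boundary cases are presumably handled separately because the general expression would otherwise multiply $0$ by $\log 0 = -\infty$, which yields NaN in floating-point arithmetic.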
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -59,6 +59,87 @@ | |
(Math/exp (dist/logpdf (->bernoulli p) (not v))))) | ||
"All options sum to 1"))) | ||
|
||
(defn binomial-tests [->binomial] | ||
;; boundaries... | ||
(testing "when p = 0 and v = 0, probability is 1, log(1) = 0" | ||
(is (ish? 0 (dist/logpdf (->binomial 10 0) 0)))) | ||
|
||
(testing "when p = 0 and v > 0, probability is 0, log(0) = -Inf" | ||
(is (= ##-Inf (dist/logpdf (->binomial 10 0) 1)))) | ||
|
||
(testing "when p = 1 and v = n, probability is 1, log(1) = 0" | ||
(is (ish? 0 (dist/logpdf (->binomial 10 1) 10)))) | ||
|
||
(testing "when p = 1 and v < n, probability is 0, log(0) = -Inf" | ||
(is ##-Inf(dist/logpdf (->binomial 10 0) 1))) | ||
Review comment: whoops, missing a space
Reply: fixed. should
Reply: looks like it's still there, I'm not sure, maybe it should? |
||
|
||
;; properties... | ||
(testing "sum of probabilities equals 1" | ||
(with-comparator (within 1e-9) | ||
(let [n 100 | ||
Review comment: how about making this a generative test? See the
Reply: great idea. i changed a bunch of tests over to generative. done. |
||
p 0.5 | ||
log-probs (map (fn [k] (dist/logpdf (->binomial n p) k)) (range 0 (inc n))) | ||
probs (map (fn [x] (Math/exp x)) log-probs) | ||
sum-probs (reduce + probs)] | ||
(is (ish? 1.0 sum-probs))))) | ||
|
||
(testing "symmetric when p = 0.5 such that binomial(k) = binomial(n -k)" | ||
(let [n 100 | ||
p 0.5 | ||
v 10] | ||
(is (ish? (dist/logpdf (->binomial n p) v) | ||
(dist/logpdf (->binomial n p) (- n v)))))) | ||
|
||
(testing "mean and variance consistency where mu = n * p and variance = mu(1 - p)" | ||
Review comment: can you say more, even if just in comments below, about what you are testing here?
Reply: yeah, i added comments with more context. the basic idea is to spot check a couple of well known properties of the distribution. if that seems weird though i can leave them out for sure. |
||
(with-comparator (within 1e-9) | ||
(let [n 100 | ||
p 0.3 | ||
ks (range 0 (inc n)) | ||
log-probs (map (fn [k] (dist/logpdf (->binomial n p) k)) ks) | ||
probs (map (fn [x] (Math/exp x)) log-probs) | ||
mu (reduce + (map * probs ks)) | ||
variance (reduce + (map (fn [k p] (* p (Math/pow (- k mu) 2))) ks probs)) | ||
theoretical-mu (* n p) | ||
theoretical-variance (* n p (- 1 p))] | ||
(is (ish? theoretical-mu mu)) | ||
(is (ish? theoretical-variance variance))))) | ||
|
||
(testing "spot check against scipy.stats.binom.logpmf (v1.12.0)" | ||
(with-comparator (within 1e-9) | ||
(is (ish? -7.13354688230902 (dist/logpdf (->binomial 1000000 0.5) 500000))) | ||
|
||
;; TODO: failing test (off by 1.9e-9) | ||
;; expected: (ish? -3.222306954272568 (dist/logpdf (->binomial 1000000 0.0001) 100)) | ||
;; actual: (not (ish? -3.222306954272568 -3.2223069561241857)) | ||
(is (ish? -3.222306954272568 (dist/logpdf (->binomial 1000000 0.0001) 100))) | ||
|
||
(is (ish? -8.047189562170502 (dist/logpdf (->binomial 5 0.2) 5))) | ||
(is (ish? -1.1856136373815076 (dist/logpdf (->binomial 50 0.99) 49))) | ||
(is (ish? -1.185613637381508 (dist/logpdf (->binomial 50 0.01) 1))) | ||
(is (ish? -693133.3650493873 (dist/logpdf (->binomial 1000000 0.5) 999999))) | ||
(is (ish? 0 (dist/logpdf (->binomial 10 0) 0))) | ||
(is (ish? 0 (dist/logpdf (->binomial 10 1) 10))) | ||
(is (ish? -2.02597397686619 (dist/logpdf (->binomial 100 0.9) 90))) | ||
(is (ish? -52.680257828913156 (dist/logpdf (->binomial 500 0.1) 0))))) | ||
|
||
(testing "spot check against gen logpdf (v0.4.6)" | ||
(with-comparator (within 1e-9) | ||
(is (ish? -7.133546882067904 (dist/logpdf (->binomial 1000000 0.5) 500000))) | ||
|
||
;; TODO: failing test (off by 1.9e-9) | ||
Review comment: delete these if unnecessary
Reply: done. i dialed down the precision threshold from |
||
;; expected: (ish? -3.222306954262436 (dist/logpdf (->binomial 1000000 0.0001) 100)) | ||
;; actual: (not (ish? -3.222306954262436 -3.2223069561241857)) | ||
(is (ish? -3.222306954262436 (dist/logpdf (->binomial 1000000 0.0001) 100))) | ||
|
||
(is (ish? -8.047189562170502 (dist/logpdf (->binomial 5 0.2) 5))) | ||
(is (ish? -1.185613637381516 (dist/logpdf (->binomial 50 0.99) 49))) | ||
(is (ish? -1.1856136373815152 (dist/logpdf (->binomial 50 0.01) 1))) | ||
(is (ish? -693133.3650493873 (dist/logpdf (->binomial 1000000 0.5) 999999))) | ||
(is (ish? 0 (dist/logpdf (->binomial 10 0) 0))) | ||
(is (ish? 0 (dist/logpdf (->binomial 10 1) 10))) | ||
(is (ish? -2.025973976866184 (dist/logpdf (->binomial 100 0.9) 90))) | ||
(is (ish? -52.680257828913156 (dist/logpdf (->binomial 500 0.1) 0)))))) | ||
|
||
(defn categorical-tests [->cat] | ||
(checking "map => categorical properties" | ||
[p (gen-double 0 1)] | ||
|
Review comment: no way, awesome! Is this a thing? I don't see it in the javadocs... https://docs.oracle.com/javase/8/docs/api/java/util/SplittableRandom.html
Reply: sike! i just yanked out the binomial stuff from java-util. i'll follow up with a separate PR to get samples going. then we can benchmark it against kixi and commons.
Reply: (i also added `binomial-gf-tests` that generatively samples from a distribution, which would have caught `.nextBinomial`)