Implement fit in Stan; fully Bayesian (including sigma); different smoothing. #34

Open · wants to merge 48 commits into base: master
Commits
10e3df4
Adding environment.yml for conda.
Apr 22, 2020
7a3d4a2
Adding seaborn to environment.yml.
Apr 22, 2020
b8caa05
Adding .gitignore.
Apr 22, 2020
0e7a036
Deleting matplotlibrc.
Apr 22, 2020
5b2556b
Adding LICENSE.
Apr 22, 2020
f601cca
Added README.md.
Apr 22, 2020
73fa498
Adding link to rt.live to README.md.
Apr 22, 2020
c7155a4
Adding Stan code for model.
Apr 22, 2020
90d9b12
Back to Python 3.7 due to Mac OS X multiprocessing failures; adding t…
Apr 22, 2020
ceb36b8
NY single-state example works.
Apr 22, 2020
63dc817
All state fits done, but disagree with rt.live. Their bug or mine?
Apr 22, 2020
00e019c
Major update, plus running on today's data.
Apr 23, 2020
214d18e
Some better comments at top of notebook.
Apr 23, 2020
5f413e7
Merge remote-tracking branch 'upstream/master'
Apr 23, 2020
3b55db9
Better re-try options for fit.
Apr 24, 2020
a2dcbdb
Move to JHU data, update results.
Apr 24, 2020
22c277e
Major performance improvements via re-parameterization to take accoun…
Apr 25, 2020
2f10432
Merge remote-tracking branch 'upstream/master'
Apr 25, 2020
a756d31
Moved to causal exponential smoothing, based on k-sys/covid-19#30.
Apr 25, 2020
5a2ee04
Changed smoothing timescale to actual numbers based on delay times.
Apr 25, 2020
7093f6c
Moved back to gaussian smoothing.
Apr 25, 2020
5b0bde9
WIP: trying to get good sampling---problem with samlping in Rt space.
Apr 26, 2020
fb96a4c
Ready for run on Rusty---sampling is taking longer than I would like.
Apr 27, 2020
db910d8
Sampling finished.
Apr 27, 2020
2b64198
Moving to binomial model.
Apr 28, 2020
6ce3944
Run on today's data using binomial model, but it's *ugly*.
Apr 28, 2020
34a75a5
More commentary in notebook.
Apr 28, 2020
ddd1e77
Saving binomial model in notebook, but returning to Poisson model bec…
Apr 30, 2020
34e9dcf
Back to Poisson because total testing numbers are so error-filled.
Apr 30, 2020
43a462f
Run nightly.
May 1, 2020
50c1a59
Moved to negative binomial from Poisson; not much difference.
May 2, 2020
2e69458
Moving away from neg_binomial, also tighter prior on sigma.
May 3, 2020
0ae99fd
Account for log_sum_exp floor in Rt likelihood s.d. estimate.
May 3, 2020
561ae2b
Change scale in log_sum_exp linearization for Rt in likelihood-domina…
May 3, 2020
147e5ca
Start off the Rt_raw series with exp_cts[1] = L0 instead of k[1].
May 3, 2020
a3bed8c
Re-run on nightly data. Bleak picture now that sigma is more restric…
May 3, 2020
9aa9beb
Daily.
May 4, 2020
f5afc55
Nightly.
May 5, 2020
1730e99
WIP, adding daily adjustment factors and want to test on laptop.
May 7, 2020
22e2c0a
Nightly---a daily adjustment to the timeseries didn't work out.
May 7, 2020
2caf396
Nightly.
May 8, 2020
72f8887
Now show positive daily tests as well as R_t estimates.
May 8, 2020
9541f37
Update.
May 18, 2020
f6a47fd
Moving to binomial model. Key realization: smooth pos and neg timese…
May 19, 2020
d2de618
Sampling using binomial model. Curves suck.
May 19, 2020
32a1784
Moving to negative binomial model with cutoff at phi = 100.
May 29, 2020
aadfbf8
Don't need total counts in model any more.
May 29, 2020
108a03c
Latest test data.
May 29, 2020
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
.ipynb_checkpoints
23 changes: 23 additions & 0 deletions LICENSE
@@ -0,0 +1,23 @@
MIT License

This license applies to every file *except* Realtime R0.ipynb.

Copyright (c) 2020 Will M. Farr <[email protected]>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
116 changes: 116 additions & 0 deletions R0.stan
@@ -0,0 +1,116 @@
data {
int ndays;
int k[ndays]; /* Positive tests */

/* Parameters for the marginalization over serial time */
real tau_mean;
real tau_std;

/* sigma is given an N(0, scale) prior. */
real sigma_scale;
}

transformed data {
/* Parameters for a log-normal distribution on the serial time that match the
   given mean and s.d. */
real tau_mu = log(tau_mean / sqrt(1.0 + tau_std^2/tau_mean^2));
real tau_sigma = sqrt(log1p(tau_std^2/tau_mean^2));

/* The parameter phi controls the "excess variance" of the negative binomial
   over Poisson. For the negative binomial distribution, we have

<X> = mu

Var(x) = mu + mu^2/phi = mu*(1 + mu/phi)

So that for mu << phi the distribution behaves "like Poisson;" but for mu >>
phi we have

sqrt(Var(x))/<X> = 1/sqrt(phi)

that is: the *fractional* uncertainty asymptotes to the same fractional
uncertainty as Poisson with phi counts, no matter how large mu grows.

Here we choose phi = 100, semi-arbitrarily. This limits the relative
uncertainty in the positive rate to ~10% on any given day.

*/
real phi = 100.0;
}

parameters {
/* Serial time (days) */
real<lower=0> tau;

/* stepsize of the random walk in log(Rt) */
real<lower=0> sigma;

/* First day's expected number of infections. */
real L0_raw;

/* Will be transformed into Rt. */
real Rt_raw[ndays-1];
}

transformed parameters {
real L0;
real log_jac_L0;
real Rt[ndays-1];
real log_jacobian;

L0 = (k[1]+1)*exp(L0_raw/sqrt(k[1]+1));
log_jac_L0 = log(L0) - 0.5*log(k[1]+1);

/* Here we transform the raw variables into Rt following the AR(1) prior;
   because we are using a negative binomial observational model, as long as
   sigma is comparable to 1/sqrt(phi) ~ 0.1 (i.e. as long as the user
   supplies a small sigma_scale), the prior is comparable to the likelihood
   for each observation, and the raw parameters stay approximately
   uncorrelated in the posterior. */
{
real log_jac[ndays-1];

for (i in 2:ndays) {
if (i == 2) {
Rt[i-1] = 3*exp(2.0/3.0*Rt_raw[i-1]);
log_jac[i-1] = log(Rt[i-1]) + log(2.0/3.0);
} else {
Rt[i-1] = Rt[i-2]*exp(sigma*Rt_raw[i-1]);
log_jac[i-1] = log(Rt[i-1]) + log(sigma);
}
}

log_jacobian = sum(log_jac);
}
}

model {
real ex_cts[ndays];

/* Prior on serial time is log-normal with mean and s.d. matching input */
tau ~ lognormal(tau_mu, tau_sigma);

/* Prior on sigma, supplied by the user. */
sigma ~ normal(0, sigma_scale);

/* Initial expected counts, given a wide prior. */
L0 ~ lognormal(log(10), 1);
target += log_jac_L0;

/* The AR(1) process prior; we begin with a lognormal(log(3), 2/3) prior on
   Rt at the first sample (roughly centered on 3, based on Chinese studies),
   and then increment according to the AR(1) process. Above, we have computed
   the Jacobian factor between Rt and Rt_raw (which we sample in). */
Rt[1] ~ lognormal(log(3), 2.0/3.0);
for (i in 2:ndays-1) {
Rt[i] ~ lognormal(log(Rt[i-1]), sigma);
}
target += log_jacobian;

ex_cts[1] = L0;
for (i in 2:ndays) {
ex_cts[i] = ex_cts[i-1]*exp((Rt[i-1]-1.0)/tau);
}

/* Negative binomial likelihood for the counts on each day. */
k ~ neg_binomial_2(ex_cts, phi);
}
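
For reference, a minimal sketch of driving this model from Python with PyStan, assuming the unpinned pystan dependency in environment.yml resolves to the 2.x interface. The example counts, serial-time parameters, and sigma_scale below are illustrative placeholders, not values taken from this PR.

import numpy as np
import pystan

# Daily positive-test counts for one state (illustrative numbers only).
k = np.array([5, 8, 13, 21, 30, 41, 55, 70, 88, 110], dtype=int)

stan_data = {
    'ndays': len(k),
    'k': k,
    'tau_mean': 7.0,      # assumed mean serial time in days
    'tau_std': 2.0,       # assumed s.d. of the serial time in days
    'sigma_scale': 0.1,   # scale of the half-normal prior on the log(Rt) step size
}

model = pystan.StanModel(file='R0.stan')                    # compile the model above
fit = model.sampling(data=stan_data, iter=2000, chains=4)   # run 4 HMC chains

# Posterior draws of Rt have shape (ndraws, ndays - 1).
Rt = fit.extract()['Rt']
print(Rt.mean(axis=0))

# Sanity check of the generative recursion used in the model block:
#   ex_cts[i] = ex_cts[i-1] * exp((Rt[i-1] - 1) / tau)
tau_hat = fit.extract()['tau'].mean()
L0_hat = fit.extract()['L0'].mean()
ex_cts = [L0_hat]
for r in Rt.mean(axis=0):
    ex_cts.append(ex_cts[-1] * np.exp((r - 1.0) / tau_hat))
print(np.round(ex_cts))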
3 changes: 3 additions & 0 deletions README.md
@@ -0,0 +1,3 @@
Fitting the time-dependent transmission rate for COVID-19 from state-level data. Original example from https://github.com/k-sys/covid-19, updated by Will M. Farr to a fully Bayesian model using [Stan](http://mc-stan.org).

Results of the original fit are presented at https://rt.live/.
1,108 changes: 1,108 additions & 0 deletions Stan R0.ipynb

Large diffs are not rendered by default.

13 changes: 13 additions & 0 deletions environment.yml
@@ -0,0 +1,13 @@
name: covid-19
channels:
- conda-forge
- defaults
dependencies:
- pystan
- jupyterlab
- seaborn
- arviz
- scipy
- pandas
- python=3.7
- tqdm
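
A brief usage note (a sketch of the standard conda workflow, not taken from the PR): the environment above can typically be created with `conda env create -f environment.yml` and activated with `conda activate covid-19`; jupyterlab is included for running the notebook.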