Overview
Currently, we use a deterministic SIR model (see `sir` and `sim_sir` in `models.py`) to predict everything. It has few parameters, which I think contributes to the tool's ease of use and adoption. However, accuracy is also of paramount importance, and multiple improvements have been proposed.
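For readers unfamiliar with the approach, here is a minimal sketch of a deterministic, discrete-time SIR model with a one-day step. It is not the code in `models.py`; the names `sir_step` and `simulate_sir` and the parameter values are hypothetical, chosen only to illustrate how few inputs such a model needs (a transmission rate `beta`, a recovery rate `gamma`, and initial compartment sizes).

```python
def sir_step(s, i, r, beta, gamma, n):
    """Advance the S, I, R compartments by one day (hypothetical helper)."""
    # New infections scale with contacts between susceptible and infected
    # people; clamp so we never infect more people than remain susceptible.
    new_infections = min(beta * s * i / n, s)
    new_recoveries = gamma * i
    return (
        s - new_infections,
        i + new_infections - new_recoveries,
        r + new_recoveries,
    )


def simulate_sir(s0, i0, r0, beta, gamma, n_days):
    """Return a list of (S, I, R) tuples, one per day, including day 0."""
    n = s0 + i0 + r0  # total population is constant in a closed SIR model
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(n_days):
        s, i, r = sir_step(s, i, r, beta, gamma, n)
        history.append((s, i, r))
    return history


# Illustrative values only: 1,000 people, 10 initially infected.
history = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, n_days=50)
peak_infected = max(state[1] for state in history)
```

Because the model is deterministic, the same inputs always give the same curve, which is part of why it is easy to use but also why it says nothing about uncertainty.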
Proposed Improvements
- This repo uses MCMC sampling to fit more probabilistic models. @sam-qordoba is trying to get the Bayesian SIR model working, but it has obscure requirements. Here is a Colab notebook of the main model; the repo works, but there are a few setup steps.
- Paper suggestion via Google AI: Bayesian Models for Heterogeneous Personalized Health Data - src
- Possibly useful Transformer model from Google: interpretable multi-horizon forecasting with deep learning. However, "unfortunately for this task at the moment, the amount of data seems very limited and expert human biases (e.g. it takes X days to show symptoms/recover etc.) seem more important." src
- The model should account for potential incoming infections from neighboring areas, rather than assuming a jurisdiction-wide lockdown.
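As one way to picture the last proposal, the sketch below adds outside infectious pressure to a discrete-time SIR loop. Everything here is an assumption made for illustration: the function name `simulate_sir_with_imports`, the `imported_infectious` parameter (a constant number of infectious visitors present each day), and the choice to let visitors contribute to transmission without being tracked in the local compartments. A real implementation might instead couple several regional models together.

```python
def simulate_sir_with_imports(s0, i0, r0, beta, gamma, n_days,
                              imported_infectious=0.0):
    """Discrete-time SIR where outside infectious visitors add to exposure.

    `imported_infectious` is a hypothetical input: the number of infectious
    people from neighboring areas mixing with the local population each day.
    """
    n = s0 + i0 + r0  # local population stays constant
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(n_days):
        # Local and imported infectious people both expose susceptibles.
        exposure = i + imported_infectious
        new_infections = min(beta * s * exposure / n, s)
        new_recoveries = gamma * i
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history


# With no local cases, a closed model never starts an outbreak,
# while a small amount of importation seeds one.
closed = simulate_sir_with_imports(1000, 0, 0, 0.3, 0.1, 30)
seeded = simulate_sir_with_imports(1000, 0, 0, 0.3, 0.1, 30,
                                   imported_infectious=5.0)
```

The extra input illustrates the trade-off raised under Concerns: the model becomes more realistic only if users can supply a credible value for it.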
Concerns
- "my understanding is the SIR model's more of a guesstimate that can be fit retrospectively but isn't that predictive for changing circumstances. It doesn't account for household contact, or hordes of folks driving their dying relatives from one jam-packed hospital to the next, or the larger consequences of jamming 200 octogenarians into a group home manned by underpaid attendants with a shortage of tests and protective gear. But we're fighting the epidemic blind, so it's what we've got." src
- "There has been a lot of talk about using more complex models, but the hurdles are (1) usability (2) uncertainty/unavailability of the required inputs. I think that the consensus is that better models would be better if they had well-constrained inputs and didn't make the tool harder for users to adapt to their local contexts. Otherwise better models would be worse." src
- "What are some additional inputs that are missing? Am very interested in modelling with different values for various activities, e.g. school transmission, intrahousehold transmission, workplace transmission. Obviously this is not straightforward. Moreover the SIR model totally leaves out the fact that different populations are more or less vulnerable, if one is looking at hospital capacity there are pretty sizeable regional population variations. For a first order guess it's useful, but the real world is full of cascading impacts that are very hard to guess." src
Definition of Done
This ticket is complete when we have a plan for improving the model that takes the concerns above into account.