-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vectorise ebola model #211
Conversation
This is how benchmark results would change (along with a 95% confidence interval in relative change) if e99365a is merged into main:
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if e99365a is merged into main:
|
cb817b1
to
d4b31ab
Compare
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 6258c0c is merged into main:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @pratikunterwegs. I've only taken a relatively quick look through this PR but everything seems to work well.
I've left a few comments throughout the diffs. I haven't had time to look properly into the seed management, happy to look again another time, but no need to wait on merging this PR as I can investigate when reviewing future PRs.
sorted = FALSE | ||
) | ||
|
||
# Send warning when > 10,000 runs (including replicates) are requested |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the runtime of 10,000 runs? I struggle to gauge whether this is a sensible threshold to warn users. Is there also a memory usage aspect to this that should also be considered when running many thousand runs, e.g. if a user has a laptop with 1GB of RAM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably depends on hardware - 10,000 feels like a lot even with replicates. If you remember, the Gaza work was 35K runs overall, over 11 pathogens. So 10K is probably a good middle ground where it's not too low, but also a fair warning threshold.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a reprex with a single run of 10k replicates. Probably higher with interventions added. It's not very slow on my hardware, but still worth a warning I'd think.
library(epidemics)
library(ggplot2)
# Prepare population and parameters
demography_vector <- 67000 # small population
contact_matrix <- matrix(1)
# manual case counts divided by pop size rather than proportions as small sizes
# introduce errors when converting to counts in the model code; extra
# individuals may appear
infectious <- 1
exposed <- 10
initial_conditions <- matrix(
c(demography_vector - infectious - exposed, exposed, infectious, 0, 0, 0) /
demography_vector,
nrow = 1
)
pop <- population(
contact_matrix = contact_matrix,
demography_vector = demography_vector,
initial_conditions = initial_conditions
)
# prepare integer values for expectations on data length
time_end <- 100L
replicates <- 10000L
tictoc::tic()
model_ebola(
population = pop,
replicates = replicates,
time_end = time_end
)
#> time demography_group compartment value replicate
#> <int> <char> <char> <num> <int>
#> 1: 0 demo_group_1 susceptible 66989 1
#> 2: 0 demo_group_1 exposed 10 1
#> 3: 0 demo_group_1 infectious 1 1
#> 4: 0 demo_group_1 hospitalised 0 1
#> 5: 0 demo_group_1 funeral 0 1
#> ---
#> 6059996: 100 demo_group_1 exposed 78 10000
#> 6059997: 100 demo_group_1 infectious 107 10000
#> 6059998: 100 demo_group_1 hospitalised 43 10000
#> 6059999: 100 demo_group_1 funeral 6 10000
#> 6060000: 100 demo_group_1 removed 395 10000
tictoc::toc()
#> 44.941 sec elapsed
Created on 2024-04-17 with reprex v2.0.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that a warning is useful in this case as it reassures the user that their session isn't hanging and that something is running in the background that may take a substantial amount of time.
Something for a possible future PR, it might be worth restating "This may take some time"
with "This can take several minutes"
, to be clearer to user how long they should expect to wait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something for a possible future PR, it might be worth restating "This may take some time" with "This can take several minutes", to be clearer to user how long they should expect to wait.
An alternative would be add a cli progress bar especially since you already import {cli}.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll take a look into this.
@@ -111,20 +112,21 @@ This can also be interpreted as the proportion of funerals that are ebola-safe. | |||
## Run epidemic model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving my comment here as L101 has not changed to make the comment there.
Why are deaths and recovered in the same compartment? It might be worth adding a bit more explanation as to why this is the case other than saying it does not affect model dynamics. What if someone wanted to estimate the total number of deaths from the outbreak?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proximate reason really is that it's because neither really affects the outbreak size any further; the underlying reason is that removals due to deaths and recoveries weren't separated in the Li et al. consensus model if I recall correctly. It should be fairly easy to estimate deaths by scaling the epidemic size by the CFR.
@sbfnk or @adamkucharski - you've actually responded to Ebola - would separating deaths from recoveries be useful? I'm happy to raise an issue for this.
Thanks @joshwlambert for looking through this, you've already managed to catch a few issues that I'll fix (func docs, notes, captions). Will go through the comments and answer there. |
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 6258c0c is merged into main:
|
Thanks @joshwlambert for reviewing. I'm merging this PR now so the next one can be up for review; happy to take more feedback on the Ebola model as issues. |
This PR is WIP on a project to release {epidemics} v0.3.0 focusing on functionality related to the Ebola model.
This PR:
.model_ebola_internal()
; the user facingmodel_ebola()
holds code for input checking, parameter-scenario combination, and model output handling;replicates
argument tomodel_ebola()
with a default of 100 replicates;Note that the continuous benchmarking check is expected to fail (or show reduced performance) as the default for
model_ebola()
is now 100 replicates.A small example showing seed preservation:
Created on 2024-04-08 with reprex v2.0.2