We have drafted the idea of adding a new SSE model type to bbr in #735.
This issue provides a full example of running an SSE analysis to help manage and structure the design decisions we still need to make. We opted to refactor the existing bootstrap functionality because much of the setup (procedurally), management, and summarization of SSE runs overlaps with bootstraps. However, we need to better understand where they diverge and what bbr should be responsible for.
Overall Design Decisions
Do we need more flexibility during the setup (e.g., setup_sse_run), or can users handle any case-specific setup beforehand relatively easily?
Try replacing the Simulate data step in the example using an mrgsolve simulation
Inspect the summary object and other function calls. Are we able to access everything we need with relative ease?
Is there anything else we should be capturing and/or making something easier to grab?
Is there a need for post-summary helper functions to perform standard SSE analyses (such as the initial_estimates_compare example)?
PsN does SSE a bit differently, in that it lets you provide "alternative models", while we have less control over the input simulated data. This is a much larger conversation that should likely be discussed in detail in a separate issue. However, it would be nice to discuss this at a higher level. Is this a "version 2" thing? How do we think most users expect SSE to work?
# This is a helper function we use in our test suite. Its function is just
# to add an MSFO = {x}.MSF to an $EST record.
# - Used with bbr::add_simulation
add_msf_opt <- function(mod, msf_path = paste0(get_model_id(mod), ".MSF")) {
  ctl <- bbr:::get_model_ctl(mod)
  mod_path <- get_model_path(mod)
  est <- nmrec::select_records(ctl, "est")[[1]]
  msf_path_ctl <- bbr:::get_msf_path(mod, .check_exists = FALSE)
  if (is.null(msf_path_ctl)) {
    nmrec::set_record_option(est, "MSFO", msf_path)
    nmrec::write_ctl(ctl, mod_path)
  } else {
    rlang::inform(glue::glue("MSF option already exists: \n - {est$format()}"))
  }
  return(mod)
}
# Sample quantiles for a given set of columns in a data frame
get_percentiles <- function(
    df,
    compare_cols,
    probs = c(0.5, 0.025, 0.975),
    na.rm = FALSE
) {
  comp_df <- df %>% dplyr::select({{ compare_cols }})
  quantile_fn <- function(x) {
    quantile(x, probs = probs, na.rm = na.rm)
  }
  comp_df <- comp_df %>%
    dplyr::reframe(dplyr::across(.cols = dplyr::everything(), .fns = quantile_fn)) %>%
    t() %>%
    as.data.frame() %>%
    tibble::rownames_to_column() %>%
    tibble::as_tibble()
  colnames(comp_df) <- c("parameter_names", paste0("p", probs * 100))
  return(comp_df)
}
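As a quick check of the shape `get_percentiles()` is meant to return, the sketch below computes the same percentiles in base R on a toy data frame. The `toy` data frame and its `ofv` and `cl` columns are invented for illustration; they are not output from a real SSE run.

```r
# Toy data standing in for a summary table (values are invented)
toy <- data.frame(ofv = c(1, 2, 3, 4, 5), cl = c(10, 20, 30, 40, 50))
probs <- c(0.5, 0.025, 0.975)

# sapply() gives one column per variable; transpose to one row per variable
q <- t(sapply(toy, quantile, probs = probs))

# Same layout as get_percentiles(): a name column plus one column per probability
pct <- data.frame(parameter_names = rownames(q), unname(q))
colnames(pct) <- c("parameter_names", paste0("p", probs * 100))
pct
```

Each row is one column of the input, and the remaining columns are named `p50`, `p2.5`, and `p97.5` after the requested probabilities.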
Simulate data
# Simulate ----------------------------------------------------------------
# New model with MSF saved out
# - MSF file is required for bbr::add_simulation()
# - We create a model, ensure it will output estimates to an MSF file, and then
#   simulate `N_SIM` times.
# - This is a working example using `bbr`, though we want to see how inputs could
#   vary when using mrgsolve. Do we need more flexibility below, or can some of it be
#   done using `dplyr`?

# Define the number of simulations
N_SIM <- 200

# Starting example model from bbr
model_dir <- system.file("model/nonmem/basic", package = "bbr")
mod1 <- read_model(file.path(model_dir, "1"))

# Submit a model we plan to simulate
mod2 <- copy_model_from(mod1, "2") %>% update_model_id()
mod2 <- add_msf_opt(mod2) # would normally be done manually
submit_model(mod2, .mode = "local")

# Simulate data - can also test with mrgsolve
add_simulation(mod2, n = N_SIM, .mode = "local", .overwrite = TRUE)
sim_data <- nm_join_sim(mod2)
New SSE run
# New SSE Run -------------------------------------------------------------
# new SSE run or read in previous runs
# mod2 <- read_model(file.path(model_dir, "2"))
# sse_run <- read_model(file.path(model_dir, "2-sse"))

# Can use `.suffix` to create multiple SSE designs from the same starting model
# sse_run <- new_sse_run(mod2, .suffix = "sse-design-1")
sse_run <- new_sse_run(mod2, .suffix = "sse")

# Set up the SSE run
# - This function takes a bbi_nmsse_model (created by a previous new_sse_run()
#   call) and creates `n` new model objects and re-sampled datasets in a subdirectory.
#   The control stream found at get_model_path(sse_run) is used as the "template"
#   for these new model objects, and the new datasets are sampled from the dataset
#   passed to data.
# - See ?setup_sse_run for more details
sse_run <- setup_sse_run(
  sse_run,
  # Simulation dataset
  # - Could filter to a specific design here:
  #   data = sim_data %>% dplyr::filter(DESIGN == 1),
  data = sim_data,
  # Stratification columns for sampling
  strat_cols = "SEX",
  # N simulations
  n = N_SIM,
  # Sample size for each dataset (uses ID column as KEY)
  sample_size = 30,
  # Simulation replicate column name (e.g., "IREP" for mrgsolve)
  # - Filters to each simulation before sampling.
  sim_col = "nn"
)

# Print to console to view SSE specifications
sse_run
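To make the sampling arguments above concrete, here is a minimal base R sketch of filtering to one simulation replicate and then drawing subjects within strata. It is not bbr's implementation: `sample_ids_stratified()` is a hypothetical helper, proportional allocation across strata is an assumption of this sketch, and the toy data are invented.

```r
# Hypothetical helper: sample subject IDs within one simulation replicate,
# keeping strata proportions close to those in the full data.
# This is NOT how bbr is known to do it; it only illustrates the idea.
sample_ids_stratified <- function(data, sim_value, sample_size,
                                  sim_col = "nn", id_col = "ID",
                                  strat_col = "SEX") {
  # Keep a single replicate, then reduce to one row per subject
  rep_data <- data[data[[sim_col]] == sim_value, , drop = FALSE]
  subjects <- unique(rep_data[, c(id_col, strat_col)])

  # Proportional allocation across strata (an assumption of this sketch)
  by_stratum <- split(subjects[[id_col]], subjects[[strat_col]])
  ids <- unlist(lapply(by_stratum, function(stratum_ids) {
    n_take <- round(sample_size * length(stratum_ids) / nrow(subjects))
    stratum_ids[sample.int(length(stratum_ids), n_take)]
  }), use.names = FALSE)

  # Return all records for the sampled subjects
  rep_data[rep_data[[id_col]] %in% ids, , drop = FALSE]
}

# Toy simulated data: 2 replicates, 40 subjects, 3 records per subject
set.seed(1)
toy_sim <- expand.grid(TIME = c(0, 1, 2), ID = 1:40, nn = 1:2)
toy_sim$SEX <- ifelse(toy_sim$ID <= 20, 1, 2)
toy_sim$DV <- rnorm(nrow(toy_sim))

sampled <- sample_ids_stratified(toy_sim, sim_value = 1, sample_size = 30)
length(unique(sampled$ID))                    # subjects in the sampled dataset
table(unique(sampled[, c("ID", "SEX")])$SEX)  # subjects per stratum
```

With rounding, the per-stratum counts may not always sum exactly to `sample_size` when strata are unbalanced. Whether that kind of case-specific handling belongs in `setup_sse_run()` or in user code beforehand is one of the open design questions above.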
Submit and get status
# Submit in batches
submit_model(sse_run, .batch_size=100)
# Get status of run completion along the way
get_model_status(sse_run)
Summarize and save results
# Summarize the parameter estimates, run details, and any heuristics of a SSE run,
# saving the results to a `sse_summary.RDS` data file within the SSE run directory.
# - See ?summarize_sse_run() for more details
sse_sum <- summarize_sse_run(sse_run)

# Print to console to view high level information about the run
sse_sum

# You can look at different summary tables within this object
sse_sum$analysis_summary
sse_sum$run_details
sse_sum$run_heuristics

# Read in SSE estimates. Faster once it's been summarized above^
sse_estimates(sse_run) # same as sse_sum$parameter_estimates

# Summary log for each run
# You can also find this information in sse_sum
# (e.g., sse_sum$analysis_summary has OFV's, termination codes, etc.)
summary_log(sse_run$absolute_model_path)
# Compare to initial or "true" estimates the SSE run is based on# - initial_estimates_compare() is a prototype function in bbr
initial_estimates_compare(sse_sum, probs= c(0.5, 0.025, 0.975))
# Look at OFV distribution# - get_percentiles() is a helper defined in this doc
get_percentiles(sse_sum$analysis_summary, compare_cols="ofv")
# Read in all SSE models - can be helpful for inspecting specific model runs
sse_mods <- get_sse_models(sse_run)
length(sse_mods)
model_summary(sse_mods[[1]]) %>% param_estimates()
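As a talking point for the post-summary helper question, the sketch below shows one common comparison of SSE estimates against the "true" values: percent relative bias and RMSE per parameter. It does not reproduce `initial_estimates_compare()`. The column names, the `true_vals` vector, and the simulated estimates are all invented for illustration.

```r
# Toy stand-in for SSE parameter estimates: one row per SSE model run.
# Column names and "true" values are invented for illustration.
set.seed(42)
true_vals <- c(THETA1 = 2, THETA2 = 50)
est <- data.frame(
  THETA1 = rnorm(200, mean = true_vals[["THETA1"]], sd = 0.2),
  THETA2 = rnorm(200, mean = true_vals[["THETA2"]], sd = 5)
)

# Mean estimate of each parameter alongside its true value
compare <- data.frame(
  parameter_names = names(true_vals),
  true = unname(true_vals),
  mean_est = unname(colMeans(est[names(true_vals)]))
)

# Percent relative bias of the mean estimate
compare$rel_bias_pct <- 100 * (compare$mean_est - compare$true) / compare$true

# Root mean squared error across runs
compare$rmse <- unname(sapply(names(true_vals), function(p) {
  sqrt(mean((est[[p]] - true_vals[[p]])^2))
}))
compare
```

A helper along these lines would need the true values as an input, which ties back to the question of how much control we have over the simulated data.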