Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 129 #131

Merged
merged 27 commits into from
Feb 8, 2020
Merged

Issue 129 #131

merged 27 commits into from
Feb 8, 2020

Conversation

adibender
Copy link
Owner

@adibender adibender commented Jan 2, 2020

  • additions:
    • predict.pamm returns predicted hazard/cumulative hazard/survival probability for each subjects survival time in newdata. Takes into account subject specific offset. newdata can be data.frame, is transformed to PED using same settings as for data used during fit.
    • predictSurvProb.pamm S3 extension for pamm objects so that PAMMs can be used with pec package. As for predict.pamm, newdata can be provided as untransformed data

@adibender adibender added this to the Model evaluation support milestone Jan 2, 2020
@adibender adibender self-assigned this Jan 2, 2020
@adibender adibender mentioned this pull request Jan 10, 2020
@adibender
Copy link
Owner Author

Once this is merged, I will push the updates to CRAN.

@adibender
Copy link
Owner Author

ping @fabian-s
Btw, wenn du keine Zeit hast, brauchen wir auch nicht unbedingt eine Review. Denke es sollte schon passen. Die Hauptarbeit wird eh vom pec package gemacht

R/predict.R Outdated Show resolved Hide resolved
R/predict.R Outdated

newdata %>%
group_by(.data[["id"]]) %>%
filter(row_number() == n()) %>%
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. why not return the entire dataframe? are there any downstream uses that rely on this being a dataframe with as many rows as observational units instead of as many as "intervals spent under risk"? The final value is pretty useless I think (unless type = "surv_prob", ok....)
  2. suggested change (but see above re "pred"):
if (type != "hazard") {
  newdata <- newdata %>%
    group_by(.data$id) %>%
    mutate(pred = cumsum(.data[["pred"]] * exp(.data[["offset"]])))
}
if (type == "surv_prob") {
  newdata <- newdata %>%
    mutate(pred = exp(-.data[["pred"]]))
}

newdata %>%
  pull(.data[["pred"]]) 

Copy link
Owner Author

@adibender adibender Feb 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I just wanted to provide some S3 method for prediction for consistency. So decided to return a vector of the same length as number of rows in input data (as is the case for most predict methods, e.g. lm, glm, gam). Thought about returning data frame with extra column, but this would replicate the functionality of add_* functions.
Maybe return the E(survival time|covariates)? 🤷‍♂️

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, sorry, obvs you want a vector return not a data.frame....
expected survival time makes a lot of sense for type = "response".
for the others, maybe a list of length n of (step)functions or a list of dataframes cbind(tstart, tstop, estimate) .

Copy link
Owner Author

@adibender adibender Feb 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only Q is how to obtain the expected survival times 😄

  1. calculate hazard (per interval) -> simulate survival times using rpexp -> use mean or median
  2. Calculate survival prob over follow up -> use median survival time (could be NA if S(t|x)>.5 for all t )
  3. Is it possible to calculate E(t|x) analytically, similar to parametric models?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analytically should be possible I think? the density of the pexp isn't to terrible, after all... if not, expected survival should still be ok to compute since expectation of a positive random variable is simply the integral of S(t), which we know how to compute from the fit. practical problem there is what the upper limit of the integral should be (certainly not just maximal follow-up Tmax, but if the hazard estimate in the final interval is small and S(Tmax) is still >>0, extending that last interval for a long time will pull the expectation way up....)

@fabian-s
Copy link
Collaborator

fabian-s commented Feb 4, 2020

sorry, pack ich grade nicht das wirklich im detail anzusehen.
das was ich geshen habe ist kleinkram, sieht sonst super aus und ist sicher ne super wichtige ergänzung mit dem pec-interface... 👍

R/predict.R Outdated
#' @export
predict.pamm <- function(
object,
newdata,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this have a se arg as well (and se.type = c("bla", "delta", "sim"))?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will try to implement. Is there a technical term for "bla". I usually call it direct transformation, but maybe there's a term for that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a technical term for "bla"

..comedy gold right here 🤣
direct would seem to be the more professional sounding choice.

@adibender
Copy link
Owner Author

Only fixing #128 and #129 for now. predict.pamm will be added in separate PR (see #136)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants