Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coxph_forestplot #838
base: main
Are you sure you want to change the base?
coxph_forestplot #838
Changes from 4 commits
74bab7c
aca8220
39e4d42
225b606
47a5b12
a32c121
0c7cd45
785a2cf
75ee4ba
541e505
ca4530a
f77f468
66815c4
76f0f0c
2108681
88e18a7
613be89
8c3d170
b3a7c21
cfdb1cc
f1b648b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we actually redesign the interface and have this function take an AnnData object instead? We'd have to store the coxph results in the AnnData object. This would be much more in line with the rest of the ehrapy experience.
Also ping @eroell
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. Thanks. We should do this indeed - storing the results used for the plot in adata.uns I suppose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might also need a
key_added
argument then or something to ensure that we don't always overwrite results. But yeah this behavior must be consistent across all of the surv analysis methodsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the kmf plot takes a list of KaplanMeierFitter Objects, so i went with that. in the survival analysis plots, ols is the only one taking an anndata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be consistent about it and I think we should consider modifying all of them to take
AnnData
. It's what the rest of the API doesThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, noticed where the idea here came from. Having .tl writing to AnnData, and .pl to read from it should be our go to. It has not been done like that everywhere yet unfortunately, and we strive for this consistency now :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one way to go about this is to store the required table for plotting of the lifeline's fitter object in
adata.uns[key_added]
upon the.tl
call;.uns
seems to me the only slot suitable for this table for the univariate functions such as kaplan_meier.For cox_ph here, as the coefficients are on a per-variable level, I think adding them into the
.var
column would be kinda clean; basically adding into.var
the columns{key_added}_{"coef"}
,{key_added}_{"coef lower 95%"}
,{key_added}_{"coef upper 95%"}
, .... There could also be a boolean{key_added}_{cox_ph_variable}
column, which the plotting function uses to pick only the variables used in the cox ph modelThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, seems that lifelines only draws the 95% errorbars and confidence intervals, did you also find no way to change this @aGuyLearning...?
If not, its something larger, and for now until someone requesting this, can keep like that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in our coxph wrapper, we don't let user set the confidence intervalls, so this is the default. can be checked after fitting in the .confidence_intervals_ dataframe of the coxphfitter object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can your test not work with
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using the smaller set we get the follwoing error:
ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthe
so i took the whole mimic_2 set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, makes sense. I think it's okay if we use the smaller subset and filter the warning here https://github.com/theislab/ehrapy/blob/main/pyproject.toml#L132