Store only the covariance matrix in `CoxModel`, not the information #41
base: main
Conversation
Codecov Report: Base 99.63% // Head 99.64% (project coverage increases).

```
@@           Coverage Diff           @@
##             main      #41   +/-   ##
=======================================
  Coverage   99.63%   99.64%
=======================================
  Files           6        6
  Lines         276      283       +7
=======================================
+ Hits          275      282       +7
  Misses          1        1
```

View full report at Codecov.
Currently we're storing both the (observed) Fisher information matrix and its inverse, the variance-covariance matrix, separately in the `CoxModel` object. Since we know the observed information is positive semi-definite, we can instead store a (possibly pivoted) Cholesky factorization of the information. That allows us to easily construct both the original matrix and its inverse without needing to store both. When there are few parameters (say ≤ 3) and the information is positive semi-definite but not positive definite, this will actually use a bit more memory than storing the matrices separately. For any more parameters, however, regardless of whether the information is positive definite, this saves a good amount of memory, with the difference becoming more pronounced as the number of parameters grows.
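A minimal sketch of the idea, using a toy 2×2 information matrix (the names `info` and `C` here are illustrative, not actual `CoxModel` fields): store only the factorization and recover either matrix on demand.

```julia
using LinearAlgebra

# Toy observed information matrix (symmetric positive definite).
info = [4.0 2.0; 2.0 3.0]

# Store only the Cholesky factorization; both matrices are cheap to recover.
C = cholesky(Symmetric(info))

Matrix(C) ≈ info   # reconstructs the original information matrix
inv(C) * info ≈ I  # inv(C) recovers the variance-covariance matrix
```

Both expressions evaluate to `true`: `Matrix(C)` multiplies the stored triangular factor by its transpose, and `inv(C)` inverts via that factor without ever forming the information matrix explicitly.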
Force-pushed from bbe5bee to 2d96905.
I'm in favor of this but I'm wondering if we should just use
AFAICT the Fisher information field isn't meaningfully used anywhere, but the covariance matrix field is indeed used for uncertainty. The fields store the Hessian and its inverse after fitting; they aren't being passed anywhere during fitting.
The covariance matrix is currently being computed via `pinv`.
But with this PR it would fail, no? I'm not sure using `pinv` is sound when computing a variance estimate, so failing might be better.
No, at least as written it would use a pivoted Cholesky factorization.
Yeah I agree; not sure why it uses `pinv`.
Would it be enough to add something like `rank(M) < min(size(M)...) && throw(ArgumentError("model matrix is not full rank"))` to the `fit` method?

Your point about only using the field for uncertainty is a very good one; I wonder whether it's even worth storing the decomposition of the Fisher information, since we primarily care about the covariance matrix. We could store that instead, either dense or factorized. Thoughts on that?
src/cox.jl (Outdated)

```diff
@@ -119,7 +118,7 @@ StatsAPI.dof(obj::CoxModel) = length(coef(obj))

 StatsAPI.dof_residual(obj::CoxModel) = nobs(obj) - dof(obj)

-StatsAPI.vcov(obj::CoxModel) = obj.vcov
+StatsAPI.vcov(obj::CoxModel) = inv(obj.chol)
```
My point is that this one will error in the rank-deficient case, since

```julia
julia> inv(cholesky([0 0; 0 1], Val(true), check=false))
ERROR: SingularException(2)
Stacktrace:
 [1] chknonsingular
   @ /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/lapack.jl:49 [inlined]
 [2] potri!(uplo::Char, A::Matrix{Float64})
   @ LinearAlgebra.LAPACK /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/lapack.jl:3118
 [3] inv(C::CholeskyPivoted{Float64, Matrix{Float64}})
   @ LinearAlgebra /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/stdlib/v1.7/LinearAlgebra/src/cholesky.jl:688
 [4] top-level scope
   @ REPL[150]:1
```
Ohhh okay yeah, I see. And in that case there's no point in using a pivoted factorization, since we primarily care about the inverse. 👍
Could still be worth checking for collinearity ahead of time, though.
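For instance, an up-front check like the one discussed above would flag a collinear design. A toy illustration (the matrix `M` is made up for the example):

```julia
using LinearAlgebra

# Toy model matrix where the third column is the sum of the first two,
# so the matrix is rank deficient: rank 2, not 3.
M = [1.0 0.0 1.0;
     0.0 1.0 1.0;
     1.0 1.0 2.0;
     2.0 0.0 2.0]

# The proposed check condition; true here, so fitting would throw early.
deficient = rank(M) < min(size(M)...)
```

Here `deficient` is `true`, so a `fit` method guarded by this check would throw an `ArgumentError` before attempting any optimization.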
And store it as a matrix rather than a decomposition
Based on our discussion,
Let me know what you think, @andreasnoack!
```diff
@@ -224,6 +228,9 @@ Cox proportional hazard model estimate of coefficients. Returns a `CoxModel`
 object.
 """
 function StatsAPI.fit(::Type{CoxModel}, M::AbstractMatrix, y::AbstractVector; tol=1e-4, l2_cost=0)
+    if rank(M) < min(size(M)...)
+        throw(ArgumentError("model matrix is not full rank; some terms may be collinear"))
```
Alternatively, we could add a `dropcollinear` keyword argument to mimic GLM. This is probably hot garbage but I think we could implement it naively within the existing framework as

```julia
r = rank(M)
if r < min(size(M)...)
    if dropcollinear
        F = qr(M, ColumnNorm())
        pivot = F.jpvt
        keepcols = pivot[1:r]
        dropcols = pivot[(r + 1):end]
        M = M[:, sort(keepcols)]
    else
        throw(ArgumentError("model matrix is not full rank and `dropcollinear=false`"))
    end
end
```
We could then use `dropcols` to either go back and re-add terms with a coefficient of 0 or emit some kind of informative warning.
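To see the pivoted-QR column selection in action on a toy matrix (made up for illustration; column 2 is an exact multiple of column 1):

```julia
using LinearAlgebra

M = [1.0 2.0 1.0;
     2.0 4.0 0.0;
     3.0 6.0 1.0;
     4.0 8.0 0.0]   # column 2 == 2 * column 1, so rank(M) == 2

r = rank(M)
F = qr(M, ColumnNorm())            # column-pivoted QR
keepcols = sort(F.jpvt[1:r])       # independent columns to keep
dropcols = F.jpvt[(r + 1):end]     # redundant columns to drop or report
Mkept = M[:, keepcols]             # reduced, full-rank model matrix
```

The pivot vector `F.jpvt` orders columns so the first `r` span the column space; `Mkept` is then full rank, and `dropcols` identifies the terms that would get a zero coefficient or a warning.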
For reference, R's `coxph` fits the model and keeps around the redundant terms, but the result has all `NA` for the coefficients, statistics, and whatnot for those terms.
Currently we're storing both the (observed) Fisher information matrix and its inverse, the variance-covariance matrix, separately in the `CoxModel` object. We aren't actually using the information matrix for anything, though, so we can save space by storing only the covariance matrix. If folks need the information matrix (and indeed, if we extend `informationmatrix(::CoxModel; expected=false)`), we can take the inverse as needed. Alternatively, we could store a Cholesky decomposition of the information (or covariance), which is what this PR originally did.

Additionally, on `main`, the covariance matrix is obtained from the observed information via `pinv`, which allows obtaining a covariance matrix even in the case of rank deficiency in the model matrix, which doesn't really make sense. Instead, we can check for rank deficiency ahead of time and error out early, before even attempting to fit a model.
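To illustrate why `pinv` is questionable here, consider a hypothetical singular observed information (as would arise from a redundant term in the model matrix; the matrix below is a toy example):

```julia
using LinearAlgebra

# Singular observed information: the two parameters are confounded.
info = [1.0 1.0; 1.0 1.0]

# pinv succeeds and returns [0.25 0.25; 0.25 0.25] rather than erroring.
V = pinv(info)
```

The pseudoinverse assigns zero variance along the deficient direction, which is not a meaningful uncertainty estimate for a confounded parameter; checking the rank of the model matrix up front avoids ever reaching this point.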