-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added Dense VI! #115
Added Dense VI! #115
Conversation
Thanks for this, looking good! The main thing we should think about is the storing of the covariance. Now although it is true that a Cholesky factor does have positive diagonal entries (which has inspired your log transformation) we could instead just learn a lower triangular matrix Let me know if that makes sense. I also wonder if there are some other dense VI implementations we can use to sanity check (I've had a quick scan but not really fruitful). |
…or, updated documentation and variable and function names accordingly.
@SamDuffield Thank you for your feedback. I agree that the positive diagonal constraint need not be enforced, as the end user ought only to care for the covariance matrix, and not the explicit Cholesky running in the background. I have modified the code to remove that constraint and updated the documentation, as well as function and variable names to reflect that the "Cholesky" is not being maintained, but rather an "L_factor." I didn't find any other dense VI code myself either. I just implemented based on my understanding of https://arxiv.org/abs/1601.00670, as well as the diagonal method. Let me know if anything else needs changing. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! Thanks for doing it!!
I gave a stab at implementing Dense VI.
The state maintains both the covariance matrix and a flat representation of the nonzero elements of the Cholesky. The log of the diagonal of the Cholesky is taken in this flat representation and later exponentiated to ensure positivity of the diagonal. I added some helper functions to do this in utils since I figure they may be useful elsewhere as well. I also added tests and documentation for these helpers.
If this is too much overhead or if anything needs to be changed, please let me know and I will refactor accordingly. Thank you!