Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Dense VI! #115

Merged
merged 17 commits into from
Nov 5, 2024
Merged

Added Dense VI! #115

merged 17 commits into from
Nov 5, 2024

Conversation

jcqcai
Copy link
Contributor

@jcqcai jcqcai commented Oct 24, 2024

I gave a stab at implementing Dense VI.

The state maintains both the covariance matrix and a flat representation of the nonzero elements of the Cholesky. The log of the diagonal of the Cholesky is taken in this flat representation and later exponentiated to ensure positivity of the diagonal. I added some helper functions to do this in utils since I figure they may be useful elsewhere as well. I also added tests and documentation for these helpers.

If this is too much overhead or if anything needs to be changed, please let me know and I will refactor accordingly. Thank you!

@SamDuffield SamDuffield added the new method New algorithm label Oct 26, 2024
@SamDuffield
Copy link
Contributor

Thanks for this, looking good!

The main thing we should think about is the storing of the covariance.

Now although it is true that a Cholesky factor does have positive diagonal entries (which has inspired your log transformation) we could instead just learn a lower triangular matrix $L$ without any positivity constraints that would still lead to a symmetric positive definite covariance $L L^T$. It wouldn't be the unique Cholesky factor of the covariance, but I don't think this is a big deal at all and would allow us to clean up the code a fair bit (I think).

Let me know if that makes sense. I also wonder if there are some other dense VI implementations we can use to sanity check (I've had a quick scan but not really fruitful).

@jcqcai
Copy link
Contributor Author

jcqcai commented Oct 26, 2024

@SamDuffield Thank you for your feedback.

I agree that the positive diagonal constraint need not be enforced, as the end user ought only to care for the covariance matrix, and not the explicit Cholesky running in the background. I have modified the code to remove that constraint and updated the documentation, as well as function and variable names to reflect that the "Cholesky" is not being maintained, but rather an "L_factor."

I didn't find any other dense VI code myself either. I just implemented based on my understanding of https://arxiv.org/abs/1601.00670, as well as the diagonal method.

Let me know if anything else needs changing. Thank you!

posteriors/vi/dense.py Outdated Show resolved Hide resolved
Copy link
Contributor

@SamDuffield SamDuffield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks for doing it!!

@SamDuffield SamDuffield merged commit 6af158b into normal-computing:main Nov 5, 2024
2 checks passed
@SamDuffield SamDuffield mentioned this pull request Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new method New algorithm
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants