Add Stein variational gradient descent as a sampling possibility, for increased performance #6
This is a six-page short summary of Stein Variational Gradient Descent:
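For orientation, the core particle update of SVGD (Liu & Wang, 2016) can be written as follows; this is the standard form from the literature, not a quote from the attached summary:

```latex
% SVGD: move n particles x_i toward the target density p(x)
x_i \leftarrow x_i + \epsilon\, \hat{\phi}^*(x_i), \qquad
\hat{\phi}^*(x) = \frac{1}{n} \sum_{j=1}^{n} \Big[ k(x_j, x)\, \nabla_{x_j} \log p(x_j) + \nabla_{x_j} k(x_j, x) \Big]
```

Here k is a kernel (usually RBF); the first term pulls the particles toward high posterior density, the second term repels them from each other so that they spread over the posterior.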
In principle one can use the ASVGD inference; however, one might have to play with the priors a bit to help the algorithm converge, and it is not yet recommended for use. A few things helped in my case to make it converge more robustly.
If you know how to apply SVGD or a similar variational inference method to reduce the computing time of the posteriors, let us know here.
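For reference, PyMC3 exposes this amortized variant directly through its fit interface; a minimal, untested sketch, assuming the repository's pm.Model is available as `model` (the iteration count is a placeholder):

```python
import pymc3 as pm

# `model` is assumed to be the pm.Model built elsewhere in the repository.
with model:
    approx = pm.fit(n=30000, method='asvgd')  # amortized SVGD; may need prior tuning to converge
    trace = approx.sample(1000)               # draw samples from the fitted approximation
```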
General comment: if you are hell-bent on using variational methods, consider looking at Pyro, since that framework was originally conceived for exactly that. SVGD exists there, but the implementation claims to be 'basic': https://pyro.ai/
Any particular reason why variational methods should be better? People use them to train Bayesian neural networks, where sampling becomes unfeasible with that number of parameters. But given the few parameters of the current model, that should not be a problem.
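To make that pointer concrete, here is a minimal sketch of Pyro's documented SVGD interface on a toy model; the model, data, and step counts below are illustrative and not taken from this repository:

```python
import torch
import pyro
import pyro.distributions as dist
from pyro.infer import SVGD, RBFSteinKernel
from pyro.optim import Adam

def model(data):
    # Toy model: unknown location of a normal likelihood.
    loc = pyro.sample("loc", dist.Normal(0.0, 10.0))
    with pyro.plate("data", data.shape[0]):
        pyro.sample("obs", dist.Normal(loc, 1.0), obs=data)

data = torch.randn(100) + 3.0
svgd = SVGD(model, RBFSteinKernel(), Adam({"lr": 0.1}),
            num_particles=100, max_plate_nesting=1)
for _ in range(500):
    svgd.step(data)                     # one SVGD update of all particles
particles = svgd.get_named_particles()  # dict: site name -> tensor of particles
```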
We are not hell-bent on it. My thought was that variational methods could eventually be parallelized, once we look at the level of Landkreise.
On this issue, no one is actively working. So if someone wants to have a look...
The next steps would be to use pymc3.SVGD with about 100 particles and try to apply it to example_bundeslaender, to see whether one reaches approximately good results faster. There one can also test whether Theano uses multiprocessing.
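A minimal sketch of what such a test could look like, assuming the pm.Model from example_bundeslaender is available as `model`; the step count, optimizer, and learning rate are placeholders that would need tuning:

```python
import pymc3 as pm

# `model` is assumed to be the pm.Model built in example_bundeslaender.
with model:
    approx = pm.fit(
        n=20000,                                    # number of optimizer steps (placeholder)
        method=pm.SVGD(n_particles=100, jitter=1),  # ~100 particles, as suggested above
        obj_optimizer=pm.adam(learning_rate=0.01),  # optimizer and learning rate need tuning
    )
    trace = approx.sample(1000)  # draw posterior samples from the particle approximation
```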
So basically, how can people help, @jdehning? A different question...
In PyMC3, Stein variational gradient descent is already implemented, but it has to be tested how well it works and how biased it is for a small number of particles. In addition, the optimal type of optimizer and the learning rate have to be found out. Eventually one could try to reparametrize the model to obtain a simpler posterior.
Reference for the bias: https://ferrine.github.io/blog/2017/06/04/asvgd-sanity-check/
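One standard reparametrization of this kind is the non-centered form for hierarchical variables; a hypothetical sketch with illustrative variable names (not taken from this repository's model):

```python
import pymc3 as pm

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=1.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    # Centered form (often a harder posterior geometry):
    #   theta = pm.Normal("theta", mu=mu, sigma=sigma, shape=16)
    # Non-centered form: sample standard normals and shift/scale them,
    # which typically decorrelates the posterior and makes it simpler.
    theta_raw = pm.Normal("theta_raw", mu=0.0, sigma=1.0, shape=16)
    theta = pm.Deterministic("theta", mu + sigma * theta_raw)
```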