
Thoughts on reusing/improving the bounding information #232

Closed
@segasai


Hi Josh,

This is definitely not a bug, just some thoughts on sampling issues I've run into recently.

The problem I've been struggling with over the last few months features a very complicated posterior in a high-ish (11-)dimensional space, with both very narrow and very broad features as well as multiple modes.
A constant struggle I have with this is missing parts of the posterior, due either to missed modes or to incorrect approximation by the bounding ellipsoids, etc.
The default way of dealing with this is to use a large number of live points, but obviously the code doesn't scale well beyond a few thousand. An alternative would be to do multiple runs and then merge them, but as far as I understand that isn't really correct if the different runs discover/miss different areas of the posterior.

Also, an annoying feature of doing multiple nested runs is that the bounding information from one set of runs is completely ignored by future runs, which seems like a waste.
Furthermore, as far as I understand, the rejected samples from a run could also be used to refine the bounding information, since the number of rejected samples is much larger than the number of accepted samples. Alternatively, it would be good to at least be able to verify, using all the samples from the run (accepted and rejected), that the bounds are correct, or ideally to adjust them.
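Verifying a bound against a set of samples reduces to an ellipsoid-membership test: a point x is inside the ellipsoid defined by center ctr and covariance cov when its squared Mahalanobis distance (x-ctr)^T cov^{-1} (x-ctr) is at most 1. A minimal standalone sketch (not dynesty API; `fraction_outside` is a hypothetical helper) of such a check:

```python
import numpy as np

def fraction_outside(points, ctr, cov):
    """Fraction of points falling outside the ellipsoid
    {x : (x - ctr)^T cov^{-1} (x - ctr) <= 1}.

    points : (n, d) array of sample positions
    ctr    : (d,) ellipsoid center
    cov    : (d, d) ellipsoid shape matrix
    """
    delta = np.atleast_2d(points) - np.asarray(ctr)
    # squared Mahalanobis distance of each point from the center
    d2 = np.einsum('ij,jk,ik->i', delta, np.linalg.inv(cov), delta)
    return float(np.mean(d2 > 1.0))
```

Running this over all accepted samples with likelihood above the current threshold would flag how much of the constrained prior mass the bound is missing; it is O(n d^2), so it should remain cheap even for a million points.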

I don't have a concrete plan for this, but it would be nice to have

  1. The list of all positions where the likelihood was evaluated during a run, together with the likelihood values, available for checking after the run.
    I don't think dynesty preserves that (doing so would probably require storing it on disk).
  2. A way of verifying the bounds using all the samples and bounds from the run. I think that functionality kind of exists, but I'm not sure it scales well to a million points or is easily accessible. It would also be a good general sanity check of the bounds.
  3. After-the-fact updating of the bounds using the large number of samples collected in 1.
  4. Running dynesty with the already-defined bounds updated in 3. I don't know whether that's possible already, but I think that's what batch runs are doing, AFAIU.
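Point 1 can already be approximated from the user side by wrapping the log-likelihood so that every call is recorded, whether or not the sampler later accepts the point. A minimal sketch, assuming only that the sampler calls the likelihood once per proposed position (`LoggedLikelihood` is a hypothetical helper, not part of dynesty):

```python
class LoggedLikelihood:
    """Wrap a log-likelihood callable so every evaluation is recorded.

    After a run, `points` and `logls` hold every proposed position and
    its log-likelihood, including the rejected ones that the sampler's
    results object drops.
    """

    def __init__(self, loglike):
        self.loglike = loglike
        self.points = []  # all evaluated positions
        self.logls = []   # corresponding log-likelihood values

    def __call__(self, x):
        logl = self.loglike(x)
        self.points.append(list(x))
        self.logls.append(logl)
        return logl
```

An instance of this would be passed to the sampler in place of the raw likelihood; for very long runs the two lists could be flushed to disk periodically instead of kept in memory, and the recorded set then feeds the bound-verification and bound-updating steps in 2 and 3.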

It would be interesting to hear what you think of this, and whether trying to implement some of this functionality would be useful.

S
