RealVariableProcessor--CODA plot excludes observations #4

Open
jewellsean opened this issue May 8, 2014 · 4 comments
@jewellsean
Collaborator

One potential source of confusion when reading CODA plots generated by blang is that the number of samples N shown on the histogram may not match nMCMCSweeps. This is a side effect of writing the CODA files only on iterations that are a power of 2. In particular, the final iterations are never written to file: the next write interval is larger than the current iteration, and the sampler finishes before reaching it. For large nMCMCSweeps the gap can be large, since the spacing between successive powers of 2 doubles each time.
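
To make the size of the gap concrete, here is a minimal sketch. It assumes iterations are numbered from 1 and that a write happens exactly when the iteration count is a power of 2; the class and method names are hypothetical and not part of blang.

```java
// Hypothetical helper, not blang code: illustrates the power-of-2 write schedule.
final class PowerOfTwoSchedule {

    /** Last iteration at which a write happens: the largest power of 2 that is <= nSweeps. */
    static long lastWrite(long nSweeps) {
        if (nSweeps < 1) {
            throw new IllegalArgumentException("nSweeps must be >= 1");
        }
        return Long.highestOneBit(nSweeps);
    }

    /** Number of trailing iterations that never reach the output file. */
    static long missing(long nSweeps) {
        return nSweeps - lastWrite(nSweeps);
    }

    public static void main(String[] args) {
        for (long n : new long[] {1000, 10000, 100000}) {
            System.out.println("nMCMCSweeps=" + n
                + " lastWrite=" + lastWrite(n)
                + " missing=" + missing(n));
        }
    }
}
```

Under these assumptions, a run of 1000 sweeps last writes at iteration 512 and leaves 488 iterations out of the plot.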

An easy fix is to write to file on every iteration such that mcmcIteration modulo thinningPeriod = 0, that is, every time the processor is called. The processor does not appear to have access to nMCMCSweeps, and I am not convinced this is a detail we want to pass in. An alternative would be to force a final write on the last iteration.
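
The "force a final write" alternative can be sketched as follows. This is a standalone simulation, not the RealVariableProcessor code: it assumes iterations are numbered from 1, and the names `flushPoints` and `forceFinal` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation, not blang code: lists the iterations at which a write would occur.
final class WriteScheduleSketch {

    /**
     * Returns the iterations at which output is flushed under a power-of-2 schedule.
     * If forceFinal is true, one extra flush is added at the last iteration
     * unless that iteration was already a flush point.
     */
    static List<Integer> flushPoints(int nSweeps, boolean forceFinal) {
        List<Integer> points = new ArrayList<>();
        long next = 1; // next power of 2 at which to flush
        for (int iter = 1; iter <= nSweeps; iter++) {
            if (iter == next) {
                points.add(iter);
                next *= 2;
            }
        }
        boolean lastAlreadyFlushed =
            !points.isEmpty() && points.get(points.size() - 1) == nSweeps;
        if (forceFinal && nSweeps >= 1 && !lastAlreadyFlushed) {
            points.add(nSweeps);
        }
        return points;
    }

    public static void main(String[] args) {
        System.out.println(flushPoints(10, false));
        System.out.println(flushPoints(10, true));
    }
}
```

In a real sampler the final flush would have to be triggered either by an end-of-run callback or by giving the processor the total number of sweeps; which of the two fits blang better is left open here.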

@jewellsean jewellsean added the bug label May 8, 2014
@jewellsean
Collaborator Author

Profiling some test examples has shown that binc.Command.callWithInputStreamContents is called frequently by this processor, so RealVariableProcessor could likely be made more efficient. Currently, each sample is written to an output file, and then on exponentially spaced passes the CODA files are generated and plotted.

This feature is likely useful for debugging or to retrieve preliminary results before the sampler is finished, but I think we should have the ability to turn off writing until the end. For cross product parameter runs, this could add a significant amount of computation time.

I propose to include in context the number of total iterates, and the option described above.
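
One way the proposal could look, as a rough sketch only: a single decision function that takes the total number of iterations and two switches. All names here (`shouldGenerateCoda`, `codaEnabled`, `deferUntilEnd`) are hypothetical and do not reflect the actual blang options or context classes.

```java
// Hypothetical policy sketch, not blang code.
final class PlotPolicySketch {

    /**
     * Decides whether CODA files and plots should be generated at this iteration.
     * Assumes iterations are numbered from 1 to totalIterations.
     */
    static boolean shouldGenerateCoda(long iteration, long totalIterations,
                                      boolean codaEnabled, boolean deferUntilEnd) {
        if (!codaEnabled) {
            return false; // plotting switched off entirely
        }
        if (iteration == totalIterations) {
            return true; // always produce the final, complete plot
        }
        if (deferUntilEnd) {
            return false; // skip intermediate plots
        }
        // Intermediate plots only on powers of 2: exactly one bit set.
        return iteration > 0 && Long.bitCount(iteration) == 1;
    }

    public static void main(String[] args) {
        System.out.println(shouldGenerateCoda(8, 10, true, false));
        System.out.println(shouldGenerateCoda(8, 10, true, true));
        System.out.println(shouldGenerateCoda(10, 10, true, true));
    }
}
```

With this shape, preliminary plots remain available for debugging, while parameter sweeps can defer all plotting to the last iterate or disable it.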

@jewellsean jewellsean self-assigned this May 17, 2014
@junseonghwan
Collaborator

Hey Sean,

I'm not sure if it's your commit, but after I sync'd today it looks like I am getting a compilation error in blang.validation.CheckStationarity at line 353 (in collectStatistics method).

Can you verify if you also get the compilation error?

Thanks.

jewellsean added a commit that referenced this issue May 18, 2014
…ion to RealVariableProcessor to effectively turn off CODA plotting for stationary testing.
@jewellsean
Collaborator Author

Thanks for reminding me to check newly produced code against all of the JUnit tests via gradle build. The changes were pretty minimal despite a few files changing.

Importantly, for all distribution tests, you now need to add a line in the test:

algo.options.CODA = false; 

which tells the processor not to generate the CODA plots. Otherwise, the test could take an incredibly long time to complete due to the processor consistently plotting on the last iterate. If anyone has any comments or suggestions please let me know.

@alexandrebouchard
Collaborator

The fix does not seem to work universally. See for example TestExpFamPhyloModel (in conifer), which sets algo.options.CODA to false when run as a unit test, yet plots are still created.
