Problems with readme and codes #1

Open
JingyangKe opened this issue Jun 5, 2022 · 2 comments

Comments

@JingyangKe
Contributor

JingyangKe commented Jun 5, 2022

Hello, I am trying to reproduce the figures and have run into some problems:

  1. The README says "The lapse model can be fit at any time", but I find that glm-hmm/2_fit_models/fit_global_glmhmm/2_apply_post_processing.py requires lapse_model_params_one_param.npz and lapse_model_params_two_param.npz. So maybe we have to follow this order: fit_glm -> fit_lapse_model -> fit_global_glmhmm -> fit_individual_glmhmm?
  2. I notice that in the cluster array of fit_global_glmhmm, there are 20 iterations for each fold and each K. I want to double-check whether I am supposed to run all 20 iterations here, since they take a long time to run.
  3. The README says that to reproduce the figures for the other two datasets, one should "replace the IBL URL in the preprocessing pipeline scripts" and rerun the model-fitting part. However, after downloading the other two datasets, I find that their structure is quite different from the IBL one and does not seem to be compatible with the current preprocessing scripts. Could you release the preprocessing scripts for the other two datasets?

I would really appreciate it if you could help me with these problems!

@zashwood
Owner

zashwood commented Jun 5, 2022

Hi Jingyang, thanks for your interest in using the code, and for your message. To respond to each of your points:

  1. That's a good point about the lapse model fits being called in the post-processing script. While the GLM has to be fit before the global GLM-HMM (because it is used to initialize the global GLM-HMM), the lapse model fits are only used for model comparison purposes, hence my comment about being able to run the lapse model code at any time. But I agree that, given the current version of the post-processing code, the lapse model should be fit before the post-processing script is run (either before or after the global GLM-HMM fit itself) so that this script doesn't throw an error. I just updated the README, so hopefully this is clearer now.
  2. Yes - for the global fit, we use 20 initializations for each fold-K combination. While I think it would be acceptable to run the fits for fewer folds, I do think that it's important to use multiple initializations, so as to prevent yourself from getting stuck in a local minimum during fitting. I agree that this is computationally expensive, and we typically ran our code on a cluster, launching a separate job for each initialization-K-fold combination.
  3. I'll see what I can do re: releasing the preprocessing code for the other two datasets. Given that the analyses we apply to the other two datasets are so similar to those for the IBL dataset, I didn't think it would be of sufficient interest to release the whole pipeline for these datasets too, and I thought having these additional scripts might clutter the repo. But I'll do my best to clean up the preprocessing scripts, and add them to the 1_preprocess_data directory.
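
As a rough sketch of points 1 and 2 (this is not code from the repository): the snippet below enumerates one job per initialization-K-fold combination and checks that the two lapse model files named above exist before post-processing is attempted. The values of `K_VALS` and `N_FOLDS`, the function names, and the results directory are assumptions made for illustration; only the 20 initializations and the two `.npz` file names come from this thread.

```python
from itertools import product
from pathlib import Path

# Assumed values for illustration; only N_INITS = 20 comes from this thread.
K_VALS = [2, 3, 4, 5]
N_FOLDS = 5
N_INITS = 20

# File names required by 2_apply_post_processing.py, as reported above.
LAPSE_FILES = [
    "lapse_model_params_one_param.npz",
    "lapse_model_params_two_param.npz",
]


def build_job_grid(k_vals, n_folds, n_inits):
    """Return one (K, fold, init) tuple per cluster job."""
    return list(product(k_vals, range(n_folds), range(n_inits)))


def missing_lapse_files(results_dir):
    """Return the lapse model files that are not present in results_dir."""
    return [name for name in LAPSE_FILES
            if not (Path(results_dir) / name).exists()]


if __name__ == "__main__":
    jobs = build_job_grid(K_VALS, N_FOLDS, N_INITS)
    print(f"{len(jobs)} jobs to launch")  # 4 * 5 * 20 = 400 with the values above

    # "." is a placeholder; point this at wherever the lapse fits are saved.
    missing = missing_lapse_files(".")
    if missing:
        print("Fit the lapse model before post-processing; missing:", missing)
```

Each tuple in the grid can be mapped to a cluster array index, so reducing the number of folds shrinks the grid while still keeping all 20 initializations per fold-K pair.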

@JingyangKe
Contributor Author

Thanks a lot for your clarification!
