unexpected number of cells identified by onACID in demo_online_cnmfE.ipynb #1413

Open
raymondwjang opened this issue Oct 16, 2024 · 5 comments

Comments

@raymondwjang

Hello,

Porting the discussion post to the Issues tab, since it seems like Discussions isn't monitored as regularly!

Environment:
1. Operating System (Linux, MacOS, Windows): MacOS
2. Hardware type (x86, ARM..) and RAM: ARM (Apple Silicon M1)
3. Python Version (e.g. 3.9): 3.11
4. Caiman version (e.g. 1.9.12): 1.11.3
5. Which demo exhibits the problem (if applicable): demo_online_cnmfE.ipynb
6. How you installed Caiman (pure conda, conda + compile, colab, ..):
by following the README.md, like

conda install -n base -c conda-forge mamba   # install mamba in base environment
mamba create -n caiman -c conda-forge caiman # install caiman
conda activate caiman  # activate virtual environment
caimanmanager install

8. Details:
I've just tried running demo_online_cnmfE.ipynb for the first time, and the comparison between the batch and online outputs is confusing me. Mainly, the batch process detected over 100 cells from the demo video while the online counterpart only found 6. I did not change any parameter values from the notebook defaults. Is this the expected performance for OnACID, or did I do something wrong?

Screenshots below.
onACID Processing Contour Plot:
onACID

Batch Processing Contour Plot:
batch

@sneakers-the-rat

sneakers-the-rat commented Oct 17, 2024

i actually can't replicate, but that sort of seems worse?

Running from same conditions (fresh env, never run caiman on this machine before, following instructions in OP, etc.) I get this:

batch (offline) approach:

Screenshot 2024-10-17 at 3 43 43 AM

online

Screenshot 2024-10-17 at 3 35 34 AM

the FoV doesn't look the same even, so i'm not sure what could be different there.

Looking at the tests, it looks like they aren't necessarily unit tests, but actually test whether values that are saved one line previous match what is loaded one line later?

    cnm.save('test_online.hdf5')
    cnm2 = cnmf.online_cnmf.load_OnlineCNMF('test_online.hdf5')
    npt.assert_allclose(cnm.estimates.A.sum(), cnm2.estimates.A.sum())
    npt.assert_allclose(cnm.estimates.C, cnm2.estimates.C)

so if there was some regression at some point it almost certainly wouldn't be caught, and the testing matrix doesn't test against mac/windows so OS differences are also in play.
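For illustration, a regression check of the kind discussed here could pin the number of detected components on a fixed demo movie to a reference count, within a tolerance. This is only a sketch under assumptions: `check_component_count`, the reference value, and the tolerance are hypothetical and not part of CaImAn's test suite. In a real test, `n_found` would have to come from the fitted model (for example the number of columns of `cnm.estimates.A`, assuming one column per component).

```python
def check_component_count(n_found: int, n_reference: int, rel_tol: float = 0.2) -> bool:
    """Return True if n_found is within a fractional tolerance of n_reference."""
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return abs(n_found - n_reference) <= rel_tol * n_reference


# Hypothetical numbers: a reference of 100 components with a 20% tolerance.
print(check_component_count(95, 100))  # within tolerance
print(check_component_count(6, 100))   # far outside tolerance
```

Unlike a save/load round trip, a check like this would fail if a change in the code or platform shifted the detection results on fixed input data.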

please advise on what might be causing these discrepancies because this level of inconsistency in results on what should be precisely equal exemplary demo data is a relatively serious concern for papers that use this tool <3

@raymondwjang
Author

raymondwjang commented Oct 17, 2024

@sneakers-the-rat oops, sorry. I forgot that I did have to change one parameter, because the default value was throwing an IncompleteRead error (#1414). The parameter is movie_ind in cell 5 of the demo_online_cnmfE.ipynb notebook.

movie_ind = 1   # 0 for avi, 1 for the tif (in case avi loading troubles). --> This variable has been changed from 0 to 1.
fnames = ['msCam13.avi', 'data_endoscope.tif']  # filename to be processed
fnames = [download_demo(fnames[movie_ind])] 

So basically I'm working with data_endoscope.tif instead of msCam13.avi, which explains the different FOV. Sorry about the confusion.

Still exploring a possible explanation. In the notebook, the default parameter for the number of frames used for initialization for OnACID is 300

init_batch = 300                               # number of frames for initialization (presumably from the first file)

Maybe this is an obvious result, but increasing this value does seem to push the OnACID cell identification to look more like the batch processing result (tried dialing this value up to init_batch=2500 on the CaImAn dataset image_YST).
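A comparison like this can be organized as a small sweep over `init_batch`. The sketch below is hedged: `sweep_init_batch` and `fake_run_online` are hypothetical names, and the fake runner only stands in for fitting OnACID with a given `init_batch` and counting the resulting components. Its return values are made up and say nothing about real data.

```python
from typing import Callable, Dict, Iterable


def sweep_init_batch(run_online: Callable[[int], int],
                     init_batch_values: Iterable[int]) -> Dict[int, int]:
    """Call run_online once per init_batch value and record the component count."""
    return {init_batch: run_online(init_batch) for init_batch in init_batch_values}


def fake_run_online(init_batch: int) -> int:
    # Placeholder only: replace with a function that fits OnACID using this
    # init_batch and returns the number of components it found.
    return init_batch // 10


counts = sweep_init_batch(fake_run_online, [300, 500, 1000])
for init_batch, n_components in counts.items():
    print(f"init_batch={init_batch}: {n_components} components")
```

Passing the runner in as a callable keeps the sweep independent of how the model is actually built, so the same loop could be reused across datasets.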

The OnACID paper's result section cites 500 frames for simulated data

Benchmarking on simulated data: To compare to ground truth spike trains, we simulated a 2000 frame dataset taken at 30Hz over a 256×256 pixel wide FOV containing 400 "donut" shaped neurons with Poisson spike trains (see supplement for details). OnACID was initialized on the first 500 frames.

and 1000 for real data.

To initialize our algorithm we use the CNMF algorithm on a short initial batch of data of length Tb, (e.g., Tb = 1000)

Application to in vivo 2p mouse hippocampal data: Next we considered a larger scale (90K frames, 480×480 pixels) real 2p calcium imaging dataset taken at 30Hz (i.e., 50 minute experiment). Motion artifacts were corrected prior to the analysis described below. The online algorithm was initialized on the first 1000 frames
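As a quick worked check of the figures quoted above, the share of each recording used for initialization and the length of the real recording can be computed directly. The variable names are illustrative; the numbers are the ones in the quoted passages.

```python
# Figures taken from the quoted OnACID paper passages.
simulated_frames, simulated_init = 2000, 500
real_frames, real_init, frame_rate_hz = 90_000, 1000, 30

simulated_fraction = simulated_init / simulated_frames  # 0.25
real_fraction = real_init / real_frames                 # about 0.011
real_duration_min = real_frames / frame_rate_hz / 60    # 50.0

print(f"simulated: {simulated_fraction:.1%} of frames used for initialization")
print(f"real: {real_fraction:.1%} of frames used for initialization")
print(f"real recording length: {real_duration_min:.0f} minutes")
```

So the simulated benchmark initializes on a quarter of the data, while the real-data example initializes on roughly one percent of it.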

And below are my results with running the image_YST dataset on the same demo_online_cnmfE.ipynb notebook:

Batch Processing (469 cells identified)
image

OnACID (using 500 frames for initialization)
image

OnACID (using 1000 frames for initialization) (177 cells identified)
image

I cannot speak to which one is closer to the ground truth, but the gap between the two seems pretty large, assuming the default parameter values for the batch and online processing steps are the same (which is what I'm assuming for the default values in the notebook). I do acknowledge that the notebook's default parameter values may be unsuitable for this specific data, but even then, one method returning more than 2.5 times as many cells as the other is surprising to me. My original question also persists: with default settings on one of the demo videos (data_endoscope.tif), OnACID identifies only 6 cells while batch CaImAn identifies more than 100.
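For reference, the ratios behind this comparison work out as follows. This is plain arithmetic on the counts reported in this thread; the variable names are illustrative.

```python
batch_cells = 469        # batch processing on image_YST
online_cells_1000 = 177  # OnACID with 1000 initialization frames

ratio = batch_cells / online_cells_1000
print(f"image_YST: batch / online = {ratio:.2f}")

# data_endoscope.tif: batch found "over 100" cells and OnACID found 6,
# so the ratio there is at least 100 / 6.
min_ratio_endoscope = 100 / 6
print(f"data_endoscope.tif: batch / online >= {min_ratio_endoscope:.1f}")
```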

Would really appreciate support here! If there's an explanation here or I've made a mistake somewhere, please let me know. Thanks again.

@raymondwjang changed the title from "unexpected onACID performance from demo_online_cnmfE.ipynb" to "unexpected number of cells identified by onACID in demo_online_cnmfE.ipynb" Oct 17, 2024
@pgunn
Member

pgunn commented Oct 17, 2024

I'll start looking into this soon; sorry for the delay (recovering from a surgery).

@raymondwjang
Author

No worries! Hope your recovery goes well 🙏

@pgunn
Member

pgunn commented Oct 21, 2024

Currently chasing both the regression question and the possibility that the parameters are not great for this.
