From 04c3533347083263768ca1fe673e4cb4e677d42b Mon Sep 17 00:00:00 2001
From: EricThomson <thomson.eric@gmail.com>
Date: Sun, 22 Oct 2023 09:26:32 -0400
Subject: [PATCH] finished initial draft

---
 demos/notebooks/demo_pipeline_cnmfE.ipynb | 223 +++++++++-------------
 1 file changed, 95 insertions(+), 128 deletions(-)

diff --git a/demos/notebooks/demo_pipeline_cnmfE.ipynb b/demos/notebooks/demo_pipeline_cnmfE.ipynb
index 49c852dfe..92a7d2bf1 100644
--- a/demos/notebooks/demo_pipeline_cnmfE.ipynb
+++ b/demos/notebooks/demo_pipeline_cnmfE.ipynb
@@ -22,7 +22,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Imports and general setup"
+    "# Imports and general setup"
    ]
   },
   {
@@ -32,6 +32,7 @@
    "outputs": [],
    "source": [
     "import cv2\n",
+    "import glob\n",
     "from IPython import get_ipython\n",
     "import logging\n",
     "import matplotlib.pyplot as plt\n",
@@ -96,7 +97,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Select file(s) to be processed\n",
+    "# Select file(s) to be processed\n",
     "Here, we analyze the data in `data_endoscope.tif`. The `download_demo` function will download the  file for you and return the complete path to the file which will be stored in your `caiman_data` directory. If you adapt this demo for your data make sure to pass the complete path to your file. \n",
     "\n",
     "Note that the memory requirement of the CNMF-E algorithm are much higher compared to the standard CNMF algorithm. You should test your system before trying to process very large amounts of data."
@@ -115,7 +116,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Load and visualize raw data\n",
+    "# Load and visualize raw data\n",
     "We visualize using the built-in movie object, which is described in detail in `demo_pipeline.ipynb`. In addition to neural activity, you can also see blood flow in the movie."
    ]
   },
@@ -141,7 +142,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Set up a cluster\n",
+    "# Set up a cluster\n",
     "To enable parallel computing we will set up a local cluster. The resulting variable `cluster` contains the pool of processors (CPUs) that will be used in later steps. If you use `dview=cluster` in later steps, then parallel processing will be used. If you use `dview=None` then no parallel processing will be used. The `num_processors_to_use` variable determines how many CPU dores you will use (when set to `None` it goes to the default of one less than the number available):"
    ]
   },
@@ -183,7 +184,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Set up some parameters\n",
+    "# Set up some parameters\n",
     "We first set some parameters related to the data and motion correction and create a `params` object. We'll modify this parameter object later on with settings for source extraction. You can also set all the parameters at once as demonstrated in the `demo_pipeline.ipynb` notebook.\n",
     "\n",
     "Note here we are setting `pw_rigid` to `False` as our data seems to mainly contain large-scale translational motion. We can always redo this later if it turns out to be a mistake."
@@ -231,7 +232,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Motion Correction\n",
+    "# Motion Correction\n",
     "The background signal in micro-endoscopic data is very strong and makes motion correction challenging. Hence, as a first step the algorithm performs a high pass spatial filtering with a Gaussian kernel to remove the bulk of the lower-frequency background activity and enhance spatial landmarks. The size of the kernel is given from the parameter `gSig_filt`. If this is left to the default value of `None` then no spatial filtering is performed (default option, used in 2p data for CNMF). \n",
     "\n",
     "After spatial filtering, the NoRMCorre algorithm is used to determine the motion in each frame. The inferred motion is then applied to the *original* data, so no information is lost before source separation. The motion corrected files are saved in memory mapped format. If no motion correction is performed (i.e., `motion_correct` was set to `False`), then the file gets directly memory mapped.\n",
@@ -277,7 +278,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Compare original and motion corrected movie. Note they look pretty similar, as there wasn't much motion to begin with. You can see from the shift plot (plotted above) that the extracted shifts were all very small."
+    "Compare original (left) and motion corrected movie (right).\n",
+    "\n",
+    "You will probably notice they look pretty similar, as there wasn't much motion to begin with. You can see from the shift plot (plotted above) that the extracted shifts were all very small."
    ]
   },
   {
@@ -318,12 +321,12 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Parameter setting for CNMF-E\n",
-    "Everything is now set up to run source extraction with CNMFE.  We first will define some parameters. We will construct a new dictionary and use this to modify the *existing* `parameters` object, using the `change_params()` method.\n",
+    "# Parameter setting for CNMF-E\n",
+    "Everything is now set up to run source extraction with CNMFE. We will construct a new parameter dictionary and use this to modify the *existing* `parameters` object, using the `change_params()` method.\n",
     "\n",
-    "You will likely notice a few differences from the CNMF use case. We will explain the important parameters below. For now, note that we have set `gnb` to `0`: this is effectively the flag telling Caiman to use CNMFE instead of CNMF. \n",
+    "There are *two* main differences between the CNMF and CNMFE source separation algorithms. The first is the background model (this is discussed in the sidebar below on the Ring Model). The second difference is in how the models are initialized. This is addressed below when we go over setting corr/pnr thresholds for initialization, which we did not have to do for our 2p data.\n",
     "\n",
-    "There are *two* main differences between the CNMF and CNMFE source separation algorithms. The first is the background model (this is discussed in the sidebar below on the Ring Model). The second difference is in how the models are initialized. This is addressed below when we go over setting corr/pnr thresholds for initialization, which we did not have to do for our 2p data."
+    "We will explain the important differences in more detail below. For now, note that we have set `gnb` to `0`: this is effectively the flag telling Caiman to use CNMFE instead of CNMF. "
    ]
   },
   {
@@ -336,7 +339,7 @@
     "p = 1               # order of the autoregressive system\n",
     "K = None            # upper bound on number of components per patch, in general None for CNMFE\n",
     "gSig = np.array([3, 3])  # expected half-width of neurons in pixels \n",
-    "gSiz = 4*gSig + 1     # half-width of bounding box created around neurons during initialization\n",
+    "gSiz = 2*gSig + 1     # half-width of bounding box created around neurons during initialization\n",
     "merge_thr = .7      # merging threshold, max correlation allowed\n",
     "rf = 40             # half-size of the patches in pixels. e.g., if rf=40, patches are 80x80\n",
     "stride_cnmf = 20    # amount of overlap between the patches in pixels \n",
@@ -393,7 +396,7 @@
    "source": [
     "<div class=\"alert alert-info\">\n",
     "    <h2 >CNMF-E: The Ring Model</h2>\n",
-    "   Background activity is very ill-behaved with 1p recordings: it often fluctuates locally and is much larger in magnitude than the neural signals we want to extract. In other words, the large-scale background model used for CNMF is not sufficient for most 1p data. Hence, Pengcheng Zhou and others came up with a localized model of background activity for CNMFE: CNMFE represents the background at each pixel as the weighted sum of activity from a circle (or ring) of pixels a certain distance from that pixel. The distance of this ring from the reference pixel is set by the <em>ring_size_factor</em> parameter. This more complex pixel-wise background model explains why CNMFE is computationally more expensive than CNMF, and also why it works so well to mop up background noise to find the neurons in your 1p data. \n",
+    "   Background activity is very ill-behaved with 1p recordings: it often fluctuates locally and is much larger in magnitude than the neural signals we want to extract. In other words, the large-scale background model used for CNMF is not sufficient for most 1p data. Hence, Pengcheng Zhou and others came up with a localized model of background activity for CNMFE: CNMFE represents the background at each pixel as the weighted sum of activity from a circle (or ring) of pixels a certain distance from that pixel. The distance of this ring from the reference pixel is set by the <em>ring_size_factor</em> parameter. This more complex pixel-wise background model explains why CNMFE is computationally more expensive than CNMF, and also why it works better to mop up large-scale localized background noise to find the neurons in your 1p data. \n",
     "    \n",
     "When you set <em>gnb</em> in the CNMF model (usually to 1 or 2), you are setting the number of global background components to use. The fact that you can get away with so few is testament to how well-behaved the background activity is in 2p recordings compared to 1p. When we set <em>gnb</em> to 0 in Caiman, this is a flag telling Caiman's back end to switch to the ring model of the background activity. \n",
     "\n",
@@ -420,16 +423,18 @@
     "> `stride_cnmf` is the overlap between patches in pixels (the actual overlap is `stride_cnmf + 1`). This should be at least the diameter of a neuron. The larger the overlap, the greater the computational load, but the results will be more accurate when stitching together results from different patches. This param should probably have been called 'overlap' instead of 'stride'.\n",
     "\n",
     "`gSig (int, int)`: *half-width of neurons*\n",
-    "> `gSig` is roughly the half-width of neurons in your movie in pixels (height, width). It is related to the `gSiz` parameter, which is typically set to `4*gSig + 1` for CNMFE. `gSiz` is the size (in pixels) of a bounding box created around each seed pixel during initilialization.\n",
+    "> `gSig` is roughly the half-width of neurons in your movie in pixels (height, width). It is the standard deviation of the mean-centered Gaussian used to filter the movie before initialization for CNMFE. It is related to the `gSiz` parameter, which is the size (in pixels) of a bounding box created around each seed pixel during initialization. You will usually set `gSiz` to between `2*gSig` and `4*gSig` for CNMFE. \n",
     "\n",
     "`merge_thr (float)`: *merge threshold* \n",
     "> If the correlation between two spatially overlapping components is above `merge_thr`, they will be merged into one component. \n",
     "\n",
     "`min_corr` (float): *minimum correlation*\n",
-    "> Set a threshold correlation. We discuss this below.\n",
+    "> Pixels from neurons tend to be correlated with their neighbors. For initialization we select for pixels above a minimum correlation `min_corr`.  We discuss this more below.\n",
     "\n",
     "`min_pnr` (float): *minimum peak to noise ratio*\n",
-    "> Set a threshoild peak-to-noise ratio.  We discuss this below."
+    "> Set a threshold peak-to-noise ratio. Pixels from neurons tend to have a high signal-to-noise ratio. For initialization we select for pixels above a minimum peak-to-noise ratio `min_pnr`. We discuss this more below.\n",
+    "\n",
+    "As we did in `demo_pipeline.ipynb`, let's define a convenience function to get these key params for CNMFE so we can print them as we iteratively muck about in parameter space:"
    ]
   },
   {
@@ -472,10 +477,10 @@
    "metadata": {},
    "source": [
     "## Inspect summary images and set parameters\n",
-    "### correlation/pnr\n",
-    "For CNMFE, Caiman uses the correlation and peak-to-noise (PNR) ratio maps for initialization, which will both tend to be high in regions that contain neurons . Hence, we set a threshold for both quantitites to remove the low correlation/low pnr regions, and highlight the regions higher in both metrics, most likely to contain neuronal activity. \n",
+    "### Correlation-pnr plot\n",
+    "For CNMFE, Caiman uses the correlation and peak-to-noise ratio (PNR) for initialization, which will both tend to be high in regions that contain neurons. Hence, we set a threshold for both quantities to remove the low correlation/low pnr regions, and highlight the regions that are high in both metrics, which are the regions most likely to contain neuronal activity. \n",
     "\n",
-    "First, we calculate the correlation and pnr maps of the raw motion corrected movie after filtering with a mean-centered Gaussian (for more information, see the sidebar below). These calculation can be computationally and memory demanding for large datasets. Hence, you can compute on a subset of the data for long movies, and the results will not change by changing `images[::1]` to `images[::5]` or something similar, which is what we do in the following:"
+    "First, we calculate the correlation and pnr maps of the raw motion corrected movie after filtering with a mean-centered Gaussian with standard deviation `gSig` (for more information, see the sidebar below). These calculations can be demanding in both computation and memory for large datasets, so we subsample if there are many thousands of frames:"
    ]
   },
   {
@@ -486,7 +491,7 @@
    "source": [
     "print(gSig)\n",
     "gsig_tmp = (3,3)\n",
-    "correlation_image, peak_to_noise_ratio = cm.summary_images.correlation_pnr(images[::max(T//1000, 1)], # take all data, or subsample if more than 1000 frames\n",
+    "correlation_image, peak_to_noise_ratio = cm.summary_images.correlation_pnr(images[::max(T//1000, 1)], # subsample if needed\n",
     "                                                                           gSig=gsig_tmp[0], # used for filter\n",
     "                                                                           swap_dim=False) # change swap dim if output looks weird, it is a problem with tiffile"
    ]
@@ -496,9 +501,7 @@
    "metadata": {},
    "source": [
     "<img src=\"../../docs/img/bokeh_menu.jpg\" align=\"right\" width=\"200\"></img>\n",
-    "Using `nb_inspect_correlation_pnr()`, you can inspect the correlation and PNR images to find reasonable threshold values for `min_corr` and `min_pnr`. You can adjust the dynamic range in the plots shown below by choosing the Y-box select tool (third button from the left) and selecting the desired region in the histogram plots to the right of each image.\n",
-    "\n",
-    "We are looking for a couple of things in the following plot: first, don't pick a `gSig` value so large that we are clearly blending neurons together (you can try setting `gsig_tmp` to `(6,6)` to see what that is like). Second, select a range of corr and pnr values so that the *lower* threshold eliminates most of the noise and blood vessels, letting through as much of the neural elements as possible. "
+    "Using `nb_inspect_correlation_pnr()`, you can inspect the correlation and PNR images to find reasonable threshold values for `min_corr` and `min_pnr`. You can adjust the range of values displayed in the plots shown below by choosing the Y-box select tool (third button from the left -- highlighted in yellow in the accompanying image) and selecting the desired region in the histograms to the right of each image. You can also use the pan button (first button on the left) to zoom/adjust the axis limits in the histogram to make it easier to see the limits."
    ]
   },
   {
@@ -507,7 +510,18 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "nb_inspect_correlation_pnr(correlation_image, peak_to_noise_ratio, cmap='inferno') # jet, fire alternative maps"
+    "nb_inspect_correlation_pnr(correlation_image, peak_to_noise_ratio, cmap='inferno') # jet, fire are also good cmaps"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We are looking for a couple of things in the above plot:\n",
+    "1) Did we filter with a `gSig` value small enough so that we aren't blending different neurons together? To see what it is like when this happens, set `gsig_tmp` to `(6,6)` and inspect the above plots. \n",
+    "2) More importantly, we want to find the threshold correlation and pnr values so that the *lower* threshold eliminates most of the noise and blood vessels from the plots, leaving behind as many of the neural elements as possible. For this data the correlation lower bound will be somewhere between 0.8 and 0.9, and the pnr lower bound somewhere between 10 and 20 (as with CNMF, there is no perfect value: it is often an iterative search, but keep in mind it is better to have false positives later than false negatives).\n",
+    "\n",
+    "Use your judgment, and if you want to change the parameters you can do so here (here are some values that seem reasonable):"
    ]
   },
   {
@@ -516,17 +530,17 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# ChANGE Values!! gsig is good. let's change corr and pnr, as discussed more extensively in demo_notebook.ipynb. \n",
-    "\n",
     "print(key_params_cnmfe(cnmfe_model))\n",
     "\n",
-    "min_corr_new  = 0.85  # 0.7 0.4 # 0.66 #0.74\n",
-    "min_pnr_new = 12      #3  #2 #4.3 # 8.5\n",
+    "gsig_new = gSig # unchanged\n",
+    "min_corr_new  = 0.85 \n",
+    "min_pnr_new = 12     \n",
     "\n",
-    "cnmfe_model.params.change_params(params_dict={'min_corr': min_corr_new, \n",
+    "cnmfe_model.params.change_params(params_dict={'gSig': gsig_new,\n",
+    "                                              'min_corr': min_corr_new, \n",
     "                                              'min_pnr': min_pnr_new});\n",
     "\n",
-    "print(key_params_cnmfe(cnmfe_model))\n"
+    "print(key_params_cnmfe(cnmfe_model))"
    ]
   },
   {
@@ -541,11 +555,11 @@
    "metadata": {},
    "source": [
     "<div class=\"alert alert-info\">\n",
-    "    <h2>CNMFE initialization: More on the correlation and peak-to-noise-ratio</h2>\n",
+    "    <h2>CNMFE initialization: More on correlation and peak-to-noise-ratio</h2>\n",
     "     <img src=\"../../docs/img/mn_centered_gaussian.jpg\" align=\"right\" width=\"200\"></img>\n",
-    "How are correlation and peak-to-noise ratio calculated? First Caiman convolves the motion corrected movie with a <i>mean-centered Gaussian</i> (example to the right). The sigma of the Gaussian is <em>gSig</em>, and mean centering is turned on by setting <em>center_psf</em> to <em>True</em>. This mean centering creates a Gaussian with a positive peak in the middle of width <i>approximately</i> <em>gSig/2</em>, surrounded by a negative trench, and sets the outer edge to be zero. This preprocessing filter serves to highlight neurons and smooth away low-frequency background components.\n",
+    "How are correlation and peak-to-noise ratio actually calculated? First Caiman convolves the motion corrected movie with a <i>mean-centered Gaussian</i> (example to the right). The sigma of the Gaussian is <em>gSig</em>, and mean centering is turned on by setting <em>center_psf</em> to <em>True</em>. This mean centering creates a Gaussian with a positive peak in the middle of width <i>approximately</i> <em>gSig/2</em>, surrounded by a negative trench, and sets the outer edge to be zero. This preprocessing filter serves to highlight neuronal peaks and smooth away low-frequency background components.\n",
     "\n",
-    "The function <em>correlation_pnr()</em> applies this mean-centered Gaussian to each frame of the motion corrected movie, and returns the correlation image of that movie, as well as the peak-to-noise-ratio (PNR). The correlation image is the correlation of each pixel with its neighbors. The PNR is the ratio of the maximum magnitude at a pixel to the noise value at that pixel (it is a fast and rough measure of signal-to-noise).  As mentioned above, both of these values tend to be higher in actual neurons, and the CNMFE initialization procedure is to set a threshold for both quantities, take their <i>product</i>, and use the peaks in this product map to find <i>seed pixels</i> for initialization of the CNMFE source separation algorithm.\n",
+    "The function <em>correlation_pnr()</em> applies this mean-centered Gaussian to each frame of the motion corrected movie and returns the correlation image of that movie, as well as the peak-to-noise-ratio (PNR). The correlation image is the correlation of each pixel with its neighbors. The PNR is the ratio of the maximum magnitude at a pixel to the noise value at that pixel (it is a fast and rough measure of signal-to-noise). As mentioned above, both of these values tend to be higher in actual neurons, and the CNMFE initialization procedure is to set a threshold for both quantities, take their <i>product</i>, and use the peaks in this product map to find <i>seed pixels</i> for initialization of the CNMFE source separation algorithm.\n",
     "\n",
     "More details on the initialization procedure used here can be found in the <a href=\"https://elifesciences.org/articles/28728\">CNMFE paper</a>, or just by exploring the code.         \n",
     "</div>"
@@ -556,25 +570,7 @@
    "metadata": {},
    "source": [
     "### Quilt plot for spatial parameters\n",
-    "Blah blah blah"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "key_params_cnmfe(cnmfe_model)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "view_quilt?"
+    "As discussed in `demo_pipeline.ipynb`, the other important parameters are those used for dividing the movie into patches for parallelization of source separation. The same process is used for CNMFE. You want to be sure to pick `rf` and `stride` parameters so that many neurons fit in each patch, and at least one neuron fits in the overlap region between patches. You can visualize the patches using the `view_quilt()` function:"
    ]
   },
   {
@@ -604,39 +600,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "These patches and overlaps are on the *large* side, but that is ok for now. The main concern is making them too small, so let's just leave them be. The demo_notebook.ipynb goes through in some detail adjusting the spatial parameters."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "rf, stride_cnmf, gSig, gSiz, K, merge_thr"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "rf_new = rf  # unchanged\n",
-    "stride_new = stride_cnmf\n",
-    "gsig_new = gSig # unchanged\n",
-    "gsiz_new = (7,7) #gsig_new*4 + 1\n",
-    "k_new = K  # unchanged\n",
-    "merge_thr_new = merge_thr  # unchanged\n",
+    "These patches and overlaps are on the large side, but that is ok. The main concern would be making them too small, so let's just leave them be. The `demo_pipeline.ipynb` notebook goes through adjusting the spatial parameters in some detail, as we did above for the initialization params. The process would be the same here if you needed to change the patch parameters for your data.\n",
     "\n",
-    "print(f\"Before changing: {key_params_cnmfe(cnmfe_model)}\")\n",
-    "cnmfe_new_params = {'rf': rf_new,\n",
-    "                   'stride': stride_new,\n",
-    "                   'gSig': gsig_new,\n",
-    "                   'gSiz': gsiz_new, \n",
-    "                   'K': k_new,\n",
-    "                   'merge_thr': merge_thr_new}\n",
-    "cnmfe_model.params.change_params(params_dict=cnmfe_new_params)"
+    "Now that we are happy with our parameters, let's run the algorithm."
    ]
   },
   {
@@ -660,7 +626,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Running the algorithm creates an `estimates` class, which we discuss in detail in `demo_pipeline.ipynb`. The CNMFE `estimates` class includes almost all the same attributes as with CNMF, including the neural spatial and temporal components `A` and `C`. It also includes the discovered model of background activity, which in this case is different from the CNMF model. For CNMF the background model is returned as low-rank matrices `b` and `f`. For CNMFE, the background model parameters are represented in the matrix `W` (the weights of the ring model for each pixel) as well as `b0` (the constant offset for each pixel). "
+    "Running the algorithm creates an `estimates` class, which we discuss in detail in `demo_pipeline.ipynb`. The CNMFE `estimates` class includes almost all the same attributes as with CNMF, such as the neural spatial and temporal components `A` and `C`. \n",
+    "\n",
+    "It also includes the discovered model of background activity, which in this case is different from the CNMF model. For CNMF the background model is returned as low-rank matrices `b` and `f`. For CNMFE, the background model parameters are represented in the matrix `W` (the weights of the *ring model* for each pixel) as well as `b0` (the constant offset for each pixel). We will show how to reconstruct the background activity below. "
    ]
   },
   {
@@ -684,10 +652,12 @@
     "# Component Evaluation\n",
     "Source extraction typically produces many false positives. Our next step is quality control: separating the results into \"good\" and \"bad\" neurons using two different metrics (discussed in detail in `demo_notebook.ipynb`):\n",
     "\n",
-    "- **signal-to-noise ratio**: a minimum SNR is set during calcium transients (`min_SNR`).\n",
-    "- **spatial correlation**:  a minimum correlation is set beteween the shape of each component and the frames in the movie when that component is active (`rval_thr`). \n",
+    "- **Signal-to-noise ratio (SNR)**: a minimum SNR is set for the calcium transients (`min_SNR`).\n",
+    "- **Spatial correlation**:  a minimum correlation is set between the shape of each component and the frames in the movie when that component is active (`rval_thr`). \n",
+    "\n",
+    "> Caiman does *not* use the CNN classifier to sort neurons based on shape for 1p data: the network was trained on 2p data. Hence, we set the `use_cnn` param to `False`. \n",
     "\n",
-    "> Caiman does *not* use the CNN classifier to sort neurons based on shape for 1p data: the network was trained on 2p data. Hence, we set the `use_cnn` param to `False`. "
+    "Here we set the two parameters and run `evaluate_components()` to see which pass muster:"
    ]
   },
   {
@@ -712,13 +682,6 @@
     "print(f\"Number rejected: {len(cnmfe_model.estimates.idx_components_bad)}\")"
    ]
   },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
@@ -729,21 +692,21 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "scrolled": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "#%% plot contour plots of accepted and rejected components\n",
     "cnmfe_model.estimates.plot_contours_nb(img=correlation_image, \n",
-    "                               idx=cnmfe_model.estimates.idx_components)"
+    "                                       idx=cnmfe_model.estimates.idx_components);"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "View traces of accepted and rejected components."
+    "These components look reasonable, if a bit large: their centers are reasonable but the spatial footprints are quite spread out. If I were pursuing this further, it would likely be helpful to re-run CNMFE with a slightly smaller `gSiz`, which can influence the overall \"spread\" of the neurons in space.\n",
+    "\n",
+    "View traces of accepted and rejected components:"
    ]
   },
   {
@@ -755,7 +718,7 @@
     "cnmfe_model.estimates.nb_view_components(img=correlation_image, \n",
     "                                        idx=cnmfe_model.estimates.idx_components,\n",
     "                                        cmap='viridis', #gray\n",
-    "                                        thr=0.8); #increase to see full footprint"
+    "                                        thr=.9); #increase to see full footprint"
    ]
   },
   {
@@ -769,7 +732,7 @@
     "                                        idx=cnmfe_model.estimates.idx_components_bad,\n",
     "                                        cmap='viridis', #gray\n",
     "                                        denoised_color='red',\n",
-    "                                        thr=0.5); #increase to see full footprint"
+    "                                        thr=0.8); #increase to see full footprint"
    ]
   },
   {
@@ -821,7 +784,12 @@
    "metadata": {},
    "source": [
     "# A few loose ends\n",
-    "We have extracted the calcium traces C, spatial footprints A, and estimated spike counts S, which is the main goal with CNMF. But there are a few important things remaining."
+    "We have extracted the calcium traces C, spatial footprints A, and estimated spike counts S, which is the main goal with CNMF. But there are a few important things remaining. \n",
+    "\n",
+    "##  Deconvolution for 1p?\n",
+    "While we haven't discussed deconvolution (the estimation of the spikes that generated the calcium traces in `C`), we suggest treating the spike counts that CNMFE returns for 1p data (in `estimates.S`) with some caution. Currently (as of Fall 2023) there is no ground-truth data that we are aware of that lets us directly compare the output of 1p recordings with actual spiking data. There is a *lot* of such data for 2p recordings, which allows the accuracy of different methods to be compared (for instance see [the Spikefinder paper](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006157)). There are many labs and companies that are likely able to do these difficult and important experiments, and we would welcome this collaboration. Please let us know if you are interested!\n",
+    "\n",
+    "Because of this, when doing analysis, most researchers analyze the calcium traces directly for 1p recordings (the data in `estimates.C`) or a normalized version of them."
    ]
   },
   {
@@ -829,7 +797,7 @@
    "metadata": {},
    "source": [
     "## Extract $\\Delta F/F$ values\n",
-    "Note currently in Caiman, we don't return a true dfof value for 1p data because we normalize to both the baseline fluorescence and background activity, and the background model in 1p is so complex. This idiosyncrasy is likely to change soon, but we currently only *detrend* the data but do not normalize:"
+    "Currently in Caiman, we don't return a true dfof value for 1p data because Caiman normalizes to both the baseline fluorescence and background activity, and the background activity in 1p is so ill-behaved (as discussed above in the sidebar on the ring model). This idiosyncrasy of Caiman is likely to change soon, but for now we only *detrend* the data and do not normalize to baseline (which explains the warning you will see when you run the following):"
    ]
   },
   {
@@ -852,9 +820,11 @@
    "metadata": {},
    "source": [
     "## View some different movie results\n",
-    "Something something Wb as background model. \n",
+    "As with CNMF, the CNMFE model of the original movie is:\n",
     "\n",
-    "Play the reconstructed movie alongside the original movie and the (amplified) residual"
+    "    original_movie = neural_activity + background + residual\n",
+    "    \n",
+    "The only real difference between CNMF and CNMFE is the model of the background. We can reconstruct the neural movie as `AC` just as we did in `demo_pipeline.ipynb`. Unfortunately, reconstructing the background activity via the ring model is much more complicated for CNMFE, so we will just punt to a built-in function for that in what follows (`compute_background()`)."
    ]
   },
   {
@@ -868,6 +838,13 @@
     "images = np.reshape(Yr.T, [num_frames] + list(dims), order='F')"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Get models of the neural activity and background activity (note that the neural model includes only the accepted components):"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -877,7 +854,7 @@
     "neural_model = cnmfe_model.estimates.A[:, cnmfe_model.estimates.idx_components] @\\\n",
     "               cnmfe_model.estimates.C[cnmfe_model.estimates.idx_components, :]  # AC\n",
     "neural_movie = cm.movie(neural_model).reshape(dims + (-1,), order='F').transpose([2, 0, 1])\n",
-    "background_model = cnmfe_model.estimates.compute_background(Yr);\n",
+    "background_model = cnmfe_model.estimates.compute_background(Yr);  # built-in function -- explore source code for details\n",
     "bg_movie = cm.movie(background_model).reshape(dims + (-1,), order='F').transpose([2, 0, 1])"
    ]
   },
@@ -885,7 +862,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To view just the background model as a movie:"
+    "To view just the background model (you will see lots of parts that are constant such as blood vessels, lots of large-scale background fluorescence, and some local activity which is on spatial scales larger than `gSig`):"
    ]
   },
   {
@@ -908,7 +885,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To view just the pure neural activity model:"
+    "To view just the pure neural activity model."
    ]
   },
   {
@@ -928,12 +905,10 @@
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "cnmfe_model.estimates.play_movie?"
+    "You can also view the original movie, neural movie, and the residual (the first two have background removed). Note the residual in this case includes quite a bit of neural-looking activity. This is often near blood vessels and edges of the movie, where the preprocessing/initialization can miss neural activity: you can try playing with different params to capture it (which would you try?). The [Mesmerize](https://github.com/nel-lab/mesmerize-core) package is a great way to search parameter space and visualize results with more sophisticated visualization tools if you have a tricky data set."
    ]
   },
   {
@@ -942,28 +917,20 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# with background \n",
+    "# without background\n",
     "cnmfe_model.estimates.play_movie(images, \n",
-    "                                 q_max=99.5, \n",
+    "                                 q_max=99.9, \n",
     "                                 magnification=2,\n",
-    "                                 include_bck=True, \n",
-    "                                 gain_res=10, \n",
-    "                                 bpx=bord_px);"
+    "                                 include_bck=False,\n",
+    "                                 gain_res=5,\n",
+    "                                 use_color=True);"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "# without background\n",
-    "cnmfe_model.estimates.play_movie(images, \n",
-    "                         q_max=99.9, \n",
-    "                         magnification=2,\n",
-    "                         include_bck=False,\n",
-    "                         gain_res=4,\n",
-    "                         bpx=bord_px);"
+    "Note the middle panel of neural activity includes *all* components (accepted and rejected), so you will see some of the blood vessel \"activity\" that was discovered and rejected:"
    ]
   },
   {