From d106face0d7cfdc8a601d59932df967e82eba579 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Mon, 25 Dec 2023 20:45:39 +0000
Subject: [PATCH 01/26] planned changelog.

---
 docs/changelog.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/changelog.md b/docs/changelog.md
index af447b88..a752f613 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -4,6 +4,10 @@
 
 ## v0.2.1
 
+- Removed tutorials
+- Updated base (mock) dataset
+- Switched library from base unit of `klambda` to `lambda`
+- Implemented type checking for more of the codebase
 - Manually line wrapped many docstrings to conform to 88 characters per line or less. Ian thought `black` would do this by default, but actually that [doesn't seem to be the case](https://github.com/psf/black/issues/2865).
 - Fully leaned into the `pyproject.toml` setup to modernize build via [hatch](https://github.com/pypa/hatch). This centralizes the project dependencies and derives package versioning directly from git tags. Intermediate packages built from commits after the latest tag (e.g., `0.2.0`) will have an extra long string, e.g., `0.2.1.dev178+g16cfc3e.d20231223` where the version is a guess at the next version and the hash gives reference to the commit. This means that developers bump versions entirely by tagging a new version with git (or more likely by drafting a new release on the [GitHub release page](https://github.com/MPoL-dev/MPoL/releases)).
 - Removed `setup.py`.

From 8720653ccf60667123e5766f5655dc9d9c14b61b Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Mon, 25 Dec 2023 23:41:50 +0000
Subject: [PATCH 02/26] simplify changelog.

---
 docs/changelog.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/changelog.md b/docs/changelog.md
index a752f613..bbfcfef1 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -4,9 +4,7 @@
 
 ## v0.2.1
 
-- Removed tutorials
-- Updated base (mock) dataset
-- Switched library from base unit of `klambda` to `lambda`
+- *Placeholder* Planned changes described by Architecture GitHub Project.
 - Implemented type checking for more of the codebase
 - Manually line wrapped many docstrings to conform to 88 characters per line or less. Ian thought `black` would do this by default, but actually that [doesn't seem to be the case](https://github.com/psf/black/issues/2865).
 - Fully leaned into the `pyproject.toml` setup to modernize build via [hatch](https://github.com/pypa/hatch). This centralizes the project dependencies and derives package versioning directly from git tags. Intermediate packages built from commits after the latest tag (e.g., `0.2.0`) will have an extra long string, e.g., `0.2.1.dev178+g16cfc3e.d20231223` where the version is a guess at the next version and the hash gives reference to the commit. This means that developers bump versions entirely by tagging a new version with git (or more likely by drafting a new release on the [GitHub release page](https://github.com/MPoL-dev/MPoL/releases)).

From 1aca8e83a4c65c012eb1f1a0878a898eb4e16206 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Mon, 25 Dec 2023 23:59:26 +0000
Subject: [PATCH 03/26] updated developer doc w/ typing references.

---
 docs/developer-documentation.md | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
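As a rough illustration of what the new "Typing" section asks contributors to check, here is a minimal sketch of an annotated function. `scale_weights` is a hypothetical helper invented for this note and is not part of MPoL; the point is only that mypy can verify the annotated signature statically.

```python
from typing import List


def scale_weights(weights: List[float], factor: float) -> List[float]:
    """Hypothetical helper (not part of MPoL): rescale a list of weights."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [w * factor for w in weights]


scaled = scale_weights([1.0, 2.5], 2.0)

# mypy would flag a call such as scale_weights("1.0", 2.0) as an
# incompatible argument type, without the code having to run.
```

Running `mypy` on a file containing this function should report no issues, while the mistyped call in the comment would be reported as an error.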

diff --git a/docs/developer-documentation.md b/docs/developer-documentation.md
index 8c1d6a34..af81ed0a 100644
--- a/docs/developer-documentation.md
+++ b/docs/developer-documentation.md
@@ -168,7 +168,7 @@ In general, we envision the contribution lifecycle following a pattern:
 8. After the tests have completed successfully, use the Github interface to initiate a pull request back to the central repository. If you know that your feature branch isn't ready to be merged, but would still like feedback on your work, please submit a [draft or "work in progress"](https://github.blog/2019-02-14-introducing-draft-pull-requests/) pull request.
 9. Someone will review your pull request and may suggest additional changes for improvements. If approved, your pull request will be merged into the MPoL-dev/MPoL repo. Thank you for your contribution!
 
-### Contributing code and tests
+### Tests
 
 We strive to release a useable, stable software package. One way to help accomplish this is through writing rigorous and complete tests, especially after adding new functionality to the package. MPoL tests are located within the `test/` directory and follow [pytest](https://docs.pytest.org/en/6.2.x/contents.html#toc) conventions. Please add your new tests to this directory---we love new and useful tests.
 
@@ -176,7 +176,17 @@ If you are adding new code functionality to the package, please make sure you ha
 
 For MPoL maintainers, see instructions [here](https://github.com/MPoL-dev/MPoL/wiki/Releasing-a-new-version-of-MPoL) for releasing a new version of MPoL.
 
-### Contributing documentation
+### Typing
+
+Core MPoL routines are type-checked with [mypy](https://mypy.readthedocs.io/en/stable/index.html) for 100% coverage. Before you push your changes to the repo, make sure your code passes type checking locally (otherwise it will fail the GitHub Actions continuous integration tests). You can do this from the root of the repo by running
+
+```
+mypy src/mpol --pretty
+```
+
+If you are unfamiliar with typing in Python, we recommend reading the [mypy cheatsheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html) to get started.
+
+### Documentation
 
 A general workflow for writing documentation might look like
 

From a515ff61efa165140a463ca3df3812d1525e908a Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 00:21:09 +0000
Subject: [PATCH 04/26] hard wrapped math equations.

---
 src/mpol/constants.py |  10 +--
 src/mpol/losses.py    | 158 +++++++++++++++++++++---------------------
 2 files changed, 83 insertions(+), 85 deletions(-)
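The constants annotated in this patch are plain floats, so they can be used directly in typed helper code. The sketch below mirrors the `arcsec` and `deg` definitions from `src/mpol/constants.py`, but uses `math.pi` rather than `np.pi` so that it has no third-party dependencies. `arcsec_to_radians` is a hypothetical helper added only for illustration.

```python
import math

# Mirrors the annotated constants in src/mpol/constants.py (math.pi is used
# here instead of np.pi purely to keep the sketch dependency-free).
arcsec: float = math.pi / (180.0 * 3600)  # [radians]
deg: float = math.pi / 180  # [radians]


def arcsec_to_radians(theta_arcsec: float) -> float:
    """Hypothetical helper (not part of MPoL): convert arcseconds to radians."""
    return theta_arcsec * arcsec


# One degree is 3600 arcseconds, so both routes should agree.
one_degree = arcsec_to_radians(3600.0)
```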

diff --git a/src/mpol/constants.py b/src/mpol/constants.py
index 5b0c6349..12e0f2f6 100644
--- a/src/mpol/constants.py
+++ b/src/mpol/constants.py
@@ -2,10 +2,10 @@
 from astropy.constants import c, k_B
 
 # convert from arcseconds to radians
-arcsec = np.pi / (180.0 * 3600)  # [radians]  = 1/206265 radian/arcsec
+arcsec: float = np.pi / (180.0 * 3600)  # [radians]  = 1/206265 radian/arcsec
 
-deg = np.pi / 180  # [radians]
+deg: float = np.pi / 180  # [radians]
 
-kB = k_B.cgs.value  # [erg K^-1] Boltzmann constant
-cc = c.cgs.value # [cm s^-1]
-c_ms = c.value # [m s^-1]
+kB: float = k_B.cgs.value  # [erg K^-1] Boltzmann constant
+cc: float = c.cgs.value  # [cm s^-1]
+c_ms: float = c.value  # [m s^-1]
diff --git a/src/mpol/losses.py b/src/mpol/losses.py
index 7add17e0..e3f80fa7 100644
--- a/src/mpol/losses.py
+++ b/src/mpol/losses.py
@@ -11,9 +11,6 @@
 import numpy as np
 import torch
 
-from . import datasets
-from .constants import *
-
 
 def chi_squared(model_vis, data_vis, weight):
     r"""
@@ -22,15 +19,16 @@ def chi_squared(model_vis, data_vis, weight):
 
     .. math::
 
-        \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) = \sum_i^N \frac{|V_i - M(u_i, v_i |\,\boldsymbol{\theta})|^2}{\sigma_i^2}
+        \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) = 
+        \sum_i^N \frac{|V_i - M(u_i, v_i |\,\boldsymbol{\theta})|^2}{\sigma_i^2}
 
-    where :math:`\sigma_i^2 = 1/w_i`. The sum is over all of the provided visibilities. 
-    This function is agnostic as to whether the sum should include the Hermitian 
-    conjugate visibilities, but be aware that the answer returned will be different 
+    where :math:`\sigma_i^2 = 1/w_i`. The sum is over all of the provided visibilities.
+    This function is agnostic as to whether the sum should include the Hermitian
+    conjugate visibilities, but be aware that the answer returned will be different
     between the two cases. We recommend not including the Hermitian conjugates.
 
     Args:
-        model_vis (PyTorch complex): array tuple of the model representing 
+        model_vis (PyTorch complex): array tuple of the model representing
             :math:`\boldsymbol{V}`
         data_vis (PyTorch complex): array of the data values representing :math:`M`
         weight (PyTorch real): array of weight values representing :math:`w_i`
@@ -38,32 +36,30 @@ def chi_squared(model_vis, data_vis, weight):
     Returns:
         torch.double: the :math:`\chi^2` likelihood
     """
-    # print("inside chi_squared")
-    # print("model", model_vis.shape)
-    # print("data", data_vis.shape)
-    # print("weight", weight.shape)
 
     return torch.sum(weight * torch.abs(data_vis - model_vis) ** 2)
 
 
 def log_likelihood(model_vis, data_vis, weight):
     r"""
-    Compute the log likelihood function :math:`\ln\mathcal{L}` between the complex data 
+    Compute the log likelihood function :math:`\ln\mathcal{L}` between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
 
-        \ln \mathcal{L}(\boldsymbol{V}|\,\boldsymbol{\theta}) = - \left ( N \ln 2 \pi +  \sum_i^N \sigma_i^2 + \frac{1}{2} \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) \right )
+        \ln \mathcal{L}(\boldsymbol{V}|\,\boldsymbol{\theta}) = 
+        - \left ( N \ln 2 \pi +  \sum_i^N \sigma_i^2 + 
+        \frac{1}{2} \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) \right )
 
     where :math:`\chi^2` is evaluated using :func:`mpol.losses.chi_squared`.
 
-    This function is agnostic as to whether the sum should include the Hermitian 
-    conjugate visibilities, but be aware that the normalization of the answer returned 
-    will be different between the two cases. Inference of the parameter values should 
+    This function is agnostic as to whether the sum should include the Hermitian
+    conjugate visibilities, but be aware that the normalization of the answer returned
+    will be different between the two cases. Inference of the parameter values should
     be unaffected. We recommend not including the Hermitian conjugates.
 
     Args:
-        model_vis (PyTorch complex): array tuple of the model representing 
+        model_vis (PyTorch complex): array tuple of the model representing
             :math:`\boldsymbol{V}`
         data_vis (PyTorch complex): array of the data values representing :math:`M`
         weight (PyTorch real): array of weight values representing :math:`w_i`
@@ -86,33 +82,33 @@ def log_likelihood(model_vis, data_vis, weight):
 
 def nll(model_vis, data_vis, weight):
     r"""
-    Calculate a normalized "negative log likelihood" loss between the complex data 
+    Calculate a normalized "negative log likelihood" loss between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
 
         L_\mathrm{nll} = \frac{1}{2 N} \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta})
 
-    where :math:`\chi^2` is evaluated using :func:`mpol.losses.chi_squared`. 
-    Visibilities may be any shape as long as all quantities have the same shape. 
-    Following `EHT-IV 2019 
+    where :math:`\chi^2` is evaluated using :func:`mpol.losses.chi_squared`.
+    Visibilities may be any shape as long as all quantities have the same shape.
+    Following `EHT-IV 2019
     <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_, we apply
-    a prefactor :math:`1/(2 N)`, where :math:`N` is the number of visibilities. The 
-    factor of 2 comes in because we must count real and imaginaries in the 
-    :math:`\chi^2` sum. This means that this normalized negative log likelihood loss 
-    function will have a minimum value of $L_\mathrm{nll}(\hat{\boldsymbol{\theta}}) 
-    \approx 1$ for a well-fit model (regardless of the number of data points), making 
-    it easier to set the prefactor strengths of other regularizers *relative* to this 
+    a prefactor :math:`1/(2 N)`, where :math:`N` is the number of visibilities. The
+    factor of 2 comes in because we must count real and imaginaries in the
+    :math:`\chi^2` sum. This means that this normalized negative log likelihood loss
+    function will have a minimum value of $L_\mathrm{nll}(\hat{\boldsymbol{\theta}})
+    \approx 1$ for a well-fit model (regardless of the number of data points), making
+    it easier to set the prefactor strengths of other regularizers *relative* to this
     value.
 
-    Note that this function should only be used in an optimization or point estimate 
-    situation. If it is used in any situation where uncertainties on parameter values 
-    are determined (such as Markov Chain Monte Carlo), it will return the wrong answer. 
-    This is because the relative scaling of :math:`L_\mathrm{nll}` with respect to 
+    Note that this function should only be used in an optimization or point estimate
+    situation. If it is used in any situation where uncertainties on parameter values
+    are determined (such as Markov Chain Monte Carlo), it will return the wrong answer.
+    This is because the relative scaling of :math:`L_\mathrm{nll}` with respect to
     parameter value is incorrect.
 
     Args:
-        model_vis (PyTorch complex): array tuple of the model representing 
+        model_vis (PyTorch complex): array tuple of the model representing
             :math:`\boldsymbol{V}`
         data_vis (PyTorch complex): array of the data values representing :math:`M`
         weight (PyTorch real): array of weight values representing :math:`w_i`
@@ -129,13 +125,13 @@ def nll(model_vis, data_vis, weight):
 
 def chi_squared_gridded(modelVisibilityCube, griddedDataset):
     r"""
-    Calculate the :math:`\chi^2` (corresponding to :func:`~mpol.losses.chi_squared`) 
+    Calculate the :math:`\chi^2` (corresponding to :func:`~mpol.losses.chi_squared`)
     using gridded data and model visibilities.
 
     Args:
-        modelVisibilityCube (torch complex tensor): torch tensor with shape 
-            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from 
-            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed," 
+        modelVisibilityCube (torch complex tensor): torch tensor with shape
+            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from
+            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed,"
             as in output from :meth:`mpol.fourier.FourierCube.forward()`.
         griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
 
@@ -147,7 +143,7 @@ def chi_squared_gridded(modelVisibilityCube, griddedDataset):
     # get the model_visibilities from the dataset
     # 1D torch tensor collapsed across cube dimensions, like
     # griddedDataset.vis_indexed and griddedDataset.weight_indexed
-    
+
     model_vis = griddedDataset(modelVisibilityCube)
 
     return chi_squared(
@@ -157,13 +153,13 @@ def chi_squared_gridded(modelVisibilityCube, griddedDataset):
 
 def log_likelihood_gridded(modelVisibilityCube, griddedDataset):
     r"""
-    Calculate the log likelihood function :math:`\ln\mathcal{L}` (corresponding to 
+    Calculate the log likelihood function :math:`\ln\mathcal{L}` (corresponding to
     :func:`~mpol.losses.log_likelihood`) using gridded data and model visibilities.
 
     Args:
-        modelVisibilityCube (torch complex tensor): torch tensor with shape 
-            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from 
-            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed," as 
+        modelVisibilityCube (torch complex tensor): torch tensor with shape
+            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from
+            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed," as
             in output from :meth:`mpol.fourier.FourierCube.forward()`.
         griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
 
@@ -184,14 +180,14 @@ def log_likelihood_gridded(modelVisibilityCube, griddedDataset):
 
 def nll_gridded(modelVisibilityCube, griddedDataset):
     r"""
-    Calculate a normalized "negative log likelihood" (corresponding to 
-    :func:`~mpol.losses.nll`) using gridded data and model visibilities. Function will 
+    Calculate a normalized "negative log likelihood" (corresponding to
+    :func:`~mpol.losses.nll`) using gridded data and model visibilities. Function will
     return the same value regardless of whether Hermitian pairs are included.
 
     Args:
-        vis (torch complex tensor): torch tensor with shape ``(nchan, npix, npix)`` to 
-            be indexed by the ``mask`` from :class:`~mpol.datasets.GriddedDataset`. 
-            Assumes tensor is "pre-packed," as in output from 
+        vis (torch complex tensor): torch tensor with shape ``(nchan, npix, npix)`` to
+            be indexed by the ``mask`` from :class:`~mpol.datasets.GriddedDataset`.
+            Assumes tensor is "pre-packed," as in output from
             :meth:`mpol.fourier.FourierCube.forward()`.
         griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
 
@@ -205,15 +201,15 @@ def nll_gridded(modelVisibilityCube, griddedDataset):
 
 def entropy(cube, prior_intensity, tot_flux=10):
     r"""
-    Calculate the entropy loss of a set of pixels following the definition in 
+    Calculate the entropy loss of a set of pixels following the definition in
     `EHT-IV 2019 <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_.
 
     Args:
-        cube (any tensor): pixel values must be positive :math:`I_i > 0` 
+        cube (any tensor): pixel values must be positive :math:`I_i > 0`
             for all :math:`i`
-        prior_intensity (any tensor): the prior value :math:`p` to calculate entropy 
+        prior_intensity (any tensor): the prior value :math:`p` to calculate entropy
             against. Could be a single constant or an array the same shape as image.
-        tot_flux (float): a fixed normalization factor; the user-defined target total 
+        tot_flux (float): a fixed normalization factor; the user-defined target total
             flux density
 
     Returns:
@@ -235,19 +231,19 @@ def entropy(cube, prior_intensity, tot_flux=10):
 
 def TV_image(sky_cube, epsilon=1e-10):
     r"""
-    Calculate the total variation (TV) loss in the image dimension (R.A. and DEC). 
-    Following the definition in `EHT-IV 2019 
-    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the 
+    Calculate the total variation (TV) loss in the image dimension (R.A. and DEC).
+    Following the definition in `EHT-IV 2019
+    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the
     image to be piecewise smooth and the gradient of the image to be sparse.
 
     Args:
-        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l` 
-            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and 
-            :math:`v` is the channel (velocity or frequency) dimension in 
+        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l`
+            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
+            :math:`v` is the channel (velocity or frequency) dimension in
             :math:`ndim=1`. Should be in sky format representation.
-        epsilon (float): a softening parameter in 
-            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any pixel-to-pixel variations 
-            within each image slice greater than this parameter will have a 
+        epsilon (float): a softening parameter in
+            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any pixel-to-pixel variations
+            within each image slice greater than this parameter will have a
             significant penalty.
 
     Returns:
@@ -255,7 +251,8 @@ def TV_image(sky_cube, epsilon=1e-10):
 
     .. math::
 
-        L = \sum_{l,m,v} \sqrt{(I_{l + 1, m, v} - I_{l,m,v})^2 + (I_{l, m+1, v} - I_{l, m, v})^2 + \epsilon}
+        L = \sum_{l,m,v} \sqrt{(I_{l + 1, m, v} - I_{l,m,v})^2 + 
+            (I_{l, m+1, v} - I_{l, m, v})^2 + \epsilon}
 
     """
 
@@ -272,14 +269,14 @@ def TV_image(sky_cube, epsilon=1e-10):
 
 def TV_channel(cube, epsilon=1e-10):
     r"""
-    Calculate the total variation (TV) loss in the channel dimension. Following the 
-    definition in `EHT-IV 2019 
+    Calculate the total variation (TV) loss in the channel dimension. Following the
+    definition in `EHT-IV 2019
     <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_.
 
     Args:
         cube (any 3D tensor): the image cube array :math:`I_{lmv}`
-        epsilon (float): a softening parameter in 
-            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any channel-to-channel pixel 
+        epsilon (float): a softening parameter in
+            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any channel-to-channel pixel
             variations greater than this parameter will have a significant penalty.
 
     Returns:
@@ -321,14 +318,14 @@ def edge_clamp(cube):
 
 def sparsity(cube, mask=None):
     r"""
-    Enforce a sparsity prior on the image cube using the :math:`L_1` norm. Optionally 
-    provide a boolean mask to apply the prior to only the ``True`` locations. For 
+    Enforce a sparsity prior on the image cube using the :math:`L_1` norm. Optionally
+    provide a boolean mask to apply the prior to only the ``True`` locations. For
     example, you might want this mask to be ``True`` for background regions.
 
     Args:
         cube (nchan, npix, npix): tensor image cube
-        mask (boolean): tensor array the same shape as ``cube``. The sparsity prior 
-            will be applied to those pixels where the mask is ``True``. Default is 
+        mask (boolean): tensor array the same shape as ``cube``. The sparsity prior
+            will be applied to those pixels where the mask is ``True``. Default is
             to apply prior to all pixels.
 
     Returns:
@@ -351,12 +348,12 @@ def sparsity(cube, mask=None):
 
 def UV_sparsity(vis, qs, q_max):
     r"""
-    Enforce a sparsity prior for all :math:`q = \sqrt{u^2 + v^2}` points larger than 
+    Enforce a sparsity prior for all :math:`q = \sqrt{u^2 + v^2}` points larger than
     :math:`q_\mathrm{max}`.
 
     Args:
         vis (torch.double) : visibility cube of (nchan, npix, npix//2 +1, 2)
-        qs: numpy array corresponding to visibility coordinates. Dimensionality of 
+        qs: numpy array corresponding to visibility coordinates. Dimensionality of
             (npix, npix//2)
         q_max (float): maximum radial baseline
 
@@ -382,7 +379,7 @@ def UV_sparsity(vis, qs, q_max):
 
 def PSD(qs, psd, l):
     r"""
-    Apply a loss function corresponding to the power spectral density using a Gaussian 
+    Apply a loss function corresponding to the power spectral density using a Gaussian
     process kernel.
 
     Assumes an image plane kernel of
@@ -426,17 +423,17 @@ def PSD(qs, psd, l):
 
 def TSV(sky_cube):
     r"""
-    Calculate the total square variation (TSV) loss in the image dimension 
-    (R.A. and DEC). Following the definition in `EHT-IV 2019 
-    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the 
-    image to be edge smoothed which may be a better reoresentation of the truth image 
+    Calculate the total square variation (TSV) loss in the image dimension
+    (R.A. and DEC). Following the definition in `EHT-IV 2019
+    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the
+    image to be edge smoothed which may be a better representation of the truth image
     `K. Kuramochi et al 2018
     <https://ui.adsabs.harvard.edu/abs/2018ApJ...858...56K/abstract>`_.
 
     Args:
-        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l` 
-            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and 
-            :math:`v` is the channel (velocity or frequency) dimension in 
+        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l`
+            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
+            :math:`v` is the channel (velocity or frequency) dimension in
             :math:`ndim=1`. Should be in sky format representation.
 
     Returns:
@@ -444,7 +441,8 @@ def TSV(sky_cube):
 
     .. math::
 
-        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 + (I_{l, m+1, v} - I_{l, m, v})^2
+        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 + 
+        (I_{l, m+1, v} - I_{l, m, v})^2
 
     """
 

From 944c90db69e3b47530b4a5422e508dd351eb48b7 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 13:49:21 +0000
Subject: [PATCH 05/26] made changes to losses, moving to datasets.

---
 src/mpol/losses.py | 148 ++++++++++++++++++++++++++++-----------------
 1 file changed, 92 insertions(+), 56 deletions(-)
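To make the relationship between the two typed loss functions concrete, here is a dependency-free sketch of the formulas in the `chi_squared` and `nll` docstrings. It uses plain Python complex numbers rather than `torch.Tensor`, so it illustrates the arithmetic only and is not the MPoL implementation; the sample visibilities and weights are made up.

```python
from typing import Sequence


def chi_squared(
    model_vis: Sequence[complex], data_vis: Sequence[complex], weight: Sequence[float]
) -> float:
    """Sum of w_i * |V_i - M_i|^2, with w_i = 1 / sigma_i^2."""
    return sum(w * abs(d - m) ** 2 for m, d, w in zip(model_vis, data_vis, weight))


def nll(
    model_vis: Sequence[complex], data_vis: Sequence[complex], weight: Sequence[float]
) -> float:
    """Normalized loss chi^2 / (2 N), where N is the number of visibilities."""
    n = len(data_vis)
    return chi_squared(model_vis, data_vis, weight) / (2 * n)


# Made-up sample values: two visibilities, each offset from the model by 1j.
model = [1 + 0j, 2 + 0j]
data = [1 + 1j, 2 - 1j]
weights = [4.0, 1.0]

chi2 = chi_squared(model, data, weights)  # 4 * 1 + 1 * 1 = 5.0
loss = nll(model, data, weights)  # 5.0 / (2 * 2) = 1.25
```

Because the residual enters only through its absolute value, swapping `model_vis` and `data_vis` gives the same result.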

diff --git a/src/mpol/losses.py b/src/mpol/losses.py
index e3f80fa7..6817b95a 100644
--- a/src/mpol/losses.py
+++ b/src/mpol/losses.py
@@ -9,17 +9,23 @@
 """
 
 import numpy as np
+import numpy.typing as npt
 import torch
 
+from .datasets import GriddedDataset
 
-def chi_squared(model_vis, data_vis, weight):
-    r"""
-    Compute the :math:`\chi^2` between the complex data :math:`\boldsymbol{V}` and model
-      :math:`M` visibilities using
+
+def chi_squared(
+    model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
+) -> torch.Tensor:
+    r"""Computes :math:`\chi^2` between data and model.
+
+    The :math:`\chi^2` between the complex data :math:`\boldsymbol{V}` and model
+    :math:`M` visibilities is computed as
 
     .. math::
 
-        \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) = 
+        \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) =
         \sum_i^N \frac{|V_i - M(u_i, v_i |\,\boldsymbol{\theta})|^2}{\sigma_i^2}
 
     where :math:`\sigma_i^2 = 1/w_i`. The sum is over all of the provided visibilities.
@@ -27,28 +33,37 @@ def chi_squared(model_vis, data_vis, weight):
     conjugate visibilities, but be aware that the answer returned will be different
     between the two cases. We recommend not including the Hermitian conjugates.
 
-    Args:
-        model_vis (PyTorch complex): array tuple of the model representing
-            :math:`\boldsymbol{V}`
-        data_vis (PyTorch complex): array of the data values representing :math:`M`
-        weight (PyTorch real): array of weight values representing :math:`w_i`
-
-    Returns:
-        torch.double: the :math:`\chi^2` likelihood
+    Parameters
+    ----------
+    model_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the model values representing :math:`M`
+    data_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the data values representing :math:`\boldsymbol{V}`
+    weight : :class:`torch.Tensor` of :class:`torch.double`
+        array of weight values representing :math:`w_i`
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the :math:`\chi^2` value, summed over all dimensions of input array.
     """
 
     return torch.sum(weight * torch.abs(data_vis - model_vis) ** 2)
 
 
-def log_likelihood(model_vis, data_vis, weight):
+def log_likelihood(
+    model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
+) -> torch.Tensor:
     r"""
-    Compute the log likelihood function :math:`\ln\mathcal{L}` between the complex data
+    Compute the log likelihood function of the data given a model.
+
+    :math:`\ln\mathcal{L}` is computed between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
 
-        \ln \mathcal{L}(\boldsymbol{V}|\,\boldsymbol{\theta}) = 
-        - \left ( N \ln 2 \pi +  \sum_i^N \sigma_i^2 + 
+        \ln \mathcal{L}(\boldsymbol{V}|\,\boldsymbol{\theta}) =
+        - \left ( N \ln 2 \pi +  \sum_i^N \sigma_i^2 +
         \frac{1}{2} \chi^2(\boldsymbol{V}|\,\boldsymbol{\theta}) \right )
 
     where :math:`\chi^2` is evaluated using :func:`mpol.losses.chi_squared`.
@@ -58,31 +73,41 @@ def log_likelihood(model_vis, data_vis, weight):
     will be different between the two cases. Inference of the parameter values should
     be unaffected. We recommend not including the Hermitian conjugates.
 
-    Args:
-        model_vis (PyTorch complex): array tuple of the model representing
-            :math:`\boldsymbol{V}`
-        data_vis (PyTorch complex): array of the data values representing :math:`M`
-        weight (PyTorch real): array of weight values representing :math:`w_i`
-
-    Returns:
-        torch.double: the :math:`\ln\mathcal{L}` log likelihood
+    Parameters
+    ----------
+    model_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the model values representing :math:`M`
+    data_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the data values representing :math:`\boldsymbol{V}`
+    weight : :class:`torch.Tensor` of :class:`torch.double`
+        array of weight values representing :math:`w_i`
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the :math:`\ln\mathcal{L}` log likelihood, summed over all dimensions
+        of input array.
     """
 
     # If model and data are multidimensional, then flatten them to get full N
     N = len(torch.ravel(data_vis))
 
-    sigma_term = torch.sum(1 / weight)
+    sigma_term: torch.Tensor = torch.sum(1 / weight)
 
-    return (
-        N * np.log(2 * np.pi)
-        + sigma_term
-        + 0.5 * chi_squared(model_vis, data_vis, weight)
-    )
+    # calculate separately so we can type as np, otherwise mypy thinks
+    # the expression is Any
+    first_term: np.float64 = N * np.log(2 * np.pi)
+
+    return -(first_term + sigma_term + 0.5 * chi_squared(model_vis, data_vis, weight))
 
 
-def nll(model_vis, data_vis, weight):
+def nll(
+    model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
+) -> torch.Tensor:
     r"""
-    Calculate a normalized "negative log likelihood" loss between the complex data
+    Calculate a normalized "negative log likelihood" loss between data and model.
+
+    The "negative log likelihood loss" is calculated between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
@@ -96,8 +121,8 @@ def nll(model_vis, data_vis, weight):
     a prefactor :math:`1/(2 N)`, where :math:`N` is the number of visibilities. The
     factor of 2 comes in because we must count real and imaginaries in the
     :math:`\chi^2` sum. This means that this normalized negative log likelihood loss
-    function will have a minimum value of $L_\mathrm{nll}(\hat{\boldsymbol{\theta}})
-    \approx 1$ for a well-fit model (regardless of the number of data points), making
+    function will have a minimum value of :math:`L_\mathrm{nll}(\hat{\boldsymbol{\theta}})
+    \approx 1` for a well-fit model (regardless of the number of data points), making
     it easier to set the prefactor strengths of other regularizers *relative* to this
     value.
 
@@ -107,14 +132,19 @@ def nll(model_vis, data_vis, weight):
     This is because the relative scaling of :math:`L_\mathrm{nll}` with respect to
     parameter value is incorrect.
 
-    Args:
-        model_vis (PyTorch complex): array tuple of the model representing
-            :math:`\boldsymbol{V}`
-        data_vis (PyTorch complex): array of the data values representing :math:`M`
-        weight (PyTorch real): array of weight values representing :math:`w_i`
-
-    Returns:
-        torch.double: the normalized negative log likelihood likelihood loss
+    Parameters
+    ----------
+    model_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the model values representing :math:`M`
+    data_vis : :class:`torch.Tensor` of :class:`torch.complex`
+        array of the data values representing :math:`\boldsymbol{V}`
+    weight : :class:`torch.Tensor` of :class:`torch.double`
+        array of weight values representing :math:`w_i`
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the normalized negative log likelihood loss, summed over all
+        dimensions of input array.
     """
 
     # If model and data are multidimensional, then flatten them to get full N
@@ -123,21 +153,27 @@ def nll(model_vis, data_vis, weight):
     return 1 / (2 * N) * chi_squared(model_vis, data_vis, weight)
 
 
-def chi_squared_gridded(modelVisibilityCube, griddedDataset):
+def chi_squared_gridded(
+    modelVisibilityCube: torch.Tensor, griddedDataset: GriddedDataset
+) -> torch.Tensor:
     r"""
     Calculate the :math:`\chi^2` (corresponding to :func:`~mpol.losses.chi_squared`)
     using gridded data and model visibilities.
 
-    Args:
-        modelVisibilityCube (torch complex tensor): torch tensor with shape
-            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from
-            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed,"
-            as in output from :meth:`mpol.fourier.FourierCube.forward()`.
-        griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
-
-    Returns:
-        torch.double: the :math:`\chi^2` value
-
+    Parameters
+    ----------
+    modelVisibilityCube : :class:`torch.Tensor` of :class:`torch.complex`
+        torch tensor with shape ``(nchan, npix, npix)`` to be indexed by the
+        ``mask`` from :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is
+        "pre-packed," as in output from :meth:`mpol.fourier.FourierCube.forward()`.
+    griddedDataset : :class:`~mpol.datasets.GriddedDataset` object
+        the gridded dataset, most likely produced from
+        :meth:`mpol.gridding.DataAverager.to_pytorch_dataset`
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the :math:`\chi^2` value, summed over all dimensions of input data.
     """
 
     # get the model_visibilities from the dataset
@@ -251,7 +287,7 @@ def TV_image(sky_cube, epsilon=1e-10):
 
     .. math::
 
-        L = \sum_{l,m,v} \sqrt{(I_{l + 1, m, v} - I_{l,m,v})^2 + 
+        L = \sum_{l,m,v} \sqrt{(I_{l + 1, m, v} - I_{l,m,v})^2 +
             (I_{l, m+1, v} - I_{l, m, v})^2 + \epsilon}
 
     """
@@ -441,7 +477,7 @@ def TSV(sky_cube):
 
     .. math::
 
-        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 + 
+        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 +
         (I_{l, m+1, v} - I_{l, m, v})^2
 
     """

From f119c3ed33701e32d4d6b4f8774f4ce56e218f6e Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 15:09:22 +0000
Subject: [PATCH 06/26] typed losses and added options to pyproject.

---
 mypy.ini => backup.mypy.ini |   0
 pyproject.toml              |  17 +-
 src/mpol/datasets.py        |  91 +++++-----
 src/mpol/images.py          |   4 +-
 src/mpol/losses.py          | 332 +++++++++++++++++++++---------------
 src/mpol/precomposed.py     |   3 +-
 6 files changed, 259 insertions(+), 188 deletions(-)
 rename mypy.ini => backup.mypy.ini (100%)

diff --git a/mypy.ini b/backup.mypy.ini
similarity index 100%
rename from mypy.ini
rename to backup.mypy.ini
diff --git a/pyproject.toml b/pyproject.toml
index 2a6ba427..e2274977 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -72,4 +72,19 @@ source = "vcs"
 version-file = "src/mpol/mpol_version.py"
 
 [tool.black]
-line-length = 88
\ No newline at end of file
+line-length = 88
+
+[tool.mypy]
+warn_return_any = true
+warn_unused_configs = true
+
+[[tool.mypy.overrides]]
+module = [
+    "astropy.*",
+    "matplotlib.*",
+    "scipy.*",
+    "torchkbnufft.*",
+    "frank.*",
+    "fast_histogram.*"
+]
+ignore_missing_imports = true
\ No newline at end of file
diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index 3754667a..c841617a 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -20,25 +20,25 @@
 class GriddedDataset(torch.nn.Module):
     r"""
     Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
-        vis_gridded (torch complex): the gridded visibility data stored in a "packed" 
+        vis_gridded (torch complex): the gridded visibility data stored in a "packed"
             format (pre-shifted for fft)
-        weight_gridded (torch double): the weights corresponding to the gridded 
+        weight_gridded (torch double): the weights corresponding to the gridded
             visibility data, also in a packed format
-        mask (torch boolean): a boolean mask to index the non-zero locations of 
+        mask (torch boolean): a boolean mask to index the non-zero locations of
             ``vis_gridded`` and ``weight_gridded`` in their packed format.
         nchan (int): the number of channels in the image (default = 1).
-        device (torch.device): the desired device of the dataset. If ``None``, 
+        device (torch.device): the desired device of the dataset. If ``None``,
             defaults to current device.
 
-    After initialization, the GriddedDataset provides the non-zero cells of the 
-    gridded visibilities and weights as a 1D vector via the following instance 
+    After initialization, the GriddedDataset provides the non-zero cells of the
+    gridded visibilities and weights as a 1D vector via the following instance
     variables. This means that any individual channel information has been collapsed.
 
     :ivar vis_indexed: 1D complex tensor of visibility data
     :ivar weight_indexed: 1D tensor of weight values
-    
+
     If you index the output of the Fourier layer in the same manner using ``self.mask``,
     then the model and data visibilities can be directly compared using a loss function.
     """
@@ -85,17 +85,17 @@ def from_image_properties(
         nchan: int = 1,
         device: torch.device | str | None = None,
     ):
-        """Alternative method to instantiate a GriddedDataset object from cell_size 
+        """Alternative method to instantiate a GriddedDataset object from cell_size
         and npix.
 
         Args:
             cell_size (float): the width of a pixel [arcseconds]
             npix (int): the number of pixels per image side
-            vis_gridded (torch complex): the gridded visibility data stored in a 
+            vis_gridded (torch complex): the gridded visibility data stored in a
                 "packed" format (pre-shifted for fft)
-            weight_gridded (torch double): the weights corresponding to the gridded 
+            weight_gridded (torch double): the weights corresponding to the gridded
                 visibility data, also in a packed format
-            mask (torch boolean): a boolean mask to index the non-zero locations of 
+            mask (torch boolean): a boolean mask to index the non-zero locations of
                 ``vis_gridded`` and ``weight_gridded`` in their packed format.
             nchan (int): the number of channels in the image (default = 1).
         """
@@ -112,12 +112,12 @@ def add_mask(
         mask: ArrayLike,
     ) -> None:
         r"""
-        Apply an additional mask to the data. Only works as a data limiting operation 
-        (i.e., ``mask`` is more restrictive than the mask already attached 
+        Apply an additional mask to the data. Only works as a data limiting operation
+        (i.e., ``mask`` is more restrictive than the mask already attached
         to the dataset).
 
         Args:
-            mask (2D numpy or PyTorch tensor): boolean mask (in packed format) to 
+            mask (2D numpy or PyTorch tensor): boolean mask (in packed format) to
                 apply to dataset. Assumes input will be broadcast across all channels.
         """
 
@@ -140,15 +140,15 @@ def add_mask(
         self.vis_indexed = self.vis_gridded[self.mask]
         self.weight_indexed = self.weight_gridded[self.mask]
 
-    def forward(self, modelVisibilityCube):
+    def forward(self, modelVisibilityCube: torch.Tensor) -> torch.Tensor:
         """
         Args:
-            modelVisibilityCube (complex torch.tensor): with shape 
-                ``(nchan, npix, npix)`` to be indexed. In "pre-packed" format, as in 
+            modelVisibilityCube (complex torch.tensor): with shape
+                ``(nchan, npix, npix)`` to be indexed. In "pre-packed" format, as in
                 output from :meth:`mpol.fourier.FourierCube.forward()`
 
         Returns:
-            torch complex tensor:  1d torch tensor of indexed model samples collapsed 
+            torch complex tensor:  1d torch tensor of indexed model samples collapsed
                 across cube dimensions.
         """
 
@@ -178,23 +178,24 @@ def ground_mask(self) -> torch.Tensor:
         """
         return utils.packed_cube_to_ground_cube(self.mask)
 
+
 class Dartboard:
     r"""
-    A polar coordinate grid relative to a :class:`~mpol.coordinates.GridCoords` object, 
-    reminiscent of a dartboard layout. The main utility of this object is to support 
+    A polar coordinate grid relative to a :class:`~mpol.coordinates.GridCoords` object,
+    reminiscent of a dartboard layout. The main utility of this object is to support
     splitting a dataset along radial and azimuthal bins for k-fold cross validation.
 
     Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
-        q_edges (1D numpy array): an array of radial bin edges to set the dartboard 
-            cells in :math:`[\mathrm{k}\lambda]`. If ``None``, defaults to 12 
-            log-linearly radial bins stretching from 0 to the :math:`q_\mathrm{max}` 
+        q_edges (1D numpy array): an array of radial bin edges to set the dartboard
+            cells in :math:`[\mathrm{k}\lambda]`. If ``None``, defaults to 12
+            log-linearly radial bins stretching from 0 to the :math:`q_\mathrm{max}`
             represented by ``coords``.
-        phi_edges (1D numpy array): an array of azimuthal bin edges to set the 
-            dartboard cells in [radians], over the domain :math:`[0, \pi]`, which is 
-            also implicitly mapped to the domain :math:`[-\pi, \pi]` to preserve the 
-            Hermitian nature of the visibilities. If ``None``, defaults to 
+        phi_edges (1D numpy array): an array of azimuthal bin edges to set the
+            dartboard cells in [radians], over the domain :math:`[0, \pi]`, which is
+            also implicitly mapped to the domain :math:`[-\pi, \pi]` to preserve the
+            Hermitian nature of the visibilities. If ``None``, defaults to
             8 equal-spaced azimuthal bins stretched from :math:`0` to :math:`\pi`.
     """
 
@@ -217,11 +218,11 @@ def __init__(
             # set q edges approximately following inspiration from Petry et al. scheme:
             # https://ui.adsabs.harvard.edu/abs/2020SPIE11449E..1DP/abstract
             # first two bins set to 7m width
-            # after third bin, bin width increases linearly until it is 
+            # after third bin, bin width increases linearly until it is
             # 700m at 16km baseline.
             # From 16m to 16km, bin width goes from 7m to 700m.
             # ---
-            # We aren't doing *quite* the same thing, 
+            # We aren't doing *quite* the same thing,
             # just logspacing with a few linear cells at the start.
             q_edges = loglinspace(0, self.q_max, N_log=8, M_linear=5)
 
@@ -254,14 +255,14 @@ def from_image_properties(
         Args:
             cell_size (float): the width of a pixel [arcseconds]
             npix (int): the number of pixels per image side
-            q_edges (1D numpy array): an array of radial bin edges to set the 
-                dartboard cells in :math:`[\mathrm{k}\lambda]`. If ``None``, defaults 
-                to 12 log-linearly radial bins stretching from 0 to the 
+            q_edges (1D numpy array): an array of radial bin edges to set the
+                dartboard cells in :math:`[\mathrm{k}\lambda]`. If ``None``, defaults
+                to 12 log-linearly radial bins stretching from 0 to the
                 :math:`q_\mathrm{max}` represented by ``coords``.
-            phi_edges (1D numpy array): an array of azimuthal bin edges to set the 
-                dartboard cells in [radians], over the domain :math:`[0, \pi]`, which 
-                is also implicitly mapped to the domain :math:`[-\pi, \pi]` to preserve 
-                the Hermitian nature of the visibilities. If ``None``, defaults to 8 
+            phi_edges (1D numpy array): an array of azimuthal bin edges to set the
+                dartboard cells in [radians], over the domain :math:`[0, \pi]`, which
+                is also implicitly mapped to the domain :math:`[-\pi, \pi]` to preserve
+                the Hermitian nature of the visibilities. If ``None``, defaults to 8
                 equal-spaced azimuthal bins stretched from :math:`0` to :math:`\pi`.
         """
         coords = GridCoords(cell_size, npix)
@@ -271,17 +272,17 @@ def get_polar_histogram(
         self, qs: NDArray[floating[Any]], phis: NDArray[floating[Any]]
     ) -> NDArray[floating[Any]]:
         r"""
-        Calculate a histogram in polar coordinates, using the bin edges defined by 
+        Calculate a histogram in polar coordinates, using the bin edges defined by
         ``q_edges`` and ``phi_edges`` during initialization.
         Data coordinates should include the points for the Hermitian visibilities.
 
         Args:
             qs: 1d array of q values :math:`[\mathrm{k}\lambda]`
-            phis: 1d array of datapoint azimuth values [radians] (must be the same 
+            phis: 1d array of datapoint azimuth values [radians] (must be the same
                 length as qs)
 
         Returns:
-            2d integer numpy array of cell counts, i.e., how many datapoints fell into 
+            2d integer numpy array of cell counts, i.e., how many datapoints fell into
             each dartboard cell.
         """
 
@@ -297,13 +298,13 @@ def get_nonzero_cell_indices(
         self, qs: NDArray[floating[Any]], phis: NDArray[floating[Any]]
     ) -> NDArray[integer[Any]]:
         r"""
-        Return a list of the cell indices that contain data points, using the bin edges 
+        Return a list of the cell indices that contain data points, using the bin edges
         defined by ``q_edges`` and ``phi_edges`` during initialization.
         Data coordinates should include the points for the Hermitian visibilities.
 
         Args:
             qs: 1d array of q values :math:`[\mathrm{k}\lambda]`
-            phis: 1d array of datapoint azimuth values [radians] (must be the same 
+            phis: 1d array of datapoint azimuth values [radians] (must be the same
                 length as qs)
 
         Returns:
@@ -321,8 +322,8 @@ def build_grid_mask_from_cells(
         self, cell_index_list: NDArray[integer[Any]]
     ) -> NDArray[np.bool_]:
         r"""
-        Create a boolean mask of size ``(npix, npix)`` (in packed format) corresponding 
-        to the ``vis_gridded`` and ``weight_gridded`` quantities of the 
+        Create a boolean mask of size ``(npix, npix)`` (in packed format) corresponding
+        to the ``vis_gridded`` and ``weight_gridded`` quantities of the
         :class:`~mpol.datasets.GriddedDataset` .
 
         Args:
diff --git a/src/mpol/images.py b/src/mpol/images.py
index 85098be3..dcfdbbbd 100644
--- a/src/mpol/images.py
+++ b/src/mpol/images.py
@@ -8,8 +8,8 @@
 import torch.fft  # to avoid conflicts with old torch.fft *function*
 from torch import nn
 
-from . import utils
-from .coordinates import GridCoords
+from mpol import utils
+from mpol.coordinates import GridCoords
 
 
 class BaseCube(nn.Module):
diff --git a/src/mpol/losses.py b/src/mpol/losses.py
index 6817b95a..6cc6d989 100644
--- a/src/mpol/losses.py
+++ b/src/mpol/losses.py
@@ -9,19 +9,20 @@
 """
 
 import numpy as np
-import numpy.typing as npt
 import torch
 
-from .datasets import GriddedDataset
+
+from mpol import constants
+from mpol.datasets import GriddedDataset
+from typing import Optional
 
 
 def chi_squared(
     model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
 ) -> torch.Tensor:
-    r"""Computes :math:`\chi^2` between data and model.
-
-    The :math:`\chi^2` between the complex data :math:`\boldsymbol{V}` and model
-    :math:`M` visibilities is computed as
+    r"""
+    Computes the :math:`\chi^2` between the complex data :math:`\boldsymbol{V}` and
+    model :math:`M` visibilities using
 
     .. math::
 
@@ -55,9 +56,7 @@ def log_likelihood(
     model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
 ) -> torch.Tensor:
     r"""
-    Compute the log likelihood function of the data given a model.
-
-    :math:`\ln\mathcal{L}` is computed between the complex data
+    Compute the log likelihood function :math:`\ln\mathcal{L}` between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
@@ -105,9 +104,7 @@ def nll(
     model_vis: torch.Tensor, data_vis: torch.Tensor, weight: torch.Tensor
 ) -> torch.Tensor:
     r"""
-    Calculate a normalized "negative log likelihood" loss between data and model.
-
-    The "negative log likelihood loss" is calculated between the complex data
+    Calculate a normalized "negative log likelihood" loss between the complex data
     :math:`\boldsymbol{V}` and model :math:`M` visibilities using
 
     .. math::
@@ -187,21 +184,28 @@ def chi_squared_gridded(
     )
 
 
-def log_likelihood_gridded(modelVisibilityCube, griddedDataset):
+def log_likelihood_gridded(
+    modelVisibilityCube: torch.Tensor, griddedDataset: GriddedDataset
+) -> torch.Tensor:
     r"""
-    Calculate the log likelihood function :math:`\ln\mathcal{L}` (corresponding to
-    :func:`~mpol.losses.log_likelihood`) using gridded data and model visibilities.
 
-    Args:
-        modelVisibilityCube (torch complex tensor): torch tensor with shape
-            ``(nchan, npix, npix)`` to be indexed by the ``mask`` from
-            :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is "pre-packed," as
-            in output from :meth:`mpol.fourier.FourierCube.forward()`.
-        griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
+    Calculate :math:`\ln\mathcal{L}` (corresponding to
+    :func:`~mpol.losses.log_likelihood`) using gridded quantities.
 
-    Returns:
-        torch.double: the :math:`\ln\mathcal{L}` value
+    Parameters
+    ----------
+    modelVisibilityCube : :class:`torch.Tensor` of :class:`torch.complex`
+        torch tensor with shape ``(nchan, npix, npix)`` to be indexed by the
+        ``mask`` from :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is
+        "pre-packed," as in output from :meth:`mpol.fourier.FourierCube.forward()`.
+    griddedDataset : :class:`~mpol.datasets.GriddedDataset` object
+        the gridded dataset, most likely produced from
+        :meth:`mpol.gridding.DataAverager.to_pytorch_dataset`
 
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the :math:`\ln\mathcal{L}` value, summed over all dimensions of input data.
     """
 
     # get the model_visibilities from the dataset
@@ -214,48 +218,63 @@ def log_likelihood_gridded(modelVisibilityCube, griddedDataset):
     )
 
 
-def nll_gridded(modelVisibilityCube, griddedDataset):
+def nll_gridded(
+    modelVisibilityCube: torch.Tensor, griddedDataset: GriddedDataset
+) -> torch.Tensor:
     r"""
+
     Calculate a normalized "negative log likelihood" (corresponding to
     :func:`~mpol.losses.nll`) using gridded data and model visibilities. Function will
     return the same value regardless of whether Hermitian pairs are included.
 
-    Args:
-        vis (torch complex tensor): torch tensor with shape ``(nchan, npix, npix)`` to
-            be indexed by the ``mask`` from :class:`~mpol.datasets.GriddedDataset`.
-            Assumes tensor is "pre-packed," as in output from
-            :meth:`mpol.fourier.FourierCube.forward()`.
-        griddedDataset: instantiated :class:`~mpol.datasets.GriddedDataset` object
+    Parameters
+    ----------
+    modelVisibilityCube : :class:`torch.Tensor` of :class:`torch.complex`
+        torch tensor with shape ``(nchan, npix, npix)`` to be indexed by the
+        ``mask`` from :class:`~mpol.datasets.GriddedDataset`. Assumes tensor is
+        "pre-packed," as in output from :meth:`mpol.fourier.FourierCube.forward()`.
+    griddedDataset : :class:`~mpol.datasets.GriddedDataset` object
+        the gridded dataset, most likely produced from
+        :meth:`mpol.gridding.DataAverager.to_pytorch_dataset`
 
-    Returns:
-        torch.double: the normalized negative log likelihood likelihood loss
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the normalized negative log likelihood loss, summed over all input
+        values
     """
     model_vis = griddedDataset(modelVisibilityCube)
 
     return nll(model_vis, griddedDataset.vis_indexed, griddedDataset.weight_indexed)
 
 
-def entropy(cube, prior_intensity, tot_flux=10):
+def entropy(
+    cube: torch.Tensor, prior_intensity: torch.Tensor, tot_flux: float = 10
+) -> torch.Tensor:
     r"""
     Calculate the entropy loss of a set of pixels following the definition in
     `EHT-IV 2019 <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_.
 
-    Args:
-        cube (any tensor): pixel values must be positive :math:`I_i > 0`
-            for all :math:`i`
-        prior_intensity (any tensor): the prior value :math:`p` to calculate entropy
-            against. Could be a single constant or an array the same shape as image.
-        tot_flux (float): a fixed normalization factor; the user-defined target total
-            flux density
-
-    Returns:
-        torch.double: entropy loss
-
-    The entropy loss is calculated as
-
     .. math::
 
         L = \frac{1}{\zeta} \sum_i I_i \; \ln \frac{I_i}{p_i}
+
+    Parameters
+    ----------
+    cube : :class:`torch.Tensor` of :class:`torch.double`
+        pixel values must be positive :math:`I_i > 0` for all :math:`i`
+    prior_intensity : :class:`torch.Tensor` of :class:`torch.double`
+        the prior value :math:`p` to calculate entropy against. Tensors of any shape
+        are allowed so long as they will broadcast to the shape of the cube under
+        division (``/``).
+    tot_flux : float
+        a fixed normalization factor; the user-defined target total flux density, in
+        units of Jy.
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        entropy loss
     """
     # check to make sure image is positive, otherwise raise an error
     assert (cube >= 0.0).all(), "image cube contained negative pixel values"
@@ -265,31 +284,35 @@ def entropy(cube, prior_intensity, tot_flux=10):
     return (1 / tot_flux) * torch.sum(cube * torch.log(cube / prior_intensity))
 
 
-def TV_image(sky_cube, epsilon=1e-10):
+def TV_image(sky_cube: torch.Tensor, epsilon: float = 1e-10) -> torch.Tensor:
     r"""
     Calculate the total variation (TV) loss in the image dimension (R.A. and DEC).
     Following the definition in `EHT-IV 2019
     <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the
     image to be piecewise smooth and the gradient of the image to be sparse.
 
-    Args:
-        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l`
-            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
-            :math:`v` is the channel (velocity or frequency) dimension in
-            :math:`ndim=1`. Should be in sky format representation.
-        epsilon (float): a softening parameter in
-            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any pixel-to-pixel variations
-            within each image slice greater than this parameter will have a
-            significant penalty.
-
-    Returns:
-        torch.double: total variation loss
-
     .. math::
 
         L = \sum_{l,m,v} \sqrt{(I_{l + 1, m, v} - I_{l,m,v})^2 +
             (I_{l, m+1, v} - I_{l, m, v})^2 + \epsilon}
 
+
+    Parameters
+    ----------
+    sky_cube : 3D :class:`torch.Tensor` of :class:`torch.double`
+        the image cube array :math:`I_{lmv}`, where :math:`l`
+        is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
+        :math:`v` is the channel (velocity or frequency) dimension in
+        :math:`ndim=1`. Should be in sky format representation.
+    epsilon : float
+        a softening parameter in units of [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`].
+        Any pixel-to-pixel variations within each image North-South or East-West
+        slice greater than this parameter will incur a significant penalty.
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        total variation loss
     """
 
     # diff the cube in ll and remove the last row
@@ -303,25 +326,29 @@ def TV_image(sky_cube, epsilon=1e-10):
     return loss
 
 
-def TV_channel(cube, epsilon=1e-10):
+def TV_channel(cube: torch.Tensor, epsilon: float = 1e-10) -> torch.Tensor:
     r"""
-    Calculate the total variation (TV) loss in the channel dimension. Following the
-    definition in `EHT-IV 2019
-    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_.
-
-    Args:
-        cube (any 3D tensor): the image cube array :math:`I_{lmv}`
-        epsilon (float): a softening parameter in
-            [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`]. Any channel-to-channel pixel
-            variations greater than this parameter will have a significant penalty.
-
-    Returns:
-        torch.double: total variation loss
+    Calculate the total variation (TV) loss in the channel (first) dimension.
+    Following the definition in `EHT-IV 2019
+    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_, calculate
 
     .. math::
 
         L = \sum_{l,m,v} \sqrt{(I_{l, m, v + 1} - I_{l,m,v})^2 + \epsilon}
 
+    Parameters
+    ----------
+    cube : :class:`torch.Tensor` of :class:`torch.double`
+        the image cube array :math:`I_{lmv}`
+    epsilon : float
+        a softening parameter in units of [:math:`\mathrm{Jy}/\mathrm{arcsec}^2`].
+        Any channel-to-channel pixel variations greater than this parameter will incur
+        a significant penalty.
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        total variation loss
     """
     # calculate the difference between the n+1 cube and the n cube
     diff_vel = cube[1:] - cube[0:-1]
@@ -330,48 +357,71 @@ def TV_channel(cube, epsilon=1e-10):
     return loss
 
 
-def edge_clamp(cube):
+def TSV(sky_cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Promote all pixels at the edge of the image to be zero using an :math:`L_2` norm.
+    Calculate the total square variation (TSV) loss in the image dimension
+    (R.A. and DEC), following the definition in `EHT-IV 2019
+    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_. Promotes the
+    image to be edge smoothed, which may be a better representation of the truth image
+    (`K. Kuramochi et al 2018
+    <https://ui.adsabs.harvard.edu/abs/2018ApJ...858...56K/abstract>`_).
 
-    Args:
-        cube (any 3D tensor): the array and pixel values
+    .. math::
+
+        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 +
+        (I_{l, m+1, v} - I_{l, m, v})^2
+
+    Parameters
+    ----------
+    sky_cube : :class:`torch.Tensor` of :class:`torch.double`
+        the image cube array :math:`I_{lmv}`, where :math:`l`
+        is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
+        :math:`v` is the channel (velocity or frequency) dimension in
+        :math:`ndim=1`. Should be in sky format representation.
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        total square variation loss
 
-    Returns:
-        torch.double: edge loss
     """
 
-    # find edge pixels
-    # all channels
-    # pixel edges
-    bt_edges = cube[:, (0, -1)]
-    lr_edges = cube[:, :, (0, -1)]
+    # diff the cube in ll and remove the last row
+    diff_ll = sky_cube[:, 0:-1, 1:] - sky_cube[:, 0:-1, 0:-1]
 
-    loss = torch.sum(bt_edges**2) + torch.sum(lr_edges**2)
+    # diff the cube in mm and remove the last column
+    diff_mm = sky_cube[:, 1:, 0:-1] - sky_cube[:, 0:-1, 0:-1]
+
+    loss = torch.sum(diff_ll**2 + diff_mm**2)
 
     return loss
 
 
-def sparsity(cube, mask=None):
+def sparsity(cube: torch.Tensor, mask: Optional[torch.Tensor] = None) -> torch.Tensor:
     r"""
     Enforce a sparsity prior on the image cube using the :math:`L_1` norm. Optionally
     provide a boolean mask to apply the prior to only the ``True`` locations. For
     example, you might want this mask to be ``True`` for background regions.
 
-    Args:
-        cube (nchan, npix, npix): tensor image cube
-        mask (boolean): tensor array the same shape as ``cube``. The sparsity prior
-            will be applied to those pixels where the mask is ``True``. Default is
-            to apply prior to all pixels.
-
-    Returns:
-        torch.double: sparsity loss calculated where ``mask == True``
-
     The sparsity loss calculated as
 
     .. math::
 
         L = \sum_i | I_i |
+
+    Parameters
+    ----------
+    cube : :class:`torch.Tensor` of :class:`torch.double`
+        the image cube array :math:`I_{lmv}`
+    mask : :class:`torch.Tensor` of :class:`torch.bool`
+        tensor array the same shape as ``cube``. The sparsity prior
+        will be applied to those pixels where the mask is ``True``. Default is
+        to apply prior to all pixels.
+
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        sparsity loss calculated where ``mask == True``
     """
 
     if mask is not None:
@@ -382,19 +432,27 @@ def sparsity(cube, mask=None):
     return loss
 
 
-def UV_sparsity(vis, qs, q_max):
+def UV_sparsity(
+    vis: torch.Tensor, qs: torch.Tensor, q_max: float
+) -> torch.Tensor:
     r"""
     Enforce a sparsity prior for all :math:`q = \sqrt{u^2 + v^2}` points larger than
     :math:`q_\mathrm{max}`.
 
-    Args:
-        vis (torch.double) : visibility cube of (nchan, npix, npix//2 +1, 2)
-        qs: numpy array corresponding to visibility coordinates. Dimensionality of
-            (npix, npix//2)
-        q_max (float): maximum radial baseline
+    Parameters
+    ----------
+    vis : :class:`torch.Tensor` of :class:`torch.complex128`
+        visibility cube of (nchan, npix, npix//2 +1, 2)
+    qs : :class:`torch.Tensor` of :class:`torch.float64`
+        array corresponding to visibility coordinates. Dimensionality of
+        (npix, npix//2)
+    q_max : float
+        maximum radial baseline
 
-    Returns:
-        torch.double: UV sparsity loss above :math:`q_\mathrm{max}`
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        UV sparsity loss above :math:`q_\mathrm{max}`
     """
 
     # make a mask, then send it to the device (in case we're using a GPU)
@@ -413,7 +471,7 @@ def UV_sparsity(vis, qs, q_max):
     return loss
 
 
-def PSD(qs, psd, l):
+def PSD(qs: torch.Tensor, psd: torch.Tensor, l: torch.Tensor) -> torch.Tensor:
     r"""
     Apply a loss function corresponding to the power spectral density using a Gaussian
     process kernel.
@@ -422,29 +480,35 @@ def PSD(qs, psd, l):
 
     .. math::
 
-        k(r) = exp(-\frac{r^2}{2 \ell^2})
+        k(r) = \exp(-\frac{r^2}{2 \ell^2})
 
     The corresponding power spectral density is
 
     .. math::
 
-        P(q) = (2 \pi \ell^2) exp(- 2 \pi^2 \ell^2 q^2)
+        P(q) = (2 \pi \ell^2) \exp(- 2 \pi^2 \ell^2 q^2)
 
 
-    Args:
-        qs (torch.double): the radial UV coordinate (in kilolambda)
-        psd (torch.double): the power spectral density cube
-        l (torch.double): the correlation length in the image plane (in arcsec)
+    Parameters
+    ----------
+    qs : :class:`torch.Tensor` of :class:`torch.double`
+        the radial UV coordinate (in kilolambda)
+    psd : :class:`torch.Tensor` of :class:`torch.double`
+        the power spectral density cube
+    l : :class:`torch.Tensor` of :class:`torch.double`
+        the correlation length in the image plane (in arcsec)
 
-    Returns:
-        torch.double : the loss calculated using the power spectral density
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        the loss calculated using the power spectral density
 
     """
 
     # stack to the full 3D shape
     qs = qs * 1e3  # lambda
 
-    l_rad = l * arcsec  # radians
+    l_rad = l * constants.arcsec  # radians
 
     # calculate the expected power spectral density
     expected_PSD = (
@@ -457,37 +521,27 @@ def PSD(qs, psd, l):
     return loss
 
 
-def TSV(sky_cube):
+def edge_clamp(cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Calculate the total square variation (TSV) loss in the image dimension
-    (R.A. and DEC). Following the definition in `EHT-IV 2019
-    <https://ui.adsabs.harvard.edu/abs/2019ApJ...875L...4E/abstract>`_ Promotes the
-    image to be edge smoothed which may be a better reoresentation of the truth image
-    `K. Kuramochi et al 2018
-    <https://ui.adsabs.harvard.edu/abs/2018ApJ...858...56K/abstract>`_.
-
-    Args:
-        sky_cube (any 3D tensor): the image cube array :math:`I_{lmv}`, where :math:`l`
-            is R.A. in :math:`ndim=3`, :math:`m` is DEC in :math:`ndim=2`, and
-            :math:`v` is the channel (velocity or frequency) dimension in
-            :math:`ndim=1`. Should be in sky format representation.
-
-    Returns:
-        torch.double: total square variation loss
-
-    .. math::
+    Promote all pixels at the edge of the image to be zero using an :math:`L_2` norm.
 
-        L = \sum_{l,m,v} (I_{l + 1, m, v} - I_{l,m,v})^2 +
-        (I_{l, m+1, v} - I_{l, m, v})^2
+    Parameters
+    ----------
+    cube : :class:`torch.Tensor` of :class:`torch.double`
+        the image cube array :math:`I_{lmv}`
 
+    Returns
+    -------
+    :class:`torch.Tensor` of :class:`torch.double`
+        edge loss
     """
 
-    # diff the cube in ll and remove the last row
-    diff_ll = sky_cube[:, 0:-1, 1:] - sky_cube[:, 0:-1, 0:-1]
-
-    # diff the cube in mm and remove the last column
-    diff_mm = sky_cube[:, 1:, 0:-1] - sky_cube[:, 0:-1, 0:-1]
+    # find edge pixels
+    # all channels
+    # pixel edges
+    bt_edges = cube[:, (0, -1)]
+    lr_edges = cube[:, :, (0, -1)]
 
-    loss = torch.sum(diff_ll**2 + diff_mm**2)
+    loss = torch.sum(bt_edges**2) + torch.sum(lr_edges**2)
 
     return loss
diff --git a/src/mpol/precomposed.py b/src/mpol/precomposed.py
index b774f125..c7e3b539 100644
--- a/src/mpol/precomposed.py
+++ b/src/mpol/precomposed.py
@@ -2,7 +2,8 @@
 
 from mpol.coordinates import GridCoords
 
-from . import fourier, images
+from mpol import fourier
+from mpol import images
 
 
 class SimpleNet(torch.nn.Module):

From 0ba0b5ceaf9941ce0a9f80e8292026d108b98906 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 15:13:12 +0000
Subject: [PATCH 07/26] added disallow_untyped_defs for losses to prevent
 regression from 100% typed.

---
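Note (not part of the commit): a minimal sketch of what the
`disallow_untyped_defs` override is meant to guard against. mypy performs this
check statically; the runtime helper `is_fully_typed` and the two toy loss
functions below are hypothetical, written only to illustrate the rule, and are
not part of MPoL or mypy.

```python
import inspect
from typing import Callable, List, Optional


def is_fully_typed(func: Callable[..., object]) -> bool:
    """Return True if every parameter and the return value carry an annotation.

    Hypothetical illustration only. Unlike mypy, it does not exempt ``self``,
    so it is meant for plain functions.
    """
    sig = inspect.signature(func)
    params_ok = all(
        p.annotation is not inspect.Parameter.empty
        for p in sig.parameters.values()
    )
    return params_ok and sig.return_annotation is not inspect.Signature.empty


def untyped_loss(cube, mask=None):  # the kind of definition the override rejects
    return 0.0


def typed_loss(cube: List[float], mask: Optional[List[bool]] = None) -> float:
    return 0.0


print(is_fully_typed(untyped_loss))  # False
print(is_fully_typed(typed_loss))  # True
```
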
 backup.mypy.ini | 21 ---------------------
 pyproject.toml  |  6 +++++-
 2 files changed, 5 insertions(+), 22 deletions(-)
 delete mode 100644 backup.mypy.ini

diff --git a/backup.mypy.ini b/backup.mypy.ini
deleted file mode 100644
index 19b0da10..00000000
--- a/backup.mypy.ini
+++ /dev/null
@@ -1,21 +0,0 @@
-[mypy]
-warn_return_any = True
-warn_unused_configs = True
-
-[mypy-astropy.*]
-ignore_missing_imports = True
-
-[mypy-matplotlib.*]
-ignore_missing_imports = True
-
-[mypy-scipy.*]
-ignore_missing_imports = True
-
-[mypy-torchkbnufft.*]
-ignore_missing_imports = True
-
-[mypy-frank.*]
-ignore_missing_imports = True
-
-[mypy-fast_histogram.*]
-ignore_missing_imports = True
\ No newline at end of file
diff --git a/pyproject.toml b/pyproject.toml
index e2274977..d6a05c47 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -87,4 +87,8 @@ module = [
     "frank.*",
     "fast_histogram.*"
 ]
-ignore_missing_imports = true
\ No newline at end of file
+ignore_missing_imports = true
+
+[[tool.mypy.overrides]]
+module = "MPoL.losses"
+disallow_untyped_defs = true
\ No newline at end of file

From 5d05caaf7bef5d1cff8257768e0d16ed9dfa4ed5 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 15:20:29 +0000
Subject: [PATCH 08/26] removed unnecessary imports for datasets.

---
 src/mpol/coordinates.py | 3 +--
 src/mpol/datasets.py    | 9 ++-------
 2 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/src/mpol/coordinates.py b/src/mpol/coordinates.py
index cfa92b2d..e52b6b6f 100644
--- a/src/mpol/coordinates.py
+++ b/src/mpol/coordinates.py
@@ -8,8 +8,7 @@
 
 import mpol.constants as const
 from mpol.exceptions import CellSizeError
-
-from .utils import get_max_spatial_freq, get_maximum_cell_size
+from mpol.utils import get_max_spatial_freq, get_maximum_cell_size
 
 
 class GridCoords:
diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index c841617a..cdb2486d 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -1,20 +1,15 @@
 from __future__ import annotations
 
-import copy
 from typing import Any
 
 import numpy as np
 import torch
-import torch.utils.data as torch_ud
 from numpy import floating, integer
 from numpy.typing import ArrayLike, NDArray
 
 from mpol.coordinates import GridCoords
-from mpol.exceptions import WrongDimensionError
 
-from . import utils
-from .constants import *
-from .utils import loglinspace
+from mpol import utils
 
 
 class GriddedDataset(torch.nn.Module):
@@ -224,7 +219,7 @@ def __init__(
             # ---
             # We aren't doing *quite* the same thing,
             # just logspacing with a few linear cells at the start.
-            q_edges = loglinspace(0, self.q_max, N_log=8, M_linear=5)
+            q_edges = utils.loglinspace(0, self.q_max, N_log=8, M_linear=5)
 
         self.q_edges = q_edges
         self.phi_edges = phi_edges

From b0dc5772da8f99e08b2eb4037957dabed3f7e688 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 15:23:29 +0000
Subject: [PATCH 09/26] converted GriddedDataset docstring to numpydoc format.

---
 src/mpol/datasets.py | 28 ++++++++++++++++------------
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index cdb2486d..50c78985 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -14,18 +14,22 @@
 
 class GriddedDataset(torch.nn.Module):
     r"""
-    Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class.
-            If providing this, cannot provide ``cell_size`` or ``npix``.
-        vis_gridded (torch complex): the gridded visibility data stored in a "packed"
-            format (pre-shifted for fft)
-        weight_gridded (torch double): the weights corresponding to the gridded
-            visibility data, also in a packed format
-        mask (torch boolean): a boolean mask to index the non-zero locations of
-            ``vis_gridded`` and ``weight_gridded`` in their packed format.
-        nchan (int): the number of channels in the image (default = 1).
-        device (torch.device): the desired device of the dataset. If ``None``,
-            defaults to current device.
+    Parameters
+    ----------
+    coords : :class:`~mpol.coordinates.GridCoords`
+        If providing this, cannot provide ``cell_size`` or ``npix``.
+    vis_gridded : :class:`torch.Tensor` of :class:`torch.complex128`
+        the gridded visibility data stored in a "packed" format (pre-shifted for fft)
+    weight_gridded : :class:`torch.Tensor` of :class:`torch.double`
+        the weights corresponding to the gridded visibility data,
+        also in a packed format
+    mask : :class:`torch.Tensor` of :class:`torch.bool`
+        a boolean mask to index the non-zero locations of ``vis_gridded`` and
+        ``weight_gridded`` in their packed format.
+    nchan : int
+        the number of channels in the image (default = 1).
+    device : :class:`torch.device`
+        the desired device of the dataset. If ``None``, defaults to current device.
 
     After initialization, the GriddedDataset provides the non-zero cells of the
     gridded visibilities and weights as a 1D vector via the following instance

From 24c2db3daa6b24ec1f612491ab81038fd266f9f4 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 16:31:22 +0000
Subject: [PATCH 10/26] return type for from_image_properties.

---
 src/mpol/datasets.py | 41 ++++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index 50c78985..ec0bb623 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -28,16 +28,17 @@ class GriddedDataset(torch.nn.Module):
         ``weight_gridded`` in their packed format.
     nchan : int
         the number of channels in the image (default = 1).
-    device : :class:`torch.device`
-        the desired device of the dataset. If ``None``, defaults to current device.
+
 
     After initialization, the GriddedDataset provides the non-zero cells of the
     gridded visibilities and weights as a 1D vector via the following instance
     variables. This means that any individual channel information has been collapsed.
 
     :ivar vis_indexed: 1D complex tensor of visibility data
+
     :ivar weight_indexed: 1D tensor of weight values
 
+
     If you index the output of the Fourier layer in the same manner using ``self.mask``,
     then the model and data visibilities can be directly compared using a loss function.
     """
@@ -82,21 +83,27 @@ def from_image_properties(
         weight_gridded: torch.Tensor,
         mask: torch.Tensor,
         nchan: int = 1,
-        device: torch.device | str | None = None,
-    ):
-        """Alternative method to instantiate a GriddedDataset object from cell_size
-        and npix.
-
-        Args:
-            cell_size (float): the width of a pixel [arcseconds]
-            npix (int): the number of pixels per image side
-            vis_gridded (torch complex): the gridded visibility data stored in a
-                "packed" format (pre-shifted for fft)
-            weight_gridded (torch double): the weights corresponding to the gridded
-                visibility data, also in a packed format
-            mask (torch boolean): a boolean mask to index the non-zero locations of
-                ``vis_gridded`` and ``weight_gridded`` in their packed format.
-            nchan (int): the number of channels in the image (default = 1).
+    ) -> GriddedDataset:
+        """
+        Alternative method to instantiate a GriddedDataset object from ``cell_size``
+        and ``npix``.
+
+        Parameters
+        ----------
+        cell_size : float
+            the width of a pixel [arcseconds]
+        npix : int
+            the number of pixels per image side
+        vis_gridded : :class:`torch.Tensor` of :class:`torch.complex128`
+            the gridded visibility data stored in a "packed" format (pre-shifted for fft)
+        weight_gridded : :class:`torch.Tensor` of :class:`torch.double`
+            the weights corresponding to the gridded visibility data,
+            also in a packed format
+        mask : :class:`torch.Tensor` of :class:`torch.bool`
+            a boolean mask to index the non-zero locations of ``vis_gridded`` and
+            ``weight_gridded`` in their packed format.
+        nchan : int
+            the number of channels in the image (default = 1).
         """
         return cls(
             coords=GridCoords(cell_size, npix),

From bad165cef808b16ce713de7a1da94d01ea2d6df6 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 18:33:50 +0000
Subject: [PATCH 11/26] locking down additional modules as 100% coverage.

---
 pyproject.toml      |  6 +++++-
 src/mpol/fourier.py | 26 +++++++++++---------------
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/pyproject.toml b/pyproject.toml
index d6a05c47..637b0b5f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -90,5 +90,9 @@ module = [
 ignore_missing_imports = true
 
 [[tool.mypy.overrides]]
-module = "MPoL.losses"
+module = [
+    "MPol.constants",
+    "MPoL.losses",
+    "MPoL.datasets"
+]
 disallow_untyped_defs = true
\ No newline at end of file
diff --git a/src/mpol/fourier.py b/src/mpol/fourier.py
index d3ef9a74..4d9a41c0 100644
--- a/src/mpol/fourier.py
+++ b/src/mpol/fourier.py
@@ -17,8 +17,8 @@
 from mpol.images import ImageCube
 from mpol.protocols import MPoLModel
 
-from . import utils
-from .coordinates import GridCoords
+from mpol import utils
+from mpol.coordinates import GridCoords
 
 
 class FourierCube(nn.Module):
@@ -30,25 +30,21 @@ class FourierCube(nn.Module):
     :func:`mpol.losses.nll_gridded`) and a gridded dataset (e.g.,
     :class:`mpol.datasets.GriddedDataset`).
 
-    Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class.
-        persistent_vis (Boolean): should the visibility cube be stored as part of 
-            the module  s `state_dict`? If `True`, the state of the UV grid will be 
-            stored. It is recommended to use `False` for most applications, since the 
-            visibility cube will rarely be a direct parameter of the model.
+    Parameters
+    ----------
+    coords : :class:`~mpol.coordinates.GridCoords` 
+        object containing image dimensions
+    persistent_vis : bool
+        should the visibility cube be stored as part of 
+        the module  s `state_dict`? If `True`, the state of the UV grid will be 
+        stored. It is recommended to use `False` for most applications, since the 
+        visibility cube will rarely be a direct parameter of the model.
 
     """
 
     def __init__(self, coords: GridCoords, persistent_vis: bool = False):
         super().__init__()
 
-        # TODO: Is this comment relevant? There was no nchan instantiation
-        # before
-        # ---
-        # we don't want to bother with the nchan argument here, so
-        # we don't use the convenience method _setup_coords
-        # and just do it manually
-
         self.coords = coords
 
         self.register_buffer("vis", None, persistent=persistent_vis)

From 86a9a909d476ae986e5df9da6b0b5ddf50bf742e Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 21:01:01 +0000
Subject: [PATCH 12/26] update tested Python versions to be compatible with
 minimum version of Spec 0.

---
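Note (not part of the commit): version strings in workflow files are safest
when quoted. If a loader resolves an unquoted `3.10` as a number, it becomes
indistinguishable from `3.1`. The sketch below shows only that numeric
collapse in plain Python; it does not invoke a YAML parser, and
`as_unquoted_scalar` is a hypothetical stand-in for the loader's behaviour.

```python
def as_unquoted_scalar(text: str) -> str:
    """Mimic reading a numeric scalar as a float and printing it back."""
    return str(float(text))


print(as_unquoted_scalar("3.8"))  # 3.8
print(as_unquoted_scalar("3.10"))  # 3.1
```
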
 .github/workflows/docs_build.yml  |   2 +-
 .github/workflows/gh_docs.yml     |   2 +-
 .github/workflows/package.yml     |   2 +-
 .github/workflows/pre-release.yml |   2 +-
 .github/workflows/tests.yml       |   2 +-
 pyproject.toml                    |   3 +-
 src/mpol/fourier.py               | 126 +++++++++++++++++-------------
 src/mpol/utils.py                 |  19 +++--
 8 files changed, 93 insertions(+), 65 deletions(-)

diff --git a/.github/workflows/docs_build.yml b/.github/workflows/docs_build.yml
index 7bfef484..c5990b19 100644
--- a/.github/workflows/docs_build.yml
+++ b/.github/workflows/docs_build.yml
@@ -20,7 +20,7 @@ jobs:
     strategy:
       max-parallel: 4
       matrix:
-        python-version: ["3.8"]
+        python-version: ["3.10"]
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python
diff --git a/.github/workflows/gh_docs.yml b/.github/workflows/gh_docs.yml
index 9dac8dd1..71b9a8f5 100644
--- a/.github/workflows/gh_docs.yml
+++ b/.github/workflows/gh_docs.yml
@@ -13,7 +13,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: '3.8'
+          python-version: '3.10'
       - name: Install doc deps
         run: |
           pip install .'[dev]'
diff --git a/.github/workflows/package.yml b/.github/workflows/package.yml
index 981243dc..0b7b9fea 100644
--- a/.github/workflows/package.yml
+++ b/.github/workflows/package.yml
@@ -14,7 +14,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: 3.8
+          python-version: "3.10"
       - name: Install dependencies
         run: |
           pip install --upgrade pip
diff --git a/.github/workflows/pre-release.yml b/.github/workflows/pre-release.yml
index 7ac21d18..ed50359a 100644
--- a/.github/workflows/pre-release.yml
+++ b/.github/workflows/pre-release.yml
@@ -37,7 +37,7 @@ jobs:
       fail-fast: false
       max-parallel: 4
       matrix:
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.10", "3.11", "3.12"]
         os: [ubuntu-20.04, macOS-latest, windows-latest]
     steps:
       - uses: actions/checkout@v3
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
index b7929902..88070c1a 100644
--- a/.github/workflows/tests.yml
+++ b/.github/workflows/tests.yml
@@ -40,7 +40,7 @@ jobs:
     strategy:
       max-parallel: 4
       matrix:
-        python-version: ["3.8", "3.9", "3.10"]
+        python-version: ["3.10", "3.11", "3.12"]
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
diff --git a/pyproject.toml b/pyproject.toml
index 637b0b5f..8a62a9b7 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -37,7 +37,7 @@ dev = [
     "frank>=1.2.1",
     "sphinx>=5.3.0",
     "jupytext",
-    "ipython!=8.7.0",               # broken version for syntax higlight https://github.com/spatialaudio/nbsphinx/issues/687
+    "ipython!=8.7.0", # broken version for syntax highlight https://github.com/spatialaudio/nbsphinx/issues/687
     "nbsphinx",
     "sphinx_book_theme>=0.9.3",
     "sphinx_copybutton",
@@ -93,6 +93,7 @@ ignore_missing_imports = true
 module = [
     "MPol.constants",
     "MPoL.losses",
+    # "MPoL.fourier", # once we remove get_vis_residuals
     "MPoL.datasets"
 ]
 disallow_untyped_defs = true
\ No newline at end of file
diff --git a/src/mpol/fourier.py b/src/mpol/fourier.py
index 4d9a41c0..5ce6b4cf 100644
--- a/src/mpol/fourier.py
+++ b/src/mpol/fourier.py
@@ -32,12 +32,12 @@ class FourierCube(nn.Module):
 
     Parameters
     ----------
-    coords : :class:`~mpol.coordinates.GridCoords` 
+    coords : :class:`~mpol.coordinates.GridCoords`
         object containing image dimensions
     persistent_vis : bool
-        should the visibility cube be stored as part of 
-        the module  s `state_dict`? If `True`, the state of the UV grid will be 
-        stored. It is recommended to use `False` for most applications, since the 
+        should the visibility cube be stored as part of
+        the module's `state_dict`? If `True`, the state of the UV grid will be
+        stored. It is recommended to use `False` for most applications, since the
         visibility cube will rarely be a direct parameter of the model.
 
     """
@@ -58,16 +58,21 @@ def from_image_properties(
         Alternative method for instantiating a FourierCube from ``cell_size`` and
         ``npix``
 
-        Args:
-            cell_size (float): the width of an image-plane pixel [arcseconds]
-            npix (int): the number of pixels per image side
-            persistent_vis (Boolean): should the visibility cube be stored as part of
-                the modules `state_dict`? If `True`, the state of the UV grid will be
-                stored. It is recommended to use `False` for most applications, since
-                the visibility cube will rarely be a direct parameter of the model.
-
-        Returns:
-            instantiated :class:`mpol.fourier.FourierCube` object.
+        Parameters
+        ----------
+        cell_size : float
+            the width of an image-plane pixel [arcseconds]
+        npix : int
+            the number of pixels per image side
+        persistent_vis : bool
+            should the visibility cube be stored as part of
+            the module's `state_dict`? If `True`, the state of the UV grid will be
+            stored. It is recommended to use `False` for most applications, since
+            the visibility cube will rarely be a direct parameter of the model.
+
+        Returns
+        -------
+        :class:`mpol.fourier.FourierCube`
         """
         coords = GridCoords(cell_size, npix)
         return cls(coords, persistent_vis)
@@ -76,14 +81,16 @@ def forward(self, cube: torch.Tensor) -> torch.Tensor:
         """
         Perform the FFT of the image cube on each channel.
 
-        Args:
-            cube (torch.double tensor): a prepacked tensor of shape 
-                ``(nchan, npix, npix)``. For example, an image cube 
-                from ImageCube.forward()
+        Parameters
+        ----------
+        cube : :class:`torch.Tensor` of :class:`torch.double` of shape ``(nchan, npix, npix)``
+            A 'packed' tensor. For example, an image cube from
+            :meth:`mpol.images.ImageCube.forward`
 
-        Returns:
-            torch.complex tensor, of shape ``(nchan, npix, npix)``. The FFT of the
-                image cube, in packed format.
+        Returns
+        -------
+        :class:`torch.Tensor` of :class:`torch.complex128` of shape ``(nchan, npix, npix)``
+            The FFT of the image cube, in packed format.
         """
 
         # make sure the cube is 3D
@@ -102,9 +109,11 @@ def ground_vis(self) -> torch.Tensor:
         The visibility cube in ground format cube fftshifted for plotting with
         ``imshow``.
 
-        Returns:
-            (torch.complex tensor, of shape ``(nchan, npix, npix)``): the FFT of the
-                image cube, in sky plane format.
+        Returns
+        -------
+        :class:`torch.Tensor` of :class:`torch.complex128` of shape ``(nchan, npix, npix)``
+            complex-valued FFT of the image cube (i.e., the visibility cube), in
+            'ground' format.
         """
 
         return utils.packed_cube_to_ground_cube(self.vis)
@@ -115,8 +124,10 @@ def ground_amp(self) -> torch.Tensor:
         The amplitude of the cube, arranged in unpacked format corresponding to the FFT
         of the sky_cube. Array dimensions for plotting given by ``self.coords.vis_ext``.
 
-        Returns:
-            torch.double : 3D amplitude cube of shape ``(nchan, npix, npix)``
+        Returns
+        -------
+        :class:`torch.Tensor` of :class:`torch.double` of shape ``(nchan, npix, npix)``
+            amplitude cube in 'ground' format.
         """
         return torch.abs(self.ground_vis)
 
@@ -126,8 +137,10 @@ def ground_phase(self) -> torch.Tensor:
         The phase of the cube, arranged in unpacked format corresponding to the FFT of
         the sky_cube. Array dimensions for plotting given by ``self.coords.vis_ext``.
 
-        Returns:
-            torch.double : 3D phase cube of shape ``(nchan, npix, npix)``
+        Returns
+        -------
+        :class:`torch.Tensor` of :class:`torch.double` of shape ``(nchan, npix, npix)``
+            phase cube in 'ground' format (:math:`[-\pi,\pi)`).
         """
         return torch.angle(self.ground_vis)
 
@@ -161,22 +174,29 @@ def safe_baseline_constant_meters(
     If this function returns ``True``, then it would be safe to proceed with
     parallelization in the :class:`mpol.fourier.NuFFT` layer via the coil dimension.
 
-    Args:
-        uu (1D np.array): a 1D array of length ``nvis`` array of the u (East-West)
-            spatial frequency coordinate in units of [m]
-        vv (1D np.array): a 1D array of length ``nvis`` array of the v (North-South)
-            spatial frequency coordinate in units of [m]
-        freqs (1D np.array): a 1D array of length ``nchan`` of the channel frequencies,
-            in units of [Hz].
-        coords: a :class:`mpol.coordinates.GridCoords` object which represents the image
-            and uv-grid dimensions.
-        uv_cell_frac (float): the maximum threshold for a change in :math:`u` or
-            :math:`v` spatial frequency across channels, measured as a fraction of the
-            :math:`u,v` cell defined by ``coords``.
+    Parameters
+    ----------
+    uu : 1D np.array
+        a 1D array of length ``nvis`` of the u (East-West)
+        spatial frequency coordinate in units of [m]
+    vv : 1D np.array
+        a 1D array of length ``nvis`` of the v (North-South)
+        spatial frequency coordinate in units of [m]
+    freqs : 1D np.array
+        a 1D array of length ``nchan`` of the channel frequencies,
+        in units of [Hz].
+    coords : :class:`mpol.coordinates.GridCoords`
+        object which represents the image and uv-grid dimensions.
+    uv_cell_frac : float
+        the maximum threshold for a change in :math:`u` or
+        :math:`v` spatial frequency across channels, measured as a fraction of the
+        :math:`u,v` cell defined by ``coords``.
 
-    Returns:
-        boolean: `True` if it is safe to assume that the baselines are constant with
-            channel (at a tolerance of ``uv_cell_frac``.) Otherwise returns `False`.
+    Returns
+    -------
+    bool
+        `True` if it is safe to assume that the baselines are constant with
+        channel (at a tolerance of ``uv_cell_frac``). Otherwise returns `False`.
     """
 
     # broadcast and convert baselines to kilolambda across channel
@@ -378,13 +398,13 @@ def _assemble_ktraj(self, uu: torch.Tensor, vv: torch.Tensor) -> torch.Tensor:
         vector will influence how TorchKbNufft will perform the operations.
 
         * If ``uu`` and ``vv`` have a 1D shape of (``nvis``), then it will be assumed
-            that the spatial frequencies can be treated as constant with channel. This 
+            that the spatial frequencies can be treated as constant with channel. This
             will result in a ``k_traj`` vector that has shape (``2, nvis``), such that
             parallelization will be across the image cube ``nchan`` dimension using the
             'coil' dimension of the TorchKbNufft package.
         * If the ``uu`` and ``vv`` have a 2D shape of (``nchan, nvis``), then it will
-            be assumed that the spatial frequencies are different for each channel, and 
-            the spatial frequencies provided for each channel will be used. This will 
+            be assumed that the spatial frequencies are different for each channel, and
+            the spatial frequencies provided for each channel will be used. This will
             result in a ``k_traj`` vector that has shape (``nchan, 2, nvis``), such that
             parallelization will be across the image cube ``nchan`` dimension using the
             'batch' dimension of the TorchKbNufft package.
@@ -499,8 +519,8 @@ def forward(
         constant based upon the dimensionality of the ``uu`` and ``vv`` input arguments.
 
         * If ``uu`` and ``vv`` have a shape of (``nvis``), then it will be assumed that
-            the spatial frequencies can be treated as constant with channel (and will 
-            invoke parallelization across the image cube ``nchan`` dimension using the 
+            the spatial frequencies can be treated as constant with channel (and will
+            invoke parallelization across the image cube ``nchan`` dimension using the
             'coil' dimension of the TorchKbNufft package).
         * If the ``uu`` and ``vv`` have a shape of (``nchan, nvis``), then it will be
             assumed that the spatial frequencies are different for each channel, and the
@@ -649,12 +669,12 @@ class NuFFTCached(NuFFT):
 
     * If ``uu`` and ``vv`` have a shape of (``nvis``), then it will be assumed that the
         spatial frequencies can be treated as constant with channel (and will invoke
-        parallelization across the image cube ``nchan`` dimension using the 'coil' 
+        parallelization across the image cube ``nchan`` dimension using the 'coil'
         dimension of the TorchKbNufft package).
     * If the ``uu`` and ``vv`` have a shape of (``nchan, nvis``), then it will be
-        assumed that the spatial frequencies are different for each channel, and the 
-        spatial frequencies provided for each channel will be used (and will invoke 
-        parallelization across the image cube ``nchan`` dimension using the 'batch' 
+        assumed that the spatial frequencies are different for each channel, and the
+        spatial frequencies provided for each channel will be used (and will invoke
+        parallelization across the image cube ``nchan`` dimension using the 'batch'
         dimension of the TorchKbNufft package).
 
     Note that there is no straightforward, computationally efficient way to proceed if
@@ -663,7 +683,7 @@ class NuFFTCached(NuFFT):
     (``nchan, nvis``), such that all channels are padded with bogus :math:`u,v` points
     to have the same length ``nvis``, and you create a boolean mask to keep track of
     which points are valid. Then, when this routine returns data points of shape
-    (``nchan, nvis``), you can use that boolean mask to select only the valid 
+    (``nchan, nvis``), you can use that boolean mask to select only the valid
     :math:`u,v` points.
 
     **Interpolation mode**: You may choose the type of interpolation mode that KbNufft
diff --git a/src/mpol/utils.py b/src/mpol/utils.py
index 6f24264d..95906f93 100644
--- a/src/mpol/utils.py
+++ b/src/mpol/utils.py
@@ -500,26 +500,30 @@ def fourier_gaussian_lambda_radians(u, v, a, delta_l, delta_m, sigma_l, sigma_m,
 
     .. math::
 
-        f_1(l, m) = a \exp \left(-\frac{1}{2} \left [\left(\frac{l}{\sigma_l}\right)^2 + \left( \frac{m}{\sigma_m} \right)^2 \right] \right).
+        f_1(l, m) = a \exp \left(-\frac{1}{2} \left [\left(\frac{l}{\sigma_l}\right)^2 +
+        \left( \frac{m}{\sigma_m} \right)^2 \right] \right).
 
     i.e., something we might call a normalized Gaussian function. Phrased in terms of 
     :math:`f_0`, :math:`f_1` is
 
     .. math::
 
-        f_1(l, m) = f_0\left ( \frac{l}{\sigma_l \sqrt{2 \pi}},\, \frac{m}{\sigma_m \sqrt{2 \pi}}\right).
+        f_1(l, m) = f_0\left ( \frac{l}{\sigma_l \sqrt{2 \pi}},\,
+        \frac{m}{\sigma_m \sqrt{2 \pi}}\right).
 
     Therefore, according to the similarity theorem, the equivalent :math:`F_1(u,v)` is
 
     .. math::
 
-        F_1(u, v) = \sigma_l \sigma_m 2 \pi F_0 \left( \sigma_l \sqrt{2 \pi} u,\, \sigma_m \sqrt{2 \pi} v \right),
+        F_1(u, v) = \sigma_l \sigma_m 2 \pi F_0 \left( \sigma_l \sqrt{2 \pi} u,\,
+        \sigma_m \sqrt{2 \pi} v \right),
 
     or
 
     .. math::
 
-        F_1(u, v) = a \sigma_l \sigma_m 2 \pi \exp \left ( -2 \pi^2 [\sigma_l^2 u^2 + \sigma_m^2 v^2] \right).
+        F_1(u, v) = a \sigma_l \sigma_m 2 \pi \exp \left ( -2 \pi^2 [\sigma_l^2 u^2 +
+        \sigma_m^2 v^2] \right).
 
     Next, we rotate the Gaussian to match the sky plane rotation. A rotation 
     :math:`\Omega` in the sky plane is carried out in the same direction in the 
@@ -551,11 +555,14 @@ def fourier_gaussian_lambda_radians(u, v, a, delta_l, delta_m, sigma_l, sigma_m,
 
         F_3(u,v) = \exp\left (- 2 i \pi [\delta_l u + \delta_m v] \right) F_2(u,v)
 
-    We have arrived at the corresponding Fourier Gaussian, :math:`F_\mathrm{g}(u,v) = F_3(u,v)`. The simplified equation is
+    We have arrived at the corresponding Fourier Gaussian, :math:`F_\mathrm{g}(u,v) = 
+    F_3(u,v)`. The simplified equation is
 
     .. math::
 
-        F_\mathrm{g}(u,v) = a \sigma_l \sigma_m 2 \pi \exp \left ( -2 \pi^2 \left [\sigma_l^2 u'^2 + \sigma_m^2 v'^2 \right]  - 2 i \pi \left [\delta_l u + \delta_m v \right] \right).
+        F_\mathrm{g}(u,v) = a \sigma_l \sigma_m 2 \pi \exp \left ( 
+        -2 \pi^2 \left [\sigma_l^2 u'^2 + \sigma_m^2 v'^2 \right]  
+        - 2 i \pi \left [\delta_l u + \delta_m v \right] \right).
 
     N.B. that we have mixed primed (:math:`u'`) and unprimed (:math:`u`) coordinates in 
     the same equation for brevity.

From 1a1babf7934ee30ab110e01446145d75b80792e0 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 21:01:45 +0000
Subject: [PATCH 13/26] update package requires to be >= 3.10 python

---
 pyproject.toml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pyproject.toml b/pyproject.toml
index 8a62a9b7..9b7694f4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -23,7 +23,7 @@ description = "Regularized Maximum Likelihood Imaging for Radio Astronomy"
 dynamic = ["version"]
 name = "MPoL"
 readme = "README.md"
-requires-python = ">3.8"
+requires-python = ">=3.10"
 
 [project.optional-dependencies]
 dev = [

From 275ee66f71446a8c0d3e85f7670b67bb5efce299 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 22:05:15 +0000
Subject: [PATCH 14/26] progress on typing utils.

---
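Note: a minimal, standard-library-only sketch of the cell-size logic that this
patch annotates in `get_optimal_image_properties`, included to show how the
new `tuple[float, int]` return type reads in practice. Several things here are
assumptions rather than code from the repository: `ARCSEC` stands in for
`mpol.constants.arcsec`, the function name `optimal_image_properties` is
hypothetical, `u` and `v` are plain sequences of floats instead of numpy
arrays, and the initial `cell_size` assignment (which falls outside the hunk
context) is a guess.

```python
import math
from collections.abc import Sequence

# Assumption: radians per arcsecond, standing in for mpol.constants.arcsec.
ARCSEC = math.pi / (180.0 * 3600.0)


def get_max_spatial_freq(cell_size: float, npix: int) -> float:
    """Spatial frequency [klambda] for cell_size [arcsec] and npix pixels."""
    return (npix / 2 - 1) / (npix * cell_size * ARCSEC) * 1e-3


def get_maximum_cell_size(uu_vv_point: float) -> float:
    """Cell size [arcsec] for a spatial frequency [klambda], as in the hunk."""
    return 1 / ((2 - 1) * uu_vv_point * 1e3) / ARCSEC


def optimal_image_properties(
    image_width: float, u: Sequence[float], v: Sequence[float]
) -> tuple[float, int]:
    """Hypothetical stand-in for get_optimal_image_properties."""
    max_freq = max(max(abs(x) for x in u), max(abs(x) for x in v))

    # Assumption: this starting value is not visible in the hunk context.
    cell_size = get_maximum_cell_size(max_freq)
    npix = math.ceil(image_width / cell_size)

    # account for Nyquist of proposed cell_size, npix (as in the hunk)
    cell_size *= cell_size / get_maximum_cell_size(
        get_max_spatial_freq(cell_size, npix)
    )
    npix = math.ceil(image_width / cell_size)

    # enforce that npix be even
    if npix % 2 == 1:
        npix += 1

    assert get_max_spatial_freq(cell_size, npix) >= max_freq
    return cell_size, npix


# 10 arcsec wide image, baselines in klambda
cell_size, npix = optimal_image_properties(
    10.0, [1000.0, -250.0], [400.0, -900.0]
)
```

The built-in generic `tuple[float, int]` needs Python 3.9 or newer, which the
`requires-python = ">=3.10"` bump earlier in this series already guarantees.
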
 docs/changelog.md |   3 +-
 pyproject.toml    |   1 +
 src/mpol/utils.py | 238 +++++++++++++++++++++++++++++-----------------
 3 files changed, 155 insertions(+), 87 deletions(-)

diff --git a/docs/changelog.md b/docs/changelog.md
index bbfcfef1..17f867f5 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -5,7 +5,8 @@
 ## v0.2.1
 
 - *Placeholder* Planned changes described by Architecture GitHub Project.
-- Implemented type checking for more of the codebase
+- Removed unused routine `mpol.utils.log_stretch`.
+- Added type hints for core modules ([#54](https://github.com/MPoL-dev/MPoL/issues/54)). This should improve stability of core routines and help users when writing code using MPoL in an IDE.
 - Manually line wrapped many docstrings to conform to 88 characters per line or less. Ian thought `black` would do this by default, but actually that [doesn't seem to be the case](https://github.com/psf/black/issues/2865).
 - Fully leaned into the `pyproject.toml` setup to modernize build via [hatch](https://github.com/pypa/hatch). This centralizes the project dependencies and derives package versioning directly from git tags. Intermediate packages built from commits after the latest tag (e.g., `0.2.0`) will have an extra long string, e.g., `0.2.1.dev178+g16cfc3e.d20231223` where the version is a guess at the next version and the hash gives reference to the commit. This means that developers bump versions entirely by tagging a new version with git (or more likely by drafting a new release on the [GitHub release page](https://github.com/MPoL-dev/MPoL/releases)).
 - Removed `setup.py`.
diff --git a/pyproject.toml b/pyproject.toml
index 9b7694f4..af5286e4 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -94,6 +94,7 @@ module = [
     "MPol.constants",
     "MPoL.losses",
     # "MPoL.fourier", # once we remove get_vis_residuals
+    # "MPoL.utils", # once we fix check_baselines
     "MPoL.datasets"
 ]
 disallow_untyped_defs = true
\ No newline at end of file
diff --git a/src/mpol/utils.py b/src/mpol/utils.py
index 95906f93..de5f475a 100644
--- a/src/mpol/utils.py
+++ b/src/mpol/utils.py
@@ -3,82 +3,99 @@
 import numpy as np
 import torch
 
-from .constants import arcsec, c_ms, cc, deg, kB
+from typing import Any
+import numpy.typing as npt
+from mpol.constants import arcsec, c_ms, cc, deg, kB
 
-def torch2npy(tensor):
-    """Make a copy of a PyTorch tensor on the CPU in numpy format, e.g. for plotting"""
-    return tensor.detach().cpu().numpy()
 
+def torch2npy(t: torch.Tensor) -> npt.NDArray:
+    """
+    Copy a tensor (potentially on the GPU) to the CPU and convert to a numpy
+    :class:`np.ndarray`, e.g., for visualization or further analysis with non-PyTorch
+    scientific libraries.
+
+    Parameters
+    ----------
+    t : torch.Tensor
+
+    Returns
+    -------
+    np.ndarray
+
+    """
+    t_cpu: torch.Tensor = t.detach().cpu()
+    t_np: np.ndarray = t_cpu.numpy()
+    return t_np
 
-def ground_cube_to_packed_cube(ground_cube):
+
+def ground_cube_to_packed_cube(ground_cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Converts a Ground Cube to a Packed Visibility Cube for visibility-plane work. 
+    Converts a Ground Cube to a Packed Visibility Cube for visibility-plane work.
     See Units and Conventions for more details.
 
     Args:
-        ground_cube: a previously initialized Ground Cube object (cube (3D torch tensor 
+        ground_cube: a previously initialized Ground Cube object (cube (3D torch tensor
         of shape ``(nchan, npix, npix)``))
 
     Returns:
-        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting 
-            array after applying ``torch.fft.fftshift`` to the input arg; i.e Returns a 
+        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting
+            array after applying ``torch.fft.fftshift`` to the input arg; i.e Returns a
             Packed Visibility Cube.
     """
-    shifted = torch.fft.fftshift(ground_cube, dim=(1, 2))
+    shifted: torch.Tensor = torch.fft.fftshift(ground_cube, dim=(1, 2))
     return shifted
 
 
-def packed_cube_to_ground_cube(packed_cube) -> torch.Tensor:
+def packed_cube_to_ground_cube(packed_cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Converts a Packed Visibility Cube to a Ground Cube for visibility-plane work. See 
+    Converts a Packed Visibility Cube to a Ground Cube for visibility-plane work. See
     Units and Conventions for more details.
 
     Args:
-        packed_cube: a previously initialized Packed Cube object (cube (3D torch tensor 
+        packed_cube: a previously initialized Packed Cube object (cube (3D torch tensor
         of shape ``(nchan, npix, npix)``))
 
     Returns:
-        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting 
-            array after applying ``torch.fft.fftshift`` to the input arg; i.e Returns a 
+        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting
+            array after applying ``torch.fft.fftshift`` to the input arg; i.e Returns a
             Ground Cube.
     """
     # fftshift the image cube to the correct quadrants
-    shifted: torch.Tensor
-    shifted = torch.fft.fftshift(packed_cube, dim=(1, 2))
+    shifted: torch.Tensor = torch.fft.fftshift(packed_cube, dim=(1, 2))
     return shifted
 
 
-def sky_cube_to_packed_cube(sky_cube):
+def sky_cube_to_packed_cube(sky_cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Converts a Sky Cube to a Packed Image Cube for image-plane work. See Units and 
+    Converts a Sky Cube to a Packed Image Cube for image-plane work. See Units and
     Conventions for more details.
 
     Args:
-        sky_cube: a previously initialized Sky Cube object with RA increasing to the 
+        sky_cube: a previously initialized Sky Cube object with RA increasing to the
             *left* (cube (3D torch tensor of shape ``(nchan, npix, npix)``))
 
     Returns:
-        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting 
-            array after applying ``torch.fft.fftshift`` to the ``torch.flip()`` of the 
+        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting
+            array after applying ``torch.fft.fftshift`` to the ``torch.flip()`` of the
             RA axis; i.e Returns a Packed Image Cube.
     """
     flipped = torch.flip(sky_cube, (2,))
-    shifted = torch.fft.fftshift(flipped, dim=(1, 2))
+    shifted: torch.Tensor = torch.fft.fftshift(flipped, dim=(1, 2))
     return shifted
 
 
-def packed_cube_to_sky_cube(packed_cube):
+def packed_cube_to_sky_cube(packed_cube: torch.Tensor) -> torch.Tensor:
     r"""
-    Converts a Packed Image Cube to a Sky Cube for image-plane work. See Units and 
+    Converts a Packed Image Cube to a Sky Cube for image-plane work. See Units and
     Conventions for more details.
 
     Args:
-        packed_cube: a previously initialized Packed Image Cube object (cube (3D torch 
+        packed_cube: a previously initialized Packed Image Cube object (cube (3D torch
         tensor of shape ``(nchan, npix, npix)``))
 
     Returns:
-        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting 
-            array after applying ``torch.fft.fftshift`` to the ``torch.flip()`` of the 
+        torch.double : 3D image cube of shape ``(nchan, npix, npix)``; The resulting
+            array after applying ``torch.fft.fftshift`` to the ``torch.flip()`` of the
             RA axis; i.e Returns a Sky Cube.
     """
     # fftshift the image cube to the correct quadrants
@@ -88,9 +105,9 @@ def packed_cube_to_sky_cube(packed_cube):
     return flipped
 
 
-def get_Jy_arcsec2(T_b, nu=230e9):
+def get_Jy_arcsec2(T_b: float, nu: float = 230e9) -> float:
     r"""
-    Calculate specific intensity from the brightness temperature, using the 
+    Calculate specific intensity from the brightness temperature, using the
     Rayleigh-Jeans definition.
 
     Args:
@@ -113,31 +130,31 @@ def get_Jy_arcsec2(T_b, nu=230e9):
     return Jy_arcsec2
 
 
-def log_stretch(x):
+def loglinspace(
+    start: float, end: float, N_log: int, M_linear: int = 3
+) -> npt.NDArray[np.floating[Any]]:
     r"""
-    Apply a log stretch to the tensor.
-
-    Args:
-        tensor (PyTorch tensor): input tensor :math:`x`
-
-    Returns: :math:`\ln(1 + |x|)`
-    """
-
-    return torch.log(1 + torch.abs(x))
+    Return a logspaced array of bin edges, with the first ``M_linear`` cells being
+    equal width. There is a one-cell overlap between the linear and logarithmic
+    stretches of the array, since the last linear cell is also the first logarithmic
+    cell, which means the total number of cells is ``M_linear + N_log - 1``.
 
+    Parameters
+    ----------
+    start : float
+        starting cell left edge
+    end : float
+        ending cell right edge
+    N_log : int
+        number of logarithmically spaced bins
+    M_linear : int
+        number of linearly (equally) spaced bins
 
-def loglinspace(start, end, N_log, M_linear=3):
-    r"""
-    Return a logspaced array of bin edges, with the first ``M_linear`` cells being 
-    equal width. There is a one-cell overlap between the linear and logarithmic 
-    stretches of the array, since the last linear cell is also the first logarithmic 
-    cell, which means the total number of cells is ``M_linear + N_log - 1``.
+    Returns
+    -------
+    np.ndarray
+        logspaced bin edges
 
-    Args:
-        start (float): starting cell left edge
-        end (float): ending cell right edge
-        N_log (int): number of logarithmically spaced bins
-        M_linear (int): number of linearly (equally) spaced bins
     """
 
     # transition cell left edge
@@ -157,9 +174,9 @@ def loglinspace(start, end, N_log, M_linear=3):
     return np.array(cell_walls)
 
 
-def fftspace(width, N):
-    """Delivers a (nearly) symmetric coordinate array that spans :math:`N` elements 
-    (where :math:`N` is even) from `-width` to `+width`, but ensures that the middle 
+def fftspace(width: float, N: int) -> npt.NDArray[np.floating[Any]]:
+    """Delivers a (nearly) symmetric coordinate array that spans :math:`N` elements
+    (where :math:`N` is even) from `-width` to `+width`, but ensures that the middle
     point lands on :math:`0`. The array indices go from :math:`0` to :math:`N -1.`
 
     Args:
@@ -228,7 +245,7 @@ def convert_baselines(baselines, freq=None, wle=None):
     Returns:
         (1D array nvis): baselines in [klambda]
     Notes:
-        If ``baselines``, ``freq`` or ``wle`` are numpy arrays, their shapes must be 
+        If ``baselines``, ``freq`` or ``wle`` are numpy arrays, their shapes must be
         broadcast-able.
     """
     if (freq is None and wle is None) or (wle and freq):
@@ -271,9 +288,9 @@ def broadcast_and_convert_baselines(u, v, chan_freq):
     return (uu, vv)
 
 
-def get_max_spatial_freq(cell_size, npix):
+def get_max_spatial_freq(cell_size: float, npix: int) -> float:
     r"""
-    Calculate the maximum spatial frequency that the image can represent and still 
+    Calculate the maximum spatial frequency that the image can represent and still
     satisfy the Nyquist Sampling theorem.
 
     Args:
@@ -291,13 +308,13 @@ def get_max_spatial_freq(cell_size, npix):
     return (npix / 2 - 1) / (npix * cell_size * arcsec) * 1e-3  # kilolambda
 
 
-def get_maximum_cell_size(uu_vv_point):
+def get_maximum_cell_size(uu_vv_point: float) -> float:
     r"""
-    Calculate the maximum possible cell_size that will still Nyquist sample the uu or 
+    Calculate the maximum possible cell_size that will still Nyquist sample the uu or
     vv point. Note: not q point.
 
     Args:
-        uu_vv_point (float): a single spatial frequency. Units of 
+        uu_vv_point (float): a single spatial frequency. Units of
             [:math:`\mathrm{k}\lambda`].
 
     Returns:
@@ -307,10 +324,14 @@ def get_maximum_cell_size(uu_vv_point):
     return 1 / ((2 - 1) * uu_vv_point * 1e3) / arcsec
 
 
-def get_optimal_image_properties(image_width, u, v):
+def get_optimal_image_properties(
+    image_width: float,
+    u: npt.NDArray[np.floating[Any]],
+    v: npt.NDArray[np.floating[Any]],
+) -> tuple[float, int]:
     r"""
     For an image of desired width, determine the maximum pixel size that
-    ensures Nyquist sampling of the provided spatial frequency points, and the 
+    ensures Nyquist sampling of the provided spatial frequency points, and the
     corresponding number of pixels to obtain the desired image width.
 
     Parameters
@@ -318,8 +339,8 @@ def get_optimal_image_properties(image_width, u, v):
     image_width : float, unit = arcsec
         Desired width of the image (for a square image of size
         `image_width` :math:`\times` `image_width`).
-    u, v : float or array, unit = :math:`k\lambda`
-        `u` and `v` spatial frequency points. 
+    u, v : np.ndarray of float, unit = :math:`k\lambda`
+        `u` and `v` baselines.
 
     Returns
     -------
@@ -330,8 +351,7 @@ def get_optimal_image_properties(image_width, u, v):
         width (npix will be rounded up and enforced as even).
     Notes
     -----
-    No assumption or correction is made concerning whether the spatial 
-    frequency points are projected or deprojected.
+    Assumes baselines are as-observed.
     """
     max_freq = max(max(abs(u)), max(abs(v)))
 
@@ -341,21 +361,34 @@ def get_optimal_image_properties(image_width, u, v):
     npix = math.ceil(image_width / cell_size)
 
     # account for Nyquist of proposed cell_size, npix
-    cell_size *= cell_size / get_maximum_cell_size(get_max_spatial_freq(cell_size, npix))
+    cell_size *= cell_size / get_maximum_cell_size(
+        get_max_spatial_freq(cell_size, npix)
+    )
 
     npix = math.ceil(image_width / cell_size)
-    
+
     # enforce that npix be even
     if npix % 2 == 1:
         npix += 1
 
-    # should never occur 
-    assert(get_max_spatial_freq(cell_size, npix) >= max_freq), "error in get_optimal_image_properties"
+    # should never occur
+    assert (
+        get_max_spatial_freq(cell_size, npix) >= max_freq
+    ), "error in get_optimal_image_properties"
 
     return cell_size, npix
 
 
-def sky_gaussian_radians(l, m, a, delta_l, delta_m, sigma_l, sigma_m, Omega):
+def sky_gaussian_radians(
+    l: npt.NDArray[np.floating[Any]],
+    m: npt.NDArray[np.floating[Any]],
+    a: float,
+    delta_l: float,
+    delta_m: float,
+    sigma_l: float,
+    sigma_m: float,
+    Omega: float,
+) -> npt.NDArray[np.floating[Any]]:
     r"""
     Calculates a 2D Gaussian on the sky plane with inputs in radians. The Gaussian is 
     centered at ``delta_l, delta_m``, has widths of ``sigma_l, sigma_m``, and is 
@@ -379,7 +412,9 @@ def sky_gaussian_radians(l, m, a, delta_l, delta_m, sigma_l, sigma_m, Omega):
 
     .. math::
 
-        f_\mathrm{g}(l,m) = a \exp \left ( - \frac{1}{2} \left [ \left (\frac{l''}{\sigma_l} \right)^2 + \left( \frac{m''}{\sigma_m} \right )^2 \right ] \right )
+        f_\mathrm{g}(l,m) = a \exp \left ( - \frac{1}{2} \left 
+        [ \left (\frac{l''}{\sigma_l} \right)^2 + \left( \frac{m''}{\sigma_m} 
+        \right )^2 \right ] \right )
 
     Args:
         l: units of [radians]
@@ -403,13 +438,25 @@ def sky_gaussian_radians(l, m, a, delta_l, delta_m, sigma_l, sigma_m, Omega):
     lp = lt * np.cos(Omega * deg) - mt * np.sin(Omega * deg)
     mp = lt * np.sin(Omega * deg) + mt * np.cos(Omega * deg)
 
-    return a * np.exp(-0.5 * ((lp / sigma_l) ** 2 + (mp / sigma_m) ** 2))
-
-
-def sky_gaussian_arcsec(x, y, a, delta_x, delta_y, sigma_x, sigma_y, Omega):
+    gauss: npt.NDArray[np.floating[Any]] = a * np.exp(
+        -0.5 * ((lp / sigma_l) ** 2 + (mp / sigma_m) ** 2)
+    )
+    return gauss
+
+
+def sky_gaussian_arcsec(
+    x: npt.NDArray[np.floating[Any]],
+    y: npt.NDArray[np.floating[Any]],
+    a: float,
+    delta_x: float,
+    delta_y: float,
+    sigma_x: float,
+    sigma_y: float,
+    Omega: float,
+) -> npt.NDArray[np.floating[Any]]:
     r"""
     Calculates a Gaussian on the sky plane using inputs in arcsec. This is a convenience
-    wrapper to :func:`~mpol.utils.sky_gaussian_radians` that automatically converts 
+    wrapper to :func:`~mpol.utils.sky_gaussian_radians` that automatically converts
     from arcsec to radians.
 
     Args:
@@ -438,7 +485,16 @@ def sky_gaussian_arcsec(x, y, a, delta_x, delta_y, sigma_x, sigma_y, Omega):
     )
 
 
-def fourier_gaussian_lambda_radians(u, v, a, delta_l, delta_m, sigma_l, sigma_m, Omega):
+def fourier_gaussian_lambda_radians(
+    u: npt.NDArray[np.floating[Any]],
+    v: npt.NDArray[np.floating[Any]],
+    a: float,
+    delta_l: float,
+    delta_m: float,
+    sigma_l: float,
+    sigma_m: float,
+    Omega: float,
+) -> npt.NDArray[np.floating[Any]]:
     r"""
     Calculate the Fourier plane Gaussian :math:`F_\mathrm{g}(u,v)` corresponding to the 
     Sky plane Gaussian :math:`f_\mathrm{g}(l,m)` in 
@@ -581,7 +637,7 @@ def fourier_gaussian_lambda_radians(u, v, a, delta_l, delta_m, sigma_l, sigma_m,
     vp = u * np.sin(Omega * deg) + v * np.cos(Omega * deg)
 
     # calculate the Fourier Gaussian
-    return (
+    fgauss: npt.NDArray[np.floating[Any]] = (
         a
         * sigma_l
         * sigma_m
@@ -592,15 +648,25 @@ def fourier_gaussian_lambda_radians(u, v, a, delta_l, delta_m, sigma_l, sigma_m,
             - 2.0j * np.pi * (delta_l * u + delta_m * v)
         )
     )
-
-
-def fourier_gaussian_klambda_arcsec(u, v, a, delta_x, delta_y, sigma_x, sigma_y, Omega):
+    return fgauss
+
+
+def fourier_gaussian_klambda_arcsec(
+    u: npt.NDArray[np.floating[Any]],
+    v: npt.NDArray[np.floating[Any]],
+    a: float,
+    delta_x: float,
+    delta_y: float,
+    sigma_x: float,
+    sigma_y: float,
+    Omega: float,
+) -> npt.NDArray[np.floating[Any]]:
     r"""
-    Calculate the Fourier plane Gaussian :math:`F_\mathrm{g}(u,v)` corresponding to the 
-    Sky plane Gaussian :math:`f_\mathrm{g}(l,m)` in 
+    Calculate the Fourier plane Gaussian :math:`F_\mathrm{g}(u,v)` corresponding to the
+    Sky plane Gaussian :math:`f_\mathrm{g}(l,m)` in
     :func:`~mpol.utils.sky_gaussian_arcsec`, using analytical relationships. The Fourier
-    Gaussian is parameterized using the sky plane centroid (``delta_l, delta_m``), 
-    widths (``sigma_l, sigma_m``) and rotation (``Omega``). Assumes that ``a`` was in 
+    Gaussian is parameterized using the sky plane centroid (``delta_l, delta_m``),
+    widths (``sigma_l, sigma_m``) and rotation (``Omega``). Assumes that ``a`` was in
     units of :math:`\mathrm{Jy}/\mathrm{arcsec}^2`.
 
     Args:

From 6be32072b209f7983fcfdaae801df4ade2da9b9b Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 22:11:04 +0000
Subject: [PATCH 15/26] started on gridding types

---
 src/mpol/gridding.py |  3 +--
 test/GPU_SLURM.sh    | 20 --------------------
 2 files changed, 1 insertion(+), 22 deletions(-)
 delete mode 100644 test/GPU_SLURM.sh

diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index 65c55223..e4265378 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -11,8 +11,7 @@
 
 from mpol.coordinates import GridCoords
 from mpol.exceptions import DataError, ThresholdExceededError, WrongDimensionError
-
-from .datasets import GriddedDataset
+from mpol.datasets import GriddedDataset
 
 
 def _check_data_inputs_2d(uu=None, vv=None, weight=None, data_re=None, data_im=None):
diff --git a/test/GPU_SLURM.sh b/test/GPU_SLURM.sh
deleted file mode 100644
index b86092a4..00000000
--- a/test/GPU_SLURM.sh
+++ /dev/null
@@ -1,20 +0,0 @@
-#!/bin/bash 
-
-#SBATCH --nodes=1 
-#SBATCH --ntasks=1 
-#SBATCH --gpus=1
-#SBATCH --mem=64GB 
-#SBATCH --time=1:00:00 
-#SBATCH --account=ipc5094_c_gpu
-#SBATCH --partition=sla-prio
-
-# load appropriate modules
-module purge
-module load ffmpeg/4.3.2
-module load openmpi/4.1.1
-module load anaconda3/2021.05
-
-# load the virtual environment
-# source /storage/home/ipc5094/Documents/scripts/RML_init.sh
-
-pytest
\ No newline at end of file

From d57eafd1d46f2d8a886534476673b74032959153 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 22:31:22 +0000
Subject: [PATCH 16/26] started typing.

---
 src/mpol/gridding.py | 381 ++++++++++++++++++++++---------------------
 1 file changed, 195 insertions(+), 186 deletions(-)

diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index e4265378..d99c33f5 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -6,7 +6,10 @@
 
 import warnings
 
+from typing import Any
+
 import numpy as np
+import numpy.typing as npt
 from fast_histogram import histogram as fast_hist
 
 from mpol.coordinates import GridCoords
@@ -14,14 +17,20 @@
 from mpol.datasets import GriddedDataset
 
 
-def _check_data_inputs_2d(uu=None, vv=None, weight=None, data_re=None, data_im=None):
+def _check_data_inputs_2d(
+    uu: npt.NDArray[np.floating[Any]] | None = None,
+    vv: npt.NDArray[np.floating[Any]] | None = None,
+    weight: npt.NDArray[np.floating[Any]] | None = None,
+    data_re: npt.NDArray[np.floating[Any]] | None = None,
+    data_im: npt.NDArray[np.floating[Any]] | None = None,
+):
     """
-    Check that all data inputs are the same shape, the weights are positive, and the 
+    Check that all data inputs are the same shape, the weights are positive, and the
     data_re and data_im are floats.
 
     Make a reasonable effort to ensure that Hermitian pairs are *not* included.
 
-    If the user supplied 1d vectors of shape ``(nvis,)``, make them all 2d with one 
+    If the user supplied 1d vectors of shape ``(nvis,)``, make them all 2d with one
     channel, ``(1,nvis)``.
 
     """
@@ -60,34 +69,34 @@ def _check_data_inputs_2d(uu=None, vv=None, weight=None, data_re=None, data_im=N
 
 def verify_no_hermitian_pairs(uu, vv, data, test_vis=5, test_channel=0):
     r"""
-    Check that the dataset does not contain Hermitian pairs. Because the sky brightness 
-    :math:`I_\nu` is real, the visibility function :math:`\mathcal{V}` is Hermitian, 
+    Check that the dataset does not contain Hermitian pairs. Because the sky brightness
+    :math:`I_\nu` is real, the visibility function :math:`\mathcal{V}` is Hermitian,
     meaning that
 
     .. math::
 
         \mathcal{V}(u, v) = \mathcal{V}^*(-u, -v).
 
-    Most datasets (e.g., those extracted from CASA) will only record one visibility 
-    measurement per baseline and not include the duplicate Hermitian pair (to save 
-    storage space). This routine attempts to determine if the dataset contains 
-    Hermitian pairs or not by choosing one data point at a time and then searching the 
+    Most datasets (e.g., those extracted from CASA) will only record one visibility
+    measurement per baseline and not include the duplicate Hermitian pair (to save
+    storage space). This routine attempts to determine if the dataset contains
+    Hermitian pairs or not by choosing one data point at a time and then searching the
     dataset to see if its Hermitian pair exists. The routine will declare that a dataset
-    contains Hermitian pairs or not after it has searched ``test_vis`` number of data 
-    points. If 0 Hermitian pairs have been found for all ``test_vis`` points, then 
+    contains Hermitian pairs or not after it has searched ``test_vis`` number of data
+    points. If 0 Hermitian pairs have been found for all ``test_vis`` points, then
     the dataset will be declared to have no Hermitian pairs. If ``test_vis`` Hermitian
-    pairs were found for ``test_vis`` points searched, then the dataset will be 
-    declared to have Hermitian pairs. If more than 0 but fewer than ``test_vis`` 
+    pairs were found for ``test_vis`` points searched, then the dataset will be
+    declared to have Hermitian pairs. If more than 0 but fewer than ``test_vis``
     Hermitian pairs were found for ``test_vis`` points, an error will be raised.
 
-    Gridding objects like :class:`mpol.gridding.DirtyImager` will naturally augment an 
-    input dataset to include the Hermitian pairs, so that images of :math:`I_\nu` 
+    Gridding objects like :class:`mpol.gridding.DirtyImager` will naturally augment an
+    input dataset to include the Hermitian pairs, so that images of :math:`I_\nu`
     produced with the inverse Fourier transform turn out to be real.
 
     Args:
-        uu (numpy array): array of u spatial frequency coordinates. 
+        uu (numpy array): array of u spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        vv (numpy array): array of v spatial frequency coordinates. 
+        vv (numpy array): array of v spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
         data (numpy complex): array of data values
         test_vis (int): the number of data points to search for Hermitian 'matches'
@@ -108,11 +117,11 @@ def verify_no_hermitian_pairs(uu, vv, data, test_vis=5, test_channel=0):
     vv = vv[test_channel]
     data = data[test_channel]
 
-    # if the dataset contains Hermitian pairs, 
+    # if the dataset contains Hermitian pairs,
     # then there will be a large number of visibilities that have matching
     # (uu, vv) and conjugate data values
 
-    # We don't know what order uu or vv might have been augmented in, or sorted 
+    # We don't know what order uu or vv might have been augmented in, or sorted
     # after the fact, so we can't
     # rely on quick differencing operations
 
@@ -160,7 +169,7 @@ def verify_no_hermitian_pairs(uu, vv, data, test_vis=5, test_channel=0):
     # choose a uu, vv point, then see if the opposite value exists in the dataset
     # if it does, then check that its visibility value is the complex conjugate
 
-    # we could have a max threshold, i.e., like at least 5 need to exist to say 
+    # we could have a max threshold, i.e., like at least 5 need to exist to say
     # the dataset has pairs
 
     # Subtract
@@ -173,36 +182,36 @@ class GridderBase:
 
     Subclasses will need to implement a `_grid_visibilities(self,...)` method.
 
-    The GridderBase object uses desired image dimensions (via the ``cell_size`` and 
-    ``npix`` arguments) to define a corresponding Fourier plane grid as a 
-    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in 
+    The GridderBase object uses desired image dimensions (via the ``cell_size`` and
+    ``npix`` arguments) to define a corresponding Fourier plane grid as a
+    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in
     lieu of ``cell_size`` and ``npix``, but all three arguments should never be supplied
-    at once. For more details on the properties of the grid that is created, see the 
+    at once. For more details on the properties of the grid that is created, see the
     :class:`.GridCoords` documentation.
 
-    Subclasses will accept "loose" *ungridded* visibility data and store the arrays to 
-    the object as instance attributes. The input visibility data should be the set of 
-    visibilities over the full :math:`[-u,u]` and :math:`[-v,v]` domain, and should not 
-    contain Hermitian pairs (an error will be raised, if they are encountered).  The 
-    visibilities can be 1d for a single continuum channel, or 2d for an image cube. If 
-    1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``. Like the 
-    :class:`~mpol.images.ImageCube` class, after construction, GridderBase assumes that 
-    you are operating with a multi-channel set of visibilities. These routines will 
-    still work with single-channel 'continuum' visibilities, they will just have 
+    Subclasses will accept "loose" *ungridded* visibility data and store the arrays to
+    the object as instance attributes. The input visibility data should be the set of
+    visibilities over the full :math:`[-u,u]` and :math:`[-v,v]` domain, and should not
+    contain Hermitian pairs (an error will be raised, if they are encountered).  The
+    visibilities can be 1d for a single continuum channel, or 2d for an image cube. If
+    1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``. Like the
+    :class:`~mpol.images.ImageCube` class, after construction, GridderBase assumes that
+    you are operating with a multi-channel set of visibilities. These routines will
+    still work with single-channel 'continuum' visibilities, they will just have
     `nchan = 1` in the first dimension of most products.
 
     Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
-        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates. 
+        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates. 
+        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        weight (2d numpy array): (nchan, nvis) array of thermal weights. 
+        weight (2d numpy array): (nchan, nvis) array of thermal weights.
             Units of [:math:`1/\mathrm{Jy}^2`]
-        data_re (2d numpy array): (nchan, nvis) array of the real part of the 
+        data_re (2d numpy array): (nchan, nvis) array of the real part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
-        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the 
+        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
 
     """
@@ -268,34 +277,34 @@ def _create_cell_indices(self):
 
     def _sum_cell_values_channel(self, uu, vv, values=None):
         r"""
-        Given a list of loose visibility points :math:`(u,v)` and their corresponding 
-        values :math:`x`, partition the points up into 2D :math:`u-v` cells defined by 
-        the ``coords`` object attached to the gridder, such that ``cell[i,j]`` has 
+        Given a list of loose visibility points :math:`(u,v)` and their corresponding
+        values :math:`x`, partition the points up into 2D :math:`u-v` cells defined by
+        the ``coords`` object attached to the gridder, such that ``cell[i,j]`` has
         bounds between ``coords.u_edges[j, j+1]`` and ``coords.v_edges[i, i+1]``.
-        Then, sum the corresponding values for each :math:`(u,v)` point that falls 
+        Then, sum the corresponding values for each :math:`(u,v)` point that falls
         within each cell. The resulting cell value is
 
         .. math::
 
             \mathrm{result}_{i,j} = \sum_k \mathrm{values}_k
 
-        where :math:`k` indexes all :math:`(u,v)` points that fall within 
-        ``coords.u_edges[j, j+1]`` and ``coords.v_edges[i, i+1]``. In the case that all 
-        values are :math:`1`, the result is the number of visibilities within each cell 
+        where :math:`k` indexes all :math:`(u,v)` points that fall within
+        ``coords.u_edges[j, j+1]`` and ``coords.v_edges[i, i+1]``. In the case that all
+        values are :math:`1`, the result is the number of visibilities within each cell
         (i.e., a histogram).
 
         Args:
-            uu (np.array): 1D array of East-West spatial frequency coordinates for a 
+            uu (np.array): 1D array of East-West spatial frequency coordinates for a
                 specific channel. Units of [:math:`\mathrm{k}\lambda`]
-            vv (np.array): 1D array of North-South spatial frequency coordinates for a 
+            vv (np.array): 1D array of North-South spatial frequency coordinates for a
                 specific channel. Units of [:math:`\mathrm{k}\lambda`]
-            values (np.array): 1D array of values (the same length as uu and vv) to use 
-                in the sum over each cell. The default (``values=None``) corresponds to 
-                using ``values=np.ones_like(uu)`` such that the routine is equivalent 
+            values (np.array): 1D array of values (the same length as uu and vv) to use
+                in the sum over each cell. The default (``values=None``) corresponds to
+                using ``values=np.ones_like(uu)`` such that the routine is equivalent
                 to a histogram.
 
         Returns:
-            A 2D array of size ``(npix, npix)`` in ground format containing the summed 
+            A 2D array of size ``(npix, npix)`` in ground format containing the summed
             cell quantities.
         """
 
@@ -315,17 +324,17 @@ def _sum_cell_values_channel(self, uu, vv, values=None):
 
     def _sum_cell_values_cube(self, values=None):
         r"""
-        Perform the :func:`~mpol.gridding.DataAverager.sum_cell_values_channel` routine 
+        Perform the :func:`~mpol.gridding.DataAverager._sum_cell_values_channel` routine
         over all channels of the input visibilities.
 
         Args:
-            values (iterable): ``(nchan, nvis)`` array of values to use in the sum over 
-                each cell. The default (``values=None``) corresponds to using 
-                ``values=np.ones_like(uu)`` such that the routine is equivalent to a 
+            values (iterable): ``(nchan, nvis)`` array of values to use in the sum over
+                each cell. The default (``values=None``) corresponds to using
+                ``values=np.ones_like(uu)`` such that the routine is equivalent to a
                 histogram.
 
         Returns:
-            A 3D array of size ``(nchan, npix, npix)`` in ground format containing the 
+            A 3D array of size ``(nchan, npix, npix)`` in ground format containing the
             summed cell quantities.
 
         """
@@ -346,11 +355,11 @@ def _sum_cell_values_cube(self, values=None):
 
     def _extract_gridded_values_to_loose(self, gridded_quantity):
         r"""
-        Extract the gridded cell quantity corresponding to each of the loose 
+        Extract the gridded cell quantity corresponding to each of the loose
         visibilities.
 
         Args:
-            A 3D array of size ``(nchan, npix, npix)`` in ground format containing the 
+            A 3D array of size ``(nchan, npix, npix)`` in ground format containing the
             gridded cell quantities.
 
         Returns:
@@ -367,19 +376,19 @@ def _extract_gridded_values_to_loose(self, gridded_quantity):
 
     def _estimate_cell_standard_deviation(self):
         r"""
-        Estimate the `standard deviation 
-        <https://en.wikipedia.org/wiki/Standard_deviation>`__ of the real and imaginary 
-        visibility values within each :math:`u,v` cell (:math:`\mathrm{cell}_{i,j}`) 
+        Estimate the `standard deviation
+        <https://en.wikipedia.org/wiki/Standard_deviation>`__ of the real and imaginary
+        visibility values within each :math:`u,v` cell (:math:`\mathrm{cell}_{i,j}`)
         defined by ``self.coords`` using the following steps.
 
-        1. Calculate the mean real :math:`\mu_\Re` and imaginary :math:`\mu_\Im` values 
+        1. Calculate the mean real :math:`\mu_\Re` and imaginary :math:`\mu_\Im` values
         within each cell using a weighted mean, assuming that the visibility function is
         constant across the cell.
-        2. For each visibility :math:`k` that falls within the cell, calculate the real 
-        and imaginary residuals (:math:`r_\Re` and :math:`r_\Im`) in units of 
-        :math:`\sigma_k`, where :math:`\sigma_k = \sqrt{1/w_k}` and :math:`w_k` is the 
+        2. For each visibility :math:`k` that falls within the cell, calculate the real
+        and imaginary residuals (:math:`r_\Re` and :math:`r_\Im`) in units of
+        :math:`\sigma_k`, where :math:`\sigma_k = \sqrt{1/w_k}` and :math:`w_k` is the
         weight of that visibility.
-        3. Calculate the standard deviation :math:`s_{i,j}` of the residual 
+        3. Calculate the standard deviation :math:`s_{i,j}` of the residual
         distributions within each cell
 
         .. math::
@@ -394,14 +403,14 @@ def _estimate_cell_standard_deviation(self):
 
 
         Returns:
-            std_real, std_imag: two 3D arrays of size ``(nchan, npix, npix)`` in ground 
-            format containing the standard deviation of the real and imaginary values 
-            within each cell, in units of :math:`\sigma`. If everything is correctly 
+            std_real, std_imag: two 3D arrays of size ``(nchan, npix, npix)`` in ground
+            format containing the standard deviation of the real and imaginary values
+            within each cell, in units of :math:`\sigma`. If everything is correctly
             calibrated, we expect :math:`s_{i,j} \approx 1 \forall i,j`.
 
         """
 
-        # 1. use the gridding routine to calculate the mean real and imaginary 
+        # 1. use the gridding routine to calculate the mean real and imaginary
         # values on the grid
         self._grid_visibilities()
 
@@ -409,9 +418,9 @@ def _estimate_cell_standard_deviation(self):
         mu_re_gridded = np.fft.fftshift(self.data_re_gridded, axes=(1, 2))
         mu_im_gridded = np.fft.fftshift(self.data_im_gridded, axes=(1, 2))
 
-        # extract the real and imaginary values corresponding to the 
+        # extract the real and imaginary values corresponding to the
         # "loose" visibilities
-        # mu_re_gridded and mu_im_gridded are arrays with 
+        # mu_re_gridded and mu_im_gridded are arrays with
         # shape (nchan, ncell_v, ncell_u)
         # self.index_v, self.index_u are (nchan, nvis)
         # we want mu_re and mu_im to be (nchan, nvis)
@@ -455,20 +464,20 @@ def _estimate_cell_standard_deviation(self):
 
     def _check_scatter_error(self, max_scatter=1.2):
         """
-        Checks/compares visibility scatter to a given threshold value ``max_scatter`` 
-        and raises an AssertionError if the median scatter across all cells exceeds 
+        Compares the visibility scatter to a given threshold value ``max_scatter``
+        and raises an AssertionError if the median scatter across all cells exceeds
         ``max_scatter``.
 
         Args:
-            max_scatter (float): the maximum permissible scatter in units of standard 
+            max_scatter (float): the maximum permissible scatter in units of standard
             deviation.
 
         Returns:
-            a dictionary containing keys ``return_status``, ``median_re``, and 
-            ``median_im``. ``return_status`` is a boolean that is ``False`` if scatter 
-            is within acceptable limits of max_scatter (good), and is ``True`` if 
-            scatter exceeds acceptable limits. ``median_re`` and ``median_im`` are the 
-            median scatter values returned across all cells, in units of standard 
+            a dictionary containing keys ``return_status``, ``median_re``, and
+            ``median_im``. ``return_status`` is a boolean that is ``False`` if scatter
+            is within acceptable limits of max_scatter (good), and is ``True`` if
+            scatter exceeds acceptable limits. ``median_re`` and ``median_im`` are the
+            median scatter values returned across all cells, in units of standard
             deviation (estimated from the provided weights).
 
         """
@@ -494,7 +503,7 @@ def ground_cube(self):
         The visibility FFT cube fftshifted for plotting with ``imshow``.
 
         Returns:
-            (torch.complex tensor, of shape ``(nchan, npix, npix)``): the FFT of the 
+            (torch.complex tensor, of shape ``(nchan, npix, npix)``): the FFT of the
             image cube, in sky plane format.
         """
 
@@ -503,30 +512,30 @@ def ground_cube(self):
 
 class DataAverager(GridderBase):
     r"""
-    The DataAverager object uses desired image dimensions (via the ``cell_size`` and 
-    ``npix`` arguments) to define a corresponding Fourier plane grid as a 
-    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in 
+    The DataAverager object uses desired image dimensions (via the ``cell_size`` and
+    ``npix`` arguments) to define a corresponding Fourier plane grid as a
+    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in
     lieu of ``cell_size`` and ``npix``, but all three arguments should never be supplied
-    at once. For more details on the properties of the grid that is created, see the 
+    at once. For more details on the properties of the grid that is created, see the
     :class:`.GridCoords` documentation.
 
-    The :class:`.DataAverager` object accepts "loose" *ungridded* visibility data and 
-    stores the arrays to the object as instance attributes. The input visibility data 
-    should be the set of visibilities over the full :math:`[-u,u]` and :math:`[-v,v]` 
+    The :class:`.DataAverager` object accepts "loose" *ungridded* visibility data and
+    stores the arrays to the object as instance attributes. The input visibility data
+    should be the set of visibilities over the full :math:`[-u,u]` and :math:`[-v,v]`
     domain, and should not contain Hermitian pairs (an error will be raised, if they are
-    encountered). (Note that, unlike :class:`~mpol.gridding.DirtyImager`, this class 
-    *will not* augment the dataset to include Hermitian pairs. This is by design, 
+    encountered). (Note that, unlike :class:`~mpol.gridding.DirtyImager`, this class
+    *will not* augment the dataset to include Hermitian pairs. This is by design,
     since Hermitian pairs should not be used in likelihood calculations).
 
-    The input visibilities can be 1d for a single continuum channel, or 2d for image 
-    cube. If 1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``. 
+    The input visibilities can be 1d for a single continuum channel, or 2d for an image
+    cube. If 1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``.
     Like the :class:`~mpol.images.ImageCube` class, after construction, the DataAverager
-    assumes that you are operating with a multi-channel set of visibilities. These 
-    routines will still work with single-channel 'continuum' visibilities, they will 
+    assumes that you are operating with a multi-channel set of visibilities. These
+    routines will still work with single-channel 'continuum' visibilities; they will
     just have nchan = 1 in the first dimension of most products.
 
     Once the DataAverager object is initialized with loose visibilities, you can average
-    them and export them for use in Regularized Maximum Likelihood imaging with the 
+    them and export them for use in Regularized Maximum Likelihood imaging with the
     :func:`mpol.gridding.DataAverager.to_pytorch_dataset` routine.
 
     Example::
@@ -544,17 +553,17 @@ class DataAverager(GridderBase):
 
 
     Args:
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
-        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates. 
+        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates. 
+        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        weight (2d numpy array): (nchan, nvis) array of thermal weights. 
+        weight (2d numpy array): (nchan, nvis) array of thermal weights.
             Units of [:math:`1/\mathrm{Jy}^2`]
-        data_re (2d numpy array): (nchan, nvis) array of the real part of the 
+        data_re (2d numpy array): (nchan, nvis) array of the real part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
-        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the 
+        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
 
     """
@@ -598,7 +607,7 @@ def _grid_visibilities(self):
     def _grid_weights(self):
         r"""
         Average the visibility weights to the Fourier grid contained in ``self.coords``,
-        such that the ``self.weight_gridded`` corresponds to the equivalent weight on 
+        such that the ``self.weight_gridded`` corresponds to the equivalent weight on
         the averaged visibilities within that cell.
         """
 
@@ -615,13 +624,13 @@ def to_pytorch_dataset(self, check_visibility_scatter=True, max_scatter=1.2):
         Export gridded visibilities to a PyTorch dataset object.
 
         Args:
-            check_visibility_scatter (bool): whether the routine should check the 
-                standard deviation of visibilities in each within each :math:`u,v` cell 
-                (:math:`\mathrm{cell}_{i,j}`) defined by ``self.coords``. Default 
-                is ``True``. A ``RuntimeError`` will be raised if any cell has a 
+            check_visibility_scatter (bool): whether the routine should check the
+                standard deviation of visibilities within each :math:`u,v` cell
+                (:math:`\mathrm{cell}_{i,j}`) defined by ``self.coords``. Default
+                is ``True``. A ``RuntimeError`` will be raised if any cell has a
                 scatter larger than ``max_scatter``.
-            max_scatter (float): the maximum allowable standard deviation of visibility 
-                values in a given :math:`u,v` cell (:math:`\mathrm{cell}_{i,j}`) 
+            max_scatter (float): the maximum allowable standard deviation of visibility
+                values in a given :math:`u,v` cell (:math:`\mathrm{cell}_{i,j}`)
                 defined by ``self.coords``. Defaults to a factor of 120%.
 
         Returns:
@@ -655,27 +664,27 @@ def to_pytorch_dataset(self, check_visibility_scatter=True, max_scatter=1.2):
 
 class DirtyImager(GridderBase):
     r"""
-    This class is mainly used for producing diagnostic "dirty" images of the 
+    This class is mainly used for producing diagnostic "dirty" images of the
     visibility data.
 
-    The DirtyImager object uses desired image dimensions (via the ``cell_size`` and 
-    ``npix`` arguments) to define a corresponding Fourier plane grid as a 
-    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in 
+    The DirtyImager object uses desired image dimensions (via the ``cell_size`` and
+    ``npix`` arguments) to define a corresponding Fourier plane grid as a
+    :class:`.GridCoords` object. A pre-computed :class:`.GridCoords` can be supplied in
     lieu of ``cell_size`` and ``npix``, but all three arguments should never be supplied
-    at once. For more details on the properties of the grid that is created, see the 
+    at once. For more details on the properties of the grid that is created, see the
     :class:`.GridCoords` documentation.
 
-    The :class:`.DirtyImager` object accepts "loose" *ungridded* visibility data and 
-    stores the arrays to the object as instance attributes. The input visibility data 
-    should be the normal set of visibilities over the full :math:`[-u,u]` and 
-    :math:`[-v,v]` domain; internally the DirtyImager will automatically augment the 
+    The :class:`.DirtyImager` object accepts "loose" *ungridded* visibility data and
+    stores the arrays to the object as instance attributes. The input visibility data
+    should be the normal set of visibilities over the full :math:`[-u,u]` and
+    :math:`[-v,v]` domain; internally the DirtyImager will automatically augment the
     dataset to include the complex conjugates, i.e. the 'Hermitian pairs.'
 
-    The input visibilities can be 1d for a single continuum channel, or 2d for image 
-    cube. If 1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``. 
-    Like the :class:`~mpol.images.ImageCube` class, after construction, the DirtyImager 
-    assumes that you are operating with a multi-channel set of visibilities. These 
-    routines will still work with single-channel 'continuum' visibilities, they will 
+    The input visibilities can be 1d for a single continuum channel, or 2d for an image
+    cube. If 1d, visibilities will be converted to 2d arrays of shape ``(1, nvis)``.
+    Like the :class:`~mpol.images.ImageCube` class, after construction, the DirtyImager
+    assumes that you are operating with a multi-channel set of visibilities. These
+    routines will still work with single-channel 'continuum' visibilities; they will
     just have nchan = 1 in the first dimension of most products.
 
     Example::
@@ -695,17 +704,17 @@ class DirtyImager(GridderBase):
     Args:
         cell_size (float): width of a single square pixel in [arcsec]
         npix (int): number of pixels in the width of the image
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
-        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates. 
+        uu (numpy array): (nchan, nvis) array of u spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates. 
+        vv (numpy array): (nchan, nvis) array of v spatial frequency coordinates.
             Units of [:math:`\mathrm{k}\lambda`]
-        weight (2d numpy array): (nchan, nvis) array of thermal weights. 
+        weight (2d numpy array): (nchan, nvis) array of thermal weights.
             Units of [:math:`1/\mathrm{Jy}^2`]
-        data_re (2d numpy array): (nchan, nvis) array of the real part of the 
+        data_re (2d numpy array): (nchan, nvis) array of the real part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
-        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the 
+        data_im (2d numpy array): (nchan, nvis) array of the imaginary part of the
             visibility measurements. Units of [:math:`\mathrm{Jy}`]
 
     """
@@ -776,17 +785,17 @@ def _grid_visibilities(
         Grid the loose data visibilities to the Fourier grid in preparation for imaging.
 
         Args:
-            weighting (string): The type of cell averaging to perform. Choices of 
-                ``"natural"``, ``"uniform"``, or ``"briggs"``, following CASA tclean. 
+            weighting (string): The type of cell averaging to perform. Choices of
+                ``"natural"``, ``"uniform"``, or ``"briggs"``, following CASA tclean.
                 If ``"briggs"``, also specify a robust value.
-            robust (float): If ``weighting='briggs'``, specify a robust value in the 
-                range [-2, 2]. ``robust=-2`` approximately corresponds to uniform 
-                weighting and ``robust=2`` approximately corresponds to natural 
+            robust (float): If ``weighting='briggs'``, specify a robust value in the
+                range [-2, 2]. ``robust=-2`` approximately corresponds to uniform
+                weighting and ``robust=2`` approximately corresponds to natural
                 weighting.
-            taper_function (function reference): a function assumed to be of the form 
-                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]` 
-                and premultiplies the visibility data. The function must assume that 
-                :math:`u` and :math:`v` will be supplied in units of 
+            taper_function (function reference): a function assumed to be of the form
+                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]`
+                and premultiplies the visibility data. The function must assume that
+                :math:`u` and :math:`v` will be supplied in units of
                 :math:`\mathrm{k}\lambda`. By default no taper is applied.
         """
 
@@ -879,16 +888,16 @@ def _get_dirty_beam(self, C, re_gridded_beam):
 
         Args:
             C (1D np.array): normalization constants for each channel
-            re_gridded_beam (3d np.array): the gridded visibilities corresponding to a 
+            re_gridded_beam (3d np.array): the gridded visibilities corresponding to a
                 unit point source in the center of the field.
 
         Returns:
-            numpy image cube with a dirty beam (PSF) for each channel. By definition, 
+            numpy image cube with a dirty beam (PSF) for each channel. By definition,
             the peak is normalized to 1.0.
         """
         # if we're sticking to the dirty beam and image equations in Briggs' Ph.D. thesis,
         # no correction for du or dv prefactors needed here
-        # that is because we are using the FFT to compute an already discretized 
+        # that is because we are using the FFT to compute an already discretized
         # equation, not approximating a continuous equation.
 
         beam = self._fliplr_cube(
@@ -912,18 +921,18 @@ def _get_dirty_beam(self, C, re_gridded_beam):
         return self.beam
 
     def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
-        r"""Zero out (null) all pixels in the dirty beam exterior to the first null, 
+        r"""Zero out (null) all pixels in the dirty beam exterior to the first null,
         for each channel.
 
         Args:
-            ntheta (int): number of azimuthal wedges to use for the 1st null 
-                calculation. More wedges will result in a more accurate estimate of 
+            ntheta (int): number of azimuthal wedges to use for the 1st null
+                calculation. More wedges will result in a more accurate estimate of
                 dirty beam area, but will also take longer.
-            single_channel_estimate (bool): If ``True`` (the default), use the area 
-                estimated from the first channel for all channels in the multi-channel 
+            single_channel_estimate (bool): If ``True`` (the default), use the area
+                estimated from the first channel for all channels in the multi-channel
                 image cube. If ``False``, calculate the beam area for all channels.
 
-        Returns: a cube like the dirty beam, but with all pixels exterior to the first 
+        Returns: a cube like the dirty beam, but with all pixels exterior to the first
             null set to 0.
         """
 
@@ -934,12 +943,12 @@ def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
 
         # consider the 2D beam for each channel described by polar coordinates r, theta.
         #
-        # this routine works by finding the smallest r for which the beam goes negative 
+        # this routine works by finding the smallest r for which the beam goes negative
         # (the first null) as a function of theta. Then, for this same theta, all pixels
         # (negative or not) with values of r larger than
         # this are set to 0.
 
-        # the end product of this routine will be a "nulled" beam, which can be used in 
+        # the end product of this routine will be a "nulled" beam, which can be used in
         # the calculation of dirty beam area.
 
         # the angular extent for each "slice"
@@ -955,7 +964,7 @@ def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
 
         nulled_beam = self.beam.copy()
         # for each channel,
-        # find the first occurrence of a non-zero value, such that we end up with a 
+        # find the first occurrence of a non-zero value, such that we end up with a
         # continuous ring of masked values.
         for i in range(self.nchan):
             nb = nulled_beam[i]
@@ -986,25 +995,25 @@ def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
 
     def get_dirty_beam_area(self, ntheta=24, single_channel_estimate=True):
         r"""
-        Compute the effective area of the dirty beam for each channel. Assumes that the 
-        beam has already been generated by running 
-        :func:`~mpol.gridding.DirtyImager.get_dirty_image`. This is an approximate 
-        calculation involving a simple sum over all pixels out to the first null 
-        (zero crossing) of the dirty beam. This quantity is designed to approximate the 
-        conversion of image units from :math:`[\mathrm{Jy}\,\mathrm{beam}^{-1}]` to 
-        :math:`[\mathrm{Jy}\,\mathrm{arcsec}^{-2}]`, even though units of 
+        Compute the effective area of the dirty beam for each channel. Assumes that the
+        beam has already been generated by running
+        :func:`~mpol.gridding.DirtyImager.get_dirty_image`. This is an approximate
+        calculation involving a simple sum over all pixels out to the first null
+        (zero crossing) of the dirty beam. This quantity is designed to approximate the
+        conversion of image units from :math:`[\mathrm{Jy}\,\mathrm{beam}^{-1}]` to
+        :math:`[\mathrm{Jy}\,\mathrm{arcsec}^{-2}]`, even though units of
         :math:`[\mathrm{Jy}\,\mathrm{dirty\;beam}^{-1}]` are technically undefined.
 
         Args:
-            ntheta (int): number of azimuthal wedges to use for the 1st null 
-                calculation. More wedges will result in a more accurate estimate of 
+            ntheta (int): number of azimuthal wedges to use for the 1st null
+                calculation. More wedges will result in a more accurate estimate of
                 dirty beam area, but will also take longer.
-            single_channel_estimate (bool): If ``True`` (the default), use the area 
-                estimated from the first channel for all channels in the multi-channel 
+            single_channel_estimate (bool): If ``True`` (the default), use the area
+                estimated from the first channel for all channels in the multi-channel
                 image cube. If ``False``, calculate the beam area for all channels.
 
         Returns:
-            (1D numpy array float) beam area for each channel in units of 
+            (1D numpy array float) beam area for each channel in units of
             :math:`[\mathrm{arcsec}^{2}]`
         """
         nulled = self._null_dirty_beam(
@@ -1026,37 +1035,37 @@ def get_dirty_image(
         Calculate the dirty image.
 
         Args:
-            weighting (string): The type of cell averaging to perform. Choices of 
-                ``"natural"``, ``"uniform"``, or ``"briggs"``, following CASA tclean. 
+            weighting (string): The type of cell averaging to perform. Choices of
+                ``"natural"``, ``"uniform"``, or ``"briggs"``, following CASA tclean.
                 If ``"briggs"``, also specify a robust value.
-            robust (float): If ``weighting='briggs'``, specify a robust value in the 
-                range [-2, 2]. ``robust=-2`` approxmately corresponds to uniform 
-                weighting and ``robust=2`` approximately corresponds to natural 
+            robust (float): If ``weighting='briggs'``, specify a robust value in the
+                range [-2, 2]. ``robust=-2`` approximately corresponds to uniform
+                weighting and ``robust=2`` approximately corresponds to natural
                 weighting.
-            taper_function (function reference): a function assumed to be of the form 
-                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]` 
-                and premultiplies the visibility data. The function must assume that 
-                :math:`u` and :math:`v` will be supplied in units of 
+            taper_function (function reference): a function assumed to be of the form
+                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]`
+                and premultiplies the visibility data. The function must assume that
+                :math:`u` and :math:`v` will be supplied in units of
                 :math:`\mathrm{k}\lambda`. By default no taper is applied.
-            unit (string): what unit should the image be in. 
-                Default is ``"Jy/beam"``. If ``"Jy/arcsec^2"``, then the effective area 
-                of the dirty beam will be used to convert from ``"Jy/beam"`` to 
+            unit (string): the unit of the output image.
+                Default is ``"Jy/beam"``. If ``"Jy/arcsec^2"``, then the effective area
+                of the dirty beam will be used to convert from ``"Jy/beam"`` to
                 ``"Jy/arcsec^2"``.
-            check_visibility_scatter (bool): whether the routine should check the 
-                standard deviation of visibilities in each within each :math:`u,v` cell 
-                (:math:`\mathrm{cell}_{i,j}`) defined by ``self.coords``. Default is 
-                ``True``. A ``RuntimeWarning`` will be raised if any cell has a scatter 
+            check_visibility_scatter (bool): whether the routine should check the
+                standard deviation of visibilities within each :math:`u,v` cell
+                (:math:`\mathrm{cell}_{i,j}`) defined by ``self.coords``. Default is
+                ``True``. A ``RuntimeWarning`` will be raised if any cell has a scatter
                 larger than ``max_scatter``.
-            max_scatter (float): the maximum allowable standard deviation of visibility 
+            max_scatter (float): the maximum allowable standard deviation of visibility
                 values in a given :math:`u,v` cell (:math:`\mathrm{cell}_{i,j}`) defined
                 by ``self.coords``. Defaults to a factor of 120%.
-            **beam_kwargs: all additional keyword arguments passed to 
+            **beam_kwargs: all additional keyword arguments passed to
                 :func:`~mpol.gridding.get_dirty_beam_area` if ``unit="Jy/arcsec^2"``.
 
         Returns:
-            2-tuple of (``image``, ``beam``) where ``image`` is an (nchan, npix, npix) 
-            numpy array of the dirty image cube in units ``unit``. ``beam`` is an numpy 
-            image cube with a dirty beam (PSF) for each channel. The units of the beam 
+            2-tuple of (``image``, ``beam``) where ``image`` is an (nchan, npix, npix)
+            numpy array of the dirty image cube in units ``unit``. ``beam`` is a numpy
+            image cube with a dirty beam (PSF) for each channel. The units of the beam
             are always Jy/{dirty beam}, i.e., the peak of the beam is normalized to 1.0.
         """
 
@@ -1100,9 +1109,9 @@ def get_dirty_image(
         # also pre-stores internal self.beam value for area routine, if necessary
         beam = self._get_dirty_beam(self.C, self.re_gridded_beam)
 
-        # for units of Jy/arcsec^2, we could just leave out the C constant *if* we were 
-        # doing uniform weighting. The relationships get more complex for robust or 
-        # natural weighting, however, so it's safer to calculate the number of 
+        # for units of Jy/arcsec^2, we could just leave out the C constant *if* we were
+        # doing uniform weighting. The relationships get more complex for robust or
+        # natural weighting, however, so it's safer to calculate the number of
         # arcseconds^2 per beam
         if unit == "Jy/arcsec^2":
             beam_area_per_chan = self.get_dirty_beam_area(**beam_kwargs)  # [arcsec^2]

From 00d3ac09c84fbd91d4e3472d1792d74eba668b66 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Tue, 26 Dec 2023 23:20:54 +0000
Subject: [PATCH 17/26] removed unused protocols.

---
 src/mpol/fourier.py   |  1 -
 src/mpol/protocols.py | 18 ------------------
 2 files changed, 19 deletions(-)
 delete mode 100644 src/mpol/protocols.py

diff --git a/src/mpol/fourier.py b/src/mpol/fourier.py
index 5ce6b4cf..ade7e89b 100644
--- a/src/mpol/fourier.py
+++ b/src/mpol/fourier.py
@@ -15,7 +15,6 @@
 
 from mpol.exceptions import DimensionMismatchError
 from mpol.images import ImageCube
-from mpol.protocols import MPoLModel
 
 from mpol import utils
 from mpol.coordinates import GridCoords
diff --git a/src/mpol/protocols.py b/src/mpol/protocols.py
deleted file mode 100644
index 5ca83cee..00000000
--- a/src/mpol/protocols.py
+++ /dev/null
@@ -1,18 +0,0 @@
-from __future__ import annotations
-
-from typing import Protocol
-
-import mpol.coordinates
-import mpol.fourier
-import mpol.images
-
-
-class MPoLModel(Protocol):
-    coords: mpol.coordinates.GridCoords
-    nchan: int
-    bcube: mpol.images.BaseCube
-    icube: mpol.images.ImageCube
-    fcube: mpol.fourier.FourierCube
-
-    def forward(self):
-        ...

From 06f3ba5399df32457437a9c38a855f73ce6eef06 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 12:57:54 +0000
Subject: [PATCH 18/26] added changelog plans for removing
 from_image_properties.

---
 docs/changelog.md  | 11 ++++++++++
 src/mpol/images.py | 53 +++++++++++++++++++++++-----------------------
 2 files changed, 37 insertions(+), 27 deletions(-)
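
Note: the changelog entry in this patch describes replacing
`from_image_properties` with an explicitly constructed `GridCoords` object that
is shared between layers. A minimal, standard-library-only sketch of that
pattern follows. `GridCoordsSketch`, `CubeSketch`, and `img_width` are
hypothetical stand-ins invented for illustration; they are not the MPoL classes
and only mimic the constructor shape discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCoordsSketch:
    """Hypothetical stand-in for mpol.coordinates.GridCoords (not the real class)."""

    cell_size: float  # pixel width [arcsec]
    npix: int  # number of pixels per image side

    @property
    def img_width(self) -> float:
        # illustrative derived attribute; the name is an assumption
        return self.cell_size * self.npix


class CubeSketch:
    """Hypothetical stand-in for a layer that now requires ``coords``."""

    def __init__(self, coords: GridCoordsSketch, nchan: int = 1) -> None:
        self.coords = coords
        self.nchan = nchan


# Build the coordinates once, then pass the same object to every layer,
# rather than repeating cell_size and npix at each call site.
coords = GridCoordsSketch(cell_size=0.005, npix=800)
base = CubeSketch(coords=coords)
image = CubeSketch(coords=coords, nchan=1)
```

Because both layers hold the same `coords` object, quantities derived from the
grid only need to be looked up in one place.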

diff --git a/docs/changelog.md b/docs/changelog.md
index 17f867f5..72cabc6b 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -5,6 +5,17 @@
 ## v0.2.1
 
 - *Placeholder* Planned changes described by Architecture GitHub Project.
+- Removed convenience classmethods `from_image_properties` from across the code base. From [#233](https://github.com/MPoL-dev/MPoL/issues/233). The recommended workflow is to create a :class:`mpol.coordinates.GridCoords` object and pass that to instantiate these objects as needed. For nearly all but trivially short workflows, this simplifies the number of variables the user needs to keep track and pass around revealing the central role of the :class:`mpol.coordinates.GridCoords` object and its useful attributes for image extent, visibility extent, etc. Most importantly, this significantly reduces the size of the codebase and the burden to maintain and document multiple entry points to key `nn.modules`. We removed `from_image_properties` from
+  - :class:`mpol.datasets.GriddedDataset`
+  - :class:`mpol.datasets.Dartboard` 
+  - :class:`mpol.fourier.NuFFT`
+  - :class:`mpol.fourier.NuFFTCached` 
+  - :class:`mpol.fourier.FourierCube` 
+  - :class:`mpol.gridding.GridderBase` 
+  - :class:`mpol.gridding.DataAverager`
+  - :class:`mpol.gridding.DirtyImager`
+  - :class:`mpol.images.BaseCube`
+  - :class:`mpol.images.ImageCube`
 - Removed unused routine `mpol.utils.log_stretch`.
 - Added type hints for core modules ([#54](https://github.com/MPoL-dev/MPoL/issues/54)). This should improve stability of core routines and help users when writing code using MPoL in an IDE.
 - Manually line wrapped many docstrings to conform to 88 characters per line or less. Ian thought `black` would do this by default, but actually that [doesn't seem to be the case](https://github.com/psf/black/issues/2865).
diff --git a/src/mpol/images.py b/src/mpol/images.py
index dcfdbbbd..1b14b319 100644
--- a/src/mpol/images.py
+++ b/src/mpol/images.py
@@ -8,6 +8,8 @@
 import torch.fft  # to avoid conflicts with old torch.fft *function*
 from torch import nn
 
+from typing import Callable
+
 from mpol import utils
 from mpol.coordinates import GridCoords
 
@@ -25,28 +27,31 @@ class BaseCube(nn.Module):
     The ``base_cube`` pixel values are set as PyTorch `parameters
     <https://pytorch.org/docs/stable/generated/torch.nn.parameter.Parameter.html>`_.
 
-    Args:
-        cell_size (float): the width of a pixel [arcseconds]
-        npix (int): the number of pixels per image side
-        coords (GridCoords): an object already instantiated from the GridCoords class.
-            If providing this, cannot provide ``cell_size`` or ``npix``.
-        nchan (int): the number of channels in the base cube. Default = 1.
-        pixel_mapping (torch.nn): a PyTorch function mapping the base pixel
-            representation to the cube representation. If `None`, defaults to
-            `torch.nn.Softplus()`. Output of the function should be in units of
-            [:math:`\mathrm{Jy}\,\mathrm{arcsec}^{-2}`].
-        base_cube (torch.double tensor, optional): a pre-packed base cube to initialize
-            the model with. If None, assumes ``torch.zeros``. See
-            :ref:`cube-orientation-label` for more information on the expectations of
-            the orientation of the input image.
+    Parameters
+    ----------
+    coords : :class:`mpol.coordinates.GridCoords`
+        an object instantiated from the GridCoords class, containing information about
+        the image ``cell_size`` and ``npix``.
+    nchan : int
+        the number of channels in the base cube. Default = 1.
+    pixel_mapping : function
+        a PyTorch function mapping the base pixel
+        representation to the cube representation. If ``None``, defaults to
+        ``torch.nn.Softplus()``. Output of the function should be in units of
+        [:math:`\mathrm{Jy}\,\mathrm{arcsec}^{-2}`].
+    base_cube : torch.double tensor, optional
+        a pre-packed base cube to initialize
+        the model with. If None, assumes ``torch.zeros``. See
+        :ref:`cube-orientation-label` for more information on the expectations of
+        the orientation of the input image.
     """
 
     def __init__(
         self,
-        coords=None,
-        nchan=1,
-        pixel_mapping=None,
-        base_cube=None,
+        coords: GridCoords,
+        nchan: int = 1,
+        pixel_mapping: Callable[[torch.Tensor], torch.Tensor] | None = None,
+        base_cube: torch.Tensor | None = None,
     ):
         super().__init__()
 
@@ -75,17 +80,11 @@ def __init__(
         if pixel_mapping is None:
             self.pixel_mapping = torch.nn.Softplus()
         else:
-            # TODO assert that this is a PyTorch function
+            # TODO assert that this is a PyTorch function (and not a numpy function,
+            # for example)
             self.pixel_mapping = pixel_mapping
 
-    @classmethod
-    def from_image_properties(
-        cls, cell_size, npix, nchan=1, pixel_mapping=None, base_cube=None
-    ) -> BaseCube:
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, nchan, pixel_mapping, base_cube)
-
-    def forward(self):
+    def forward(self) -> torch.Tensor:
         r"""
         Calculate the image representation from the ``base_cube`` using the pixel
         mapping

From 115707419b99d5c9e0d51ae32f45a502fa535d07 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 14:47:21 +0000
Subject: [PATCH 19/26] removed from_image_properties and updated tests and
 docs.

---
 docs/changelog.md                       |   2 +-
 docs/ci-tutorials/fakedata.md           |   4 +-
 docs/ci-tutorials/gridder.md            |  16 +--
 docs/large-tutorials/HD143006_part_1.md |   7 +-
 src/mpol/datasets.py                    |  67 ----------
 src/mpol/fourier.py                     |  66 ----------
 src/mpol/gridding.py                    | 162 ++++++++++++------------
 src/mpol/images.py                      |  89 ++++++-------
 src/mpol/precomposed.py                 |   5 -
 test/gridder_dataset_export_test.py     |   6 +-
 test/gridder_gridding_test.py           |  28 ++--
 test/gridder_imager_test.py             |   6 +-
 test/gridder_init_test.py               |   6 +-
 test/images_test.py                     |  32 +----
 14 files changed, 166 insertions(+), 330 deletions(-)
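
Note: several signatures in `src/mpol/gridding.py` gain type information in
this patch. When reading them, it helps to keep in mind that Python treats
`name=Type` and `name: Type` very differently: the first sets a default value,
the second declares an annotation. The sketch below uses only the standard
library; `with_default` and `with_annotation` are hypothetical function names
chosen for illustration and do not appear in MPoL.

```python
import inspect


def with_default(uu=list[float]):
    # "=" makes the type object the *default value*; no annotation is recorded
    return uu


def with_annotation(uu: list[float]):
    # ":" records an annotation; the argument has no default and is required
    return uu


default_param = inspect.signature(with_default).parameters["uu"]
annotated_param = inspect.signature(with_annotation).parameters["uu"]

# Calling with no arguments silently hands back the type object itself.
silently_accepted = with_default()

# The annotated version refuses to be called without the argument.
try:
    with_annotation()
    missing_arg_rejected = False
except TypeError:
    missing_arg_rejected = True
```

A type checker only sees the second form as a hint, so the first form both
hides missing arguments at runtime and provides no static checking.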

diff --git a/docs/changelog.md b/docs/changelog.md
index 72cabc6b..f4229f85 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -5,7 +5,7 @@
 ## v0.2.1
 
 - *Placeholder* Planned changes described by Architecture GitHub Project.
-- Removed convenience classmethods `from_image_properties` from across the code base. From [#233](https://github.com/MPoL-dev/MPoL/issues/233). The recommended workflow is to create a :class:`mpol.coordinates.GridCoords` object and pass that to instantiate these objects as needed. For nearly all but trivially short workflows, this simplifies the number of variables the user needs to keep track and pass around revealing the central role of the :class:`mpol.coordinates.GridCoords` object and its useful attributes for image extent, visibility extent, etc. Most importantly, this significantly reduces the size of the codebase and the burden to maintain and document multiple entry points to key `nn.modules`. We removed `from_image_properties` from
+- Removed convenience classmethods `from_image_properties` from across the code base. From [#233](https://github.com/MPoL-dev/MPoL/issues/233). The recommended workflow is to create a :class:`mpol.coordinates.GridCoords` object and pass that to instantiate these objects as needed, rather than passing `cell_size` and `npix` separately. For nearly all but trivially short workflows, this reduces the number of variables the user needs to keep track of and pass around, revealing the central role of the :class:`mpol.coordinates.GridCoords` object and its useful attributes for image extent, visibility extent, etc. Most importantly, this significantly reduces the size of the codebase and the burden to maintain, test, and document multiple entry points to key `nn.modules`. We removed `from_image_properties` from
   - :class:`mpol.datasets.GriddedDataset`
   - :class:`mpol.datasets.Dartboard` 
   - :class:`mpol.fourier.NuFFT`
diff --git a/docs/ci-tutorials/fakedata.md b/docs/ci-tutorials/fakedata.md
index d35eb887..26d60ea7 100644
--- a/docs/ci-tutorials/fakedata.md
+++ b/docs/ci-tutorials/fakedata.md
@@ -253,7 +253,9 @@ img_tensor_packed = utils.sky_cube_to_packed_cube(img_tensor)
 
 ```{code-cell} ipython3
 from mpol.images import ImageCube
-image = ImageCube.from_image_properties(cell_size=cell_size, npix=npix, nchan=1, cube=img_tensor_packed)
+from mpol import coordinates
+coords = coordinates.GridCoords(cell_size=cell_size, npix=npix)
+image = ImageCube(coords=coords, nchan=1, cube=img_tensor_packed)
 ```
 
 If you want to double-check that the image was correctly inserted, you can do
diff --git a/docs/ci-tutorials/gridder.md b/docs/ci-tutorials/gridder.md
index e8624d02..b4ba4353 100644
--- a/docs/ci-tutorials/gridder.md
+++ b/docs/ci-tutorials/gridder.md
@@ -155,21 +155,7 @@ imager = gridding.DirtyImager(
 )
 ```
 
-Instantiating the {class}`~mpol.gridding.DirtyImager` object attaches the {class}`~mpol.coordinates.GridCoords`  object and the loose visibilities. There is also a convenience method to create the {class}`~mpol.coordinates.GridCoords` and {class}`~mpol.gridding.DirtyImager` object in one shot by
-
-```{code-cell}
-imager = gridding.DirtyImager.from_image_properties(
-    cell_size=0.005,  # [arcsec]
-    npix=800,
-    uu=uu,
-    vv=vv,
-    weight=weight,
-    data_re=data_re,
-    data_im=data_im,
-)
-```
-
-if you don't want to specify your {class}`~mpol.coordinates.GridCoords` object separately.
+Instantiating the {class}`~mpol.gridding.DirtyImager` object attaches the {class}`~mpol.coordinates.GridCoords` object and the loose visibilities.
 
 As we saw, the raw visibility dataset is a set of complex-valued Fourier samples. Our objective is to make images of the sky-brightness distribution and do astrophysics. We'll cover how to do this with MPoL and RML techniques in later tutorials, but it is possible to get a rough idea of the sky brightness by calculating the inverse Fourier transform of the visibility values.
 
diff --git a/docs/large-tutorials/HD143006_part_1.md b/docs/large-tutorials/HD143006_part_1.md
index debf213e..3924a0fd 100644
--- a/docs/large-tutorials/HD143006_part_1.md
+++ b/docs/large-tutorials/HD143006_part_1.md
@@ -150,11 +150,10 @@ The FITS image was a full 3000x3000 pixels. In general, it is good practice to s
 Since the DSHARP team has already checked there are no bright sub-mm sources in the FOV, we can save time and just make a smaller image corresponding to the protoplanetary emission. If `cell_size` is 0.003 arcseconds, `npix=512` pixels should be sufficient to make an image approximately 1.5 arcseconds on a side. Now, let's import the relevant MPoL routines and instantiate the Gridder.
 
 ```{code-cell}
-from mpol import gridding
+from mpol import coordinates, gridding
 
-imager = gridding.DirtyImager.from_image_properties(
-    cell_size=cell_size,
-    npix=512,
+imager = gridding.DirtyImager(
+    coords=coordinates.GridCoords(cell_size=cell_size, npix=512),
     uu=uu,
     vv=vv,
     weight=weight,
diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index ec0bb623..29c90afb 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -73,46 +73,6 @@ def __init__(
         self.vis_indexed: torch.Tensor
         self.weight_indexed: torch.Tensor
 
-    @classmethod
-    def from_image_properties(
-        cls,
-        cell_size: float,
-        npix: int,
-        *,
-        vis_gridded: torch.Tensor,
-        weight_gridded: torch.Tensor,
-        mask: torch.Tensor,
-        nchan: int = 1,
-    ) -> GriddedDataset:
-        """
-            Alternative method to instantiate a GriddedDataset object from cell_size
-            and npix.
-
-            Parameters
-            ----------
-            cell_size : float 
-                the width of a pixel [arcseconds]
-            npix : int
-                the number of pixels per image side
-            vis_gridded : :class:`torch.Tensor` of :class:`torch.complex128`
-                the gridded visibility data stored in a "packed" format (pre-shifted for fft)
-            weight_gridded : :class:`torch.Tensor` of :class:`torch.double`
-                the weights corresponding to the gridded visibility data,
-                also in a packed format
-            mask : :class:`torch.Tensor` of :class:`torch.bool`
-                a boolean mask to index the non-zero locations of ``vis_gridded`` and
-                ``weight_gridded`` in their packed format.
-            nchan : int
-                the number of channels in the image (default = 1).
-        """
-        return cls(
-            coords=GridCoords(cell_size, npix),
-            vis_gridded=vis_gridded,
-            weight_gridded=weight_gridded,
-            mask=mask,
-            nchan=nchan,
-        )
-
     def add_mask(
         self,
         mask: ArrayLike,
@@ -247,33 +207,6 @@ def cartesian_phis(self) -> NDArray[floating[Any]]:
     def q_max(self) -> float:
         return self.coords.q_max
 
-    @classmethod
-    def from_image_properties(
-        cls,
-        cell_size: float,
-        npix: int,
-        q_edges: NDArray[floating[Any]] | None = None,
-        phi_edges: NDArray[floating[Any]] | None = None,
-    ) -> Dartboard:
-        """Alternative method to instantiate a Dartboard object from cell_size
-        and npix.
-
-        Args:
-            cell_size (float): the width of a pixel [arcseconds]
-            npix (int): the number of pixels per image side
-            q_edges (1D numpy array): an array of radial bin edges to set the
-                dartboard cells in :math:`[\mathrm{k}\lambda]`. If ``None``, defaults
-                to 12 log-linearly radial bins stretching from 0 to the
-                :math:`q_\mathrm{max}` represented by ``coords``.
-            phi_edges (1D numpy array): an array of azimuthal bin edges to set the
-                dartboard cells in [radians], over the domain :math:`[0, \pi]`, which
-                is also implicitly mapped to the domain :math:`[-\pi, \pi]` to preserve
-                the Hermitian nature of the visibilities. If ``None``, defaults to 8
-                equal-spaced azimuthal bins stretched from :math:`0` to :math:`\pi`.
-        """
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, q_edges, phi_edges)
-
     def get_polar_histogram(
         self, qs: NDArray[floating[Any]], phis: NDArray[floating[Any]]
     ) -> NDArray[floating[Any]]:
diff --git a/src/mpol/fourier.py b/src/mpol/fourier.py
index ade7e89b..706d3c67 100644
--- a/src/mpol/fourier.py
+++ b/src/mpol/fourier.py
@@ -49,33 +49,6 @@ def __init__(self, coords: GridCoords, persistent_vis: bool = False):
         self.register_buffer("vis", None, persistent=persistent_vis)
         self.vis: torch.Tensor
 
-    @classmethod
-    def from_image_properties(
-        cls, cell_size: float, npix: int, persistent_vis: bool = False
-    ) -> FourierCube:
-        """
-        Alternative method for instantiating a FourierCube from ``cell_size`` and
-        ``npix``
-
-        Parameters
-        ----------
-        cell_size : float
-            the width of an image-plane pixel [arcseconds]
-        npix : int)
-            the number of pixels per image side
-        persistent_vis : bool
-            should the visibility cube be stored as part of
-            the modules `state_dict`? If `True`, the state of the UV grid will be
-            stored. It is recommended to use `False` for most applications, since
-            the visibility cube will rarely be a direct parameter of the model.
-
-        Returns
-        -------
-        :class:`mpol.fourier.FourierCube`
-        """
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, persistent_vis)
-
     def forward(self, cube: torch.Tensor) -> torch.Tensor:
         """
         Perform the FFT of the image cube on each channel.
@@ -314,32 +287,6 @@ def __init__(
             im_size=(self.coords.npix, self.coords.npix)
         )
 
-    @classmethod
-    def from_image_properties(
-        cls,
-        cell_size: float,
-        npix: int,
-        nchan: int = 1,
-    ):
-        """
-        Instantiate a :class:`mpol.fourier.NuFFT` object from image properties rather
-        than a :meth:`mpol.coordinates.GridCoords` instance.
-
-        Args:
-            cell_size (float): the width of an image-plane pixel [arcseconds]
-            npix (int): the number of pixels per image side
-            nchan (int): the number of channels in the :class:`mpol.images.ImageCube`.
-                Default = 1.
-
-        Returns:
-            an instance of the :class:`mpol.fourier.NuFFT`
-        """
-        coords = GridCoords(cell_size, npix)
-        return cls(
-            coords,
-            nchan,
-        )
-
     def _klambda_to_radpix(self, klambda: torch.Tensor) -> torch.Tensor:
         """Convert a spatial frequency in units of klambda to 'radians/sky pixel,'
         using the pixel cell_size provided by ``self.coords.dl``.
@@ -752,19 +699,6 @@ def __init__(
             self.real_interp_mat: torch.Tensor
             self.imag_interp_mat: torch.Tensor
 
-    @classmethod
-    def from_image_properties(
-        cls,
-        cell_size,
-        npix,
-        uu,
-        vv,
-        nchan=1,
-        sparse_matrices=True,
-    ):
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, uu, vv, nchan, sparse_matrices)
-
     def forward(self, cube):
         r"""
         Perform the FFT of the image cube for each channel and interpolate to the
diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index d99c33f5..ea067788 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -6,7 +6,7 @@
 
 import warnings
 
-from typing import Any
+from typing import Any, Sequence
 
 import numpy as np
 import numpy.typing as npt
@@ -18,12 +18,12 @@
 
 
 def _check_data_inputs_2d(
-    uu=npt.NDArray[np.floating[Any]] | None,
-    vv=npt.NDArray[np.floating[Any]] | None,
-    weight=npt.NDArray[np.floating[Any]] | None,
-    data_re=npt.NDArray[np.floating[Any]] | None,
-    data_im=npt.NDArray[np.floating[Any]] | None,
-):
+    uu: npt.NDArray[np.floating[Any]],
+    vv: npt.NDArray[np.floating[Any]],
+    weight: npt.NDArray[np.floating[Any]],
+    data_re: npt.NDArray[np.floating[Any]],
+    data_im: npt.NDArray[np.floating[Any]],
+) -> tuple[np.ndarray, ...]:
     """
     Check that all data inputs are the same shape, the weights are positive, and the
     data_re and data_im are floats.
@@ -67,7 +67,13 @@ def _check_data_inputs_2d(
     return uu, vv, weight, data_re, data_im
 
 
-def verify_no_hermitian_pairs(uu, vv, data, test_vis=5, test_channel=0):
+def verify_no_hermitian_pairs(
+    uu: npt.NDArray[np.floating[Any]],
+    vv: npt.NDArray[np.floating[Any]],
+    data: npt.NDArray[np.complexfloating[Any, Any]],
+    test_vis: int = 5,
+    test_channel: int = 0,
+) -> bool:
     r"""
     Check that the dataset does not contain Hermitian pairs. Because the sky brightness
     :math:`I_\nu` is real, the visibility function :math:`\mathcal{V}` is Hermitian,
@@ -153,27 +159,20 @@ def verify_no_hermitian_pairs(uu, vv, data, test_vis=5, test_channel=0):
             num_pairs += 1
 
     if num_pairs == 0:
-        return
+        return True
 
     if num_pairs == test_vis:
         raise DataError(
             "Hermitian pairs were found in the data. Please provide data without"
             " Hermitian pairs."
         )
+        return False
     else:
         raise DataError(
             f"{num_pairs} Hermitian pairs were found out of {test_vis} visibilities"
             " tested, dataset is inconsistent."
         )
-
-    # choose a uu, vv point, then see if the opposite value exists in the dataset
-    # if it does, then check that its visibility value is the complex conjugate
-
-    # we could have a max threshold, i.e., like at least 5 need to exist to say
-    # the dataset has pairs
-
-    # Subtract
-    return False
+        return False
 
 
 class GridderBase:
@@ -218,13 +217,13 @@ class GridderBase:
 
     def __init__(
         self,
-        coords=None,
-        uu=None,
-        vv=None,
-        weight=None,
-        data_re=None,
-        data_im=None,
-    ):
+        coords: GridCoords,
+        uu: npt.NDArray[np.floating[Any]],
+        vv: npt.NDArray[np.floating[Any]],
+        weight: npt.NDArray[np.floating[Any]],
+        data_re: npt.NDArray[np.floating[Any]],
+        data_im: npt.NDArray[np.floating[Any]],
+    ) -> None:
         # check everything should be 2d, expand if not
         # also checks data does not contain Hermitian pairs
         uu, vv, weight, data_re, data_im = _check_data_inputs_2d(
@@ -249,21 +248,7 @@ def __init__(
         # and register cell indices against data
         self._create_cell_indices()
 
-    @classmethod
-    def from_image_properties(
-        cls,
-        cell_size,
-        npix,
-        uu=None,
-        vv=None,
-        weight=None,
-        data_re=None,
-        data_im=None,
-    ) -> GridderBase:
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, uu, vv, weight, data_re, data_im)
-
-    def _create_cell_indices(self):
+    def _create_cell_indices(self) -> None:
         # figure out which visibility cell each datapoint lands in, so that
         # we can later assign it the appropriate robust weight for that cell
         # do this by calculating the nearest cell index [0, N] for all samples
@@ -275,7 +260,12 @@ def _create_cell_indices(self):
             [np.digitize(v_chan, self.coords.v_edges) - 1 for v_chan in self.vv]
         )
 
-    def _sum_cell_values_channel(self, uu, vv, values=None):
+    def _sum_cell_values_channel(
+        self,
+        uu: npt.NDArray[np.floating[Any]],
+        vv: npt.NDArray[np.floating[Any]],
+        values: npt.NDArray[np.floating[Any]] | None = None,
+    ) -> npt.NDArray[np.floating[Any]]:
         r"""
         Given a list of loose visibility points :math:`(u,v)` and their corresponding
         values :math:`x`, partition the points up into 2D :math:`u-v` cells defined by
@@ -308,7 +298,7 @@ def _sum_cell_values_channel(self, uu, vv, values=None):
             cell quantities.
         """
 
-        result = fast_hist.histogram2d(
+        result: npt.NDArray[np.floating[Any]] = fast_hist.histogram2d(
             vv,
             uu,
             bins=self.coords.ncell_u,
@@ -322,7 +312,9 @@ def _sum_cell_values_channel(self, uu, vv, values=None):
         # only return the "H" value
         return result
 
-    def _sum_cell_values_cube(self, values=None):
+    def _sum_cell_values_cube(
+        self, values: npt.NDArray[np.floating[Any]] | Sequence[None] | None = None
+    ) -> npt.NDArray[np.floating[Any]]:
         r"""
         Perform the :func:`~mpol.gridding.DataAverager.sum_cell_values_channel` routine
         over all channels of the input visibilities.
@@ -353,7 +345,9 @@ def _sum_cell_values_cube(self, values=None):
 
         return cube
 
-    def _extract_gridded_values_to_loose(self, gridded_quantity):
+    def _extract_gridded_values_to_loose(
+        self, gridded_quantity: npt.NDArray[np.floating[Any]]
+    ) -> npt.NDArray[np.floating[Any]]:
         r"""
         Extract the gridded cell quantity corresponding to each of the loose
         visibilities.
@@ -374,7 +368,45 @@ def _extract_gridded_values_to_loose(self, gridded_quantity):
             ]
         )
 
-    def _estimate_cell_standard_deviation(self):
+    def _grid_visibilities(self) -> None:
+        r"""
+        Average the loose data visibilities to the Fourier grid.
+        """
+
+        # create the cells as edges around the existing points
+        # note that at this stage, the UV grid is strictly increasing
+        # when in fact, later on, we'll need to fftshift for the FFT
+        cell_weight = self._sum_cell_values_cube(self.weight)
+
+        # boolean index for cells that *contain* visibilities
+        mask = cell_weight > 0.0
+
+        # calculate the density weights under "uniform"
+        # the density weights have the same shape as the re, im samples.
+        # cell_weight is (nchan, ncell_v, ncell_u)
+        # self.index_v, self.index_u are (nchan, nvis)
+        # we want density_weights to be (nchan, nvis)
+        density_weight = 1 / self._extract_gridded_values_to_loose(cell_weight)
+
+        # grid the reals and imaginaries separately
+        # outputs from _sum_cell_values_cube are *not* pre-packed
+        data_re_gridded = self._sum_cell_values_cube(
+            self.data_re * density_weight * self.weight
+        )
+
+        data_im_gridded = self._sum_cell_values_cube(
+            self.data_im * density_weight * self.weight
+        )
+
+        # store the pre-packed FFT products for access by outside routines
+        self.mask = np.fft.fftshift(mask, axes=(1, 2))
+        self.data_re_gridded = np.fft.fftshift(data_re_gridded, axes=(1, 2))
+        self.data_im_gridded = np.fft.fftshift(data_im_gridded, axes=(1, 2))
+        self.vis_gridded = self.data_re_gridded + self.data_im_gridded * 1.0j
+
+    def _estimate_cell_standard_deviation(
+        self,
+    ) -> tuple[npt.NDArray[np.floating[Any]], npt.NDArray[np.floating[Any]]]:
         r"""
         Estimate the `standard deviation
         <https://en.wikipedia.org/wiki/Standard_deviation>`__ of the real and imaginary
@@ -462,7 +494,9 @@ def _estimate_cell_standard_deviation(self):
 
         return s_re, s_im
 
-    def _check_scatter_error(self, max_scatter=1.2):
+    def _check_scatter_error(
+        self, max_scatter: float = 1.2
+    ) -> dict[str, bool | np.floating[Any]]:
         """
         Checks/compares visibility scatter to a given threshold value ``max_scatter``
         and raises an AssertionError if the median scatter across all cells exceeds
@@ -568,42 +602,6 @@ class DataAverager(GridderBase):
 
     """
 
-    def _grid_visibilities(self):
-        r"""
-        Average the loose data visibilities to the Fourier grid.
-        """
-
-        # create the cells as edges around the existing points
-        # note that at this stage, the UV grid is strictly increasing
-        # when in fact, later on, we'll need to fftshift for the FFT
-        cell_weight = self._sum_cell_values_cube(self.weight)
-
-        # boolean index for cells that *contain* visibilities
-        mask = cell_weight > 0.0
-
-        # calculate the density weights under "uniform"
-        # the density weights have the same shape as the re, im samples.
-        # cell_weight is (nchan, ncell_v, ncell_u)
-        # self.index_v, self.index_u are (nchan, nvis)
-        # we want density_weights to be (nchan, nvis)
-        density_weight = 1 / self._extract_gridded_values_to_loose(cell_weight)
-
-        # grid the reals and imaginaries separately
-        # outputs from _sum_cell_values_cube are *not* pre-packed
-        data_re_gridded = self._sum_cell_values_cube(
-            self.data_re * density_weight * self.weight
-        )
-
-        data_im_gridded = self._sum_cell_values_cube(
-            self.data_im * density_weight * self.weight
-        )
-
-        # store the pre-packed FFT products for access by outside routines
-        self.mask = np.fft.fftshift(mask, axes=(1, 2))
-        self.data_re_gridded = np.fft.fftshift(data_re_gridded, axes=(1, 2))
-        self.data_im_gridded = np.fft.fftshift(data_im_gridded, axes=(1, 2))
-        self.vis_gridded = self.data_re_gridded + self.data_im_gridded * 1.0j
-
     def _grid_weights(self):
         r"""
         Average the visibility weights to the Fourier grid contained in ``self.coords``,
diff --git a/src/mpol/images.py b/src/mpol/images.py
index 1b14b319..18133aa6 100644
--- a/src/mpol/images.py
+++ b/src/mpol/images.py
@@ -8,7 +8,7 @@
 import torch.fft  # to avoid conflicts with old torch.fft *function*
 from torch import nn
 
-from typing import Callable
+from typing import Any, Callable
 
 from mpol import utils
 from mpol.coordinates import GridCoords
@@ -52,7 +52,7 @@ def __init__(
         nchan: int = 1,
         pixel_mapping: Callable[[torch.Tensor], torch.Tensor] | None = None,
         base_cube: torch.Tensor | None = None,
-    ):
+    ) -> None:
         super().__init__()
 
         self.coords = coords
@@ -78,10 +78,10 @@ def __init__(
             self.base_cube = nn.Parameter(base_cube, requires_grad=True)
 
         if pixel_mapping is None:
-            self.pixel_mapping = torch.nn.Softplus()
+            self.pixel_mapping: Callable[
+                [torch.Tensor], torch.Tensor
+            ] = torch.nn.Softplus()
         else:
-            # TODO assert that this is a PyTorch function (and not a numpy function,
-            # for example)
             self.pixel_mapping = pixel_mapping
 
     def forward(self) -> torch.Tensor:
@@ -127,7 +127,7 @@ class HannConvCube(nn.Module):
         requires_grad (bool): keep kernel fixed
     """
 
-    def __init__(self, nchan, requires_grad=False):
+    def __init__(self, nchan: int, requires_grad: bool = False) -> None:
         super().__init__()
         # simple convolutional filter operates on per-channel basis
         # 3x3 Hann filter
@@ -159,7 +159,7 @@ def __init__(self, nchan, requires_grad=False):
             torch.zeros(nchan, dtype=torch.double), requires_grad=requires_grad
         )
 
-    def forward(self, cube):
+    def forward(self, cube: torch.Tensor) -> torch.Tensor:
         r"""Args:
             cube (torch.double tensor, of shape ``(nchan, npix, npix)``): a prepacked
             image cube, for example, from ImageCube.forward()
@@ -201,31 +201,33 @@ class ImageCube(nn.Module):
     since no transformations are applied to the ``cube`` tensor. The main purpose of
     the ImageCube layer is to provide useful functionality around the ``cube`` tensor,
     such as returning a sky_cube representation and providing FITS writing
-    functionility. In the case of ``passthrough==False``, the ImageCube layer also acts
+    functionality. In the case of ``passthrough==False``, the ImageCube layer also acts
     as a container for the trainable parameters.
 
-    Args:
-        cell_size (float): the width of a pixel [arcseconds]
-        npix (int): the number of pixels per image side
-        coords (GridCoords): an object already instantiated from the GridCoords class.
-            If providing this, cannot provide ``cell_size`` or ``npix``.
-        nchan (int): the number of channels in the image
-        passthrough (bool): if passthrough, assume ImageCube is just a layer as opposed
-            to parameter base.
-        cube (torch.double tensor, of shape ``(nchan, npix, npix)``): (optional) a
-            prepacked image cube to initialize the model with in units of
-            [:math:`\mathrm{Jy}\,\mathrm{arcsec}^{-2}`]. If None, assumes starting
-            ``cube`` is ``torch.zeros``. See :ref:`cube-orientation-label` for more
-            information on the expectations of the orientation of the input image.
+    Parameters
+    ----------
+    coords : :class:`mpol.coordinates.GridCoords`
+        an object instantiated from the GridCoords class, containing information about
+        the image `cell_size` and `npix`.
+    nchan : int
+        the number of channels in the base cube. Default = 1.
+    passthrough : bool
+        if `True`, assume ImageCube is just a layer as opposed
+        to parameter base.
+    cube : :class:torch.Tensor of :class:torch.double, of shape ``(nchan, npix, npix)``
+        a prepacked image cube to initialize the model with in units of
+        [:math:`\mathrm{Jy}\,\mathrm{arcsec}^{-2}`]. If None, assumes starting
+        ``cube`` is ``torch.zeros``. See :ref:`cube-orientation-label` for more
+        information on the expectations of the orientation of the input image.
     """
 
     def __init__(
         self,
-        coords=None,
-        nchan=1,
-        passthrough=False,
-        cube=None,
-    ):
+        coords: GridCoords,
+        nchan: int = 1,
+        passthrough: bool = False,
+        cube: torch.Tensor | None = None,
+    ) -> None:
         super().__init__()
 
         self.coords = coords
@@ -235,7 +237,7 @@ def __init__(
 
         if not self.passthrough:
             if cube is None:
-                self.cube = nn.Parameter(
+                self.cube : torch.nn.Parameter = nn.Parameter(
                     torch.full(
                         (self.nchan, self.coords.npix, self.coords.npix),
                         fill_value=0.0,
@@ -257,14 +259,7 @@ def __init__(
             # an initialization argument
             self.cube = None
 
-    @classmethod
-    def from_image_properties(
-        cls, cell_size, npix, nchan=1, passthrough=False, cube=None
-    ) -> ImageCube:
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, nchan, passthrough, cube)
-
-    def forward(self, cube=None):
+    def forward(self, cube: torch.Tensor | None = None) -> torch.Tensor:
         r"""
         If the ImageCube object was initialized with ``passthrough=True``, the ``cube``
         argument is required. ``forward`` essentially just passes this on as an identity
@@ -294,7 +289,7 @@ def forward(self, cube=None):
         return self.cube
 
     @property
-    def sky_cube(self):
+    def sky_cube(self) -> torch.Tensor:
         """
         The image cube arranged as it would appear on the sky.
 
@@ -305,7 +300,7 @@ def sky_cube(self):
         return utils.packed_cube_to_sky_cube(self.cube)
 
     @property
-    def flux(self):
+    def flux(self) -> torch.Tensor:
         """
         The spatially-integrated flux of the image. Returns a 'spectrum' with the flux
         in each channel in units of Jy.
@@ -318,7 +313,12 @@ def flux(self):
         # multiply by arcsec^2/pixel
         return self.coords.cell_size**2 * torch.sum(self.cube, dim=(1, 2))
 
-    def to_FITS(self, fname="cube.fits", overwrite=False, header_kwargs=None):
+    def to_FITS(
+        self,
+        fname: str = "cube.fits",
+        overwrite: bool = False,
+        header_kwargs: dict | None = None,
+    ) -> None:
         """
         Export the image cube to a FITS file.
 
@@ -330,15 +330,10 @@ def to_FITS(self, fname="cube.fits", overwrite=False, header_kwargs=None):
         Returns:
             None
         """
-
-        try:
-            from astropy import wcs
-            from astropy.io import fits
-        except ImportError:
-            print(
-                "Please install the astropy package to use FITS export functionality."
-            )
-
+
+        from astropy import wcs
+        from astropy.io import fits
+
         w = wcs.WCS(naxis=2)
 
         w.wcs.crpix = np.array([1, 1])
diff --git a/src/mpol/precomposed.py b/src/mpol/precomposed.py
index c7e3b539..35c2f6f6 100644
--- a/src/mpol/precomposed.py
+++ b/src/mpol/precomposed.py
@@ -57,11 +57,6 @@ def __init__(
         )
         self.fcube = fourier.FourierCube(coords=self.coords)
 
-    @classmethod
-    def from_image_properties(cls, cell_size, npix, nchan, base_cube):
-        coords = GridCoords(cell_size, npix)
-        return cls(coords, nchan, base_cube)
-
     def forward(self):
         r"""
         Feed forward to calculate the model visibilities. In this step, a 
diff --git a/test/gridder_dataset_export_test.py b/test/gridder_dataset_export_test.py
index 6769b662..d7884d1d 100644
--- a/test/gridder_dataset_export_test.py
+++ b/test/gridder_dataset_export_test.py
@@ -10,9 +10,9 @@
 def averager(mock_visibility_data):
     uu, vv, weight, data_re, data_im = mock_visibility_data
 
-    return gridding.DataAverager.from_image_properties(
-        cell_size=0.005,
-        npix=800,
+    coords = coordinates.GridCoords(cell_size=0.005, npix=800)
+    return gridding.DataAverager(
+        coords=coords,
         uu=uu,
         vv=vv,
         weight=weight,
diff --git a/test/gridder_gridding_test.py b/test/gridder_gridding_test.py
index aa6db553..ce59e009 100644
--- a/test/gridder_gridding_test.py
+++ b/test/gridder_gridding_test.py
@@ -14,9 +14,13 @@ def test_average_cont(mock_visibility_data_cont):
     """
     uu, vv, weight, data_re, data_im = mock_visibility_data_cont
 
-    averager = gridding.DataAverager.from_image_properties(
+    coords = coordinates.GridCoords(
         cell_size=0.005,
         npix=800,
+    )
+
+    averager = gridding.DataAverager(
+        coords=coords,
         uu=uu,
         vv=vv,
         weight=weight,
@@ -57,7 +61,10 @@ def test_uniform_ones(mock_visibility_data, tmp_path):
     averager._grid_visibilities()
 
     im = plt.imshow(
-        averager.ground_cube[4].real, origin="lower", extent=averager.coords.vis_ext, interpolation="none"
+        averager.ground_cube[4].real,
+        origin="lower",
+        extent=averager.coords.vis_ext,
+        interpolation="none",
     )
     plt.colorbar(im)
     plt.savefig(tmp_path / "gridded_re.png", dpi=300)
@@ -65,20 +72,23 @@ def test_uniform_ones(mock_visibility_data, tmp_path):
     plt.figure()
 
     im2 = plt.imshow(
-        averager.ground_cube[4].imag, origin="lower", extent=averager.coords.vis_ext, interpolation="none"
+        averager.ground_cube[4].imag,
+        origin="lower",
+        extent=averager.coords.vis_ext,
+        interpolation="none",
     )
     plt.colorbar(im2)
     plt.savefig(tmp_path / "gridded_im.png", dpi=300)
 
     plt.close("all")
 
-    # if the gridding worked, 
+    # if the gridding worked,
     # cells with no data should be 0
     assert averager.data_re_gridded[~averager.mask] == pytest.approx(0)
-    
+
     # and cells with data should have real values approximately 1
     assert averager.data_re_gridded[averager.mask] == pytest.approx(1)
-    
+
     # and imaginary values approximately 0 everywhere
     assert averager.data_im_gridded == pytest.approx(0)
 
@@ -91,9 +101,9 @@ def test_weight_gridding(mock_visibility_data):
     data_re = np.ones_like(uu)
     data_im = np.ones_like(uu)
 
-    averager = gridding.DataAverager.from_image_properties(
-        cell_size=0.005,
-        npix=800,
+    coords = coordinates.GridCoords(cell_size=0.005, npix=800)
+    averager = gridding.DataAverager(
+        coords=coords,
         uu=uu,
         vv=vv,
         weight=weight,
diff --git a/test/gridder_imager_test.py b/test/gridder_imager_test.py
index 1a08098c..96346ed9 100644
--- a/test/gridder_imager_test.py
+++ b/test/gridder_imager_test.py
@@ -14,9 +14,9 @@
 def imager(mock_visibility_data):
     uu, vv, weight, data_re, data_im = mock_visibility_data
 
-    return gridding.DirtyImager.from_image_properties(
-        cell_size=0.005,
-        npix=800,
+    coords = coordinates.GridCoords(cell_size=0.005, npix=800)
+    return gridding.DirtyImager(
+        coords=coords,
         uu=uu,
         vv=vv,
         weight=weight,
diff --git a/test/gridder_init_test.py b/test/gridder_init_test.py
index 7be0f2ac..464e81f7 100644
--- a/test/gridder_init_test.py
+++ b/test/gridder_init_test.py
@@ -31,9 +31,11 @@ def test_hermitian_pairs(mock_visibility_data):
 def test_averager_instantiate_cell_npix(mock_visibility_data):
     uu, vv, weight, data_re, data_im = mock_visibility_data
 
-    gridding.DataAverager.from_image_properties(
+    coords = coordinates.GridCoords(
         cell_size=0.005,
-        npix=800,
+        npix=800
+    )
+    gridding.DataAverager(coords=coords,
         uu=uu,
         vv=vv,
         weight=weight,
diff --git a/test/images_test.py b/test/images_test.py
index 882cb257..eab22fca 100644
--- a/test/images_test.py
+++ b/test/images_test.py
@@ -3,37 +3,18 @@
 import torch
 from astropy.io import fits
 
-from mpol import images, utils
+from mpol import coordinates, images, utils
 from mpol.constants import *
 
-
-def test_odd_npix():
-    expected_error_message = "Image must have an even number of pixels."
-
-    with pytest.raises(ValueError, match=expected_error_message):
-        images.BaseCube.from_image_properties(npix=853, nchan=30, cell_size=0.015)
-
-    with pytest.raises(ValueError, match=expected_error_message):
-        images.ImageCube.from_image_properties(npix=853, nchan=30, cell_size=0.015)
-
-
-def test_negative_cell_size():
-    expected_error_message = "cell_size must be a positive real number."
-
-    with pytest.raises(ValueError, match=expected_error_message):
-        images.BaseCube.from_image_properties(npix=800, nchan=30, cell_size=-0.015)
-
-    with pytest.raises(ValueError, match=expected_error_message):
-        images.ImageCube.from_image_properties(npix=800, nchan=30, cell_size=-0.015)
-
-
 def test_single_chan():
-    im = images.ImageCube.from_image_properties(cell_size=0.015, npix=800)
+    coords = coordinates.GridCoords(cell_size=0.015, npix=800)
+    im = images.ImageCube(coords=coords)
     assert im.nchan == 1
 
 
 def test_basecube_grad():
-    bcube = images.BaseCube.from_image_properties(npix=800, cell_size=0.015)
+    coords = coordinates.GridCoords(cell_size=0.015, npix=800)
+    bcube = images.BaseCube(coords=coords)
     loss = torch.sum(bcube())
     loss.backward()
 
@@ -189,7 +170,8 @@ def test_multi_chan_conv(coords, tmp_path):
 
     conv_layer(test_cube)
 
+
 def test_image_flux(coords):
     nchan = 20
-    im = images.ImageCube(coords=coords, nchan=nchan)    
+    im = images.ImageCube(coords=coords, nchan=nchan)
     assert im.flux.size()[0] == nchan

From f8e401857b559db1556888f6152ec9e5a83e89f9 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 14:53:01 +0000
Subject: [PATCH 20/26] removing redundant tensor calls from init in
 GriddedDataset

---
 src/mpol/datasets.py |  6 +++---
 src/mpol/gridding.py | 11 +++++++----
 2 files changed, 10 insertions(+), 7 deletions(-)
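
Reviewer note: a minimal sketch of the buffer pattern this patch relies on, assuming the caller already hands over `torch.Tensor` objects. `BufferHolder` is a hypothetical stand-in for `GriddedDataset`, not the real class. Wrapping an existing tensor in `torch.tensor(...)` makes a copy (and PyTorch warns about it), so registering the tensor directly avoids the extra allocation.

```python
import torch
from torch import nn


class BufferHolder(nn.Module):
    """Hypothetical stand-in for GriddedDataset; only the buffer logic is shown."""

    def __init__(self, vis_gridded: torch.Tensor, mask: torch.Tensor) -> None:
        super().__init__()
        # The inputs are already tensors, so they are registered as-is
        # rather than being re-wrapped with torch.tensor(...).
        self.register_buffer("vis_gridded", vis_gridded)
        self.register_buffer("mask", mask)


vis = torch.zeros((1, 4, 4), dtype=torch.complex128)
mask = torch.zeros((1, 4, 4), dtype=torch.bool)
holder = BufferHolder(vis, mask)

# Buffers travel with the module (state_dict, .to(device)) but are not trainable.
buffer_names = sorted(name for name, _ in holder.named_buffers())
n_parameters = len(list(holder.parameters()))
# No copy was made: the buffer points at the caller's storage.
shares_memory = holder.vis_gridded.data_ptr() == vis.data_ptr()
```

Because no copy is made, the caller's tensor and the module's buffer are the same object; callers that need isolation would have to clone before passing the tensor in.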

diff --git a/src/mpol/datasets.py b/src/mpol/datasets.py
index 29c90afb..1817dac2 100644
--- a/src/mpol/datasets.py
+++ b/src/mpol/datasets.py
@@ -58,9 +58,9 @@ def __init__(
         self.nchan = nchan
 
         # store variables as buffers of the module
-        self.register_buffer("vis_gridded", torch.tensor(vis_gridded))
-        self.register_buffer("weight_gridded", torch.tensor(weight_gridded))
-        self.register_buffer("mask", torch.tensor(mask))
+        self.register_buffer("vis_gridded", vis_gridded)
+        self.register_buffer("weight_gridded", weight_gridded)
+        self.register_buffer("mask", mask)
         self.vis_gridded: torch.Tensor
         self.weight_gridded: torch.Tensor
         self.mask: torch.Tensor
diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index ea067788..0c7498fd 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -532,7 +532,7 @@ def _fliplr_cube(self, cube):
         return cube[:, :, ::-1]
 
     @property
-    def ground_cube(self):
+    def ground_cube(self) -> npt.NDArray[np.complexfloating[Any, Any]]:
         r"""
         The visibility FFT cube fftshifted for plotting with ``imshow``.
 
@@ -602,7 +602,7 @@ class DataAverager(GridderBase):
 
     """
 
-    def _grid_weights(self):
+    def _grid_weights(self) -> None:
         r"""
         Average the visibility weights to the Fourier grid contained in ``self.coords``,
         such that the ``self.weight_gridded`` corresponds to the equivalent weight on
@@ -617,9 +617,12 @@ def _grid_weights(self):
         # instantiate uncertainties for each averaged visibility.
         self.weight_gridded = np.fft.fftshift(cell_weight, axes=(1, 2))
 
-    def to_pytorch_dataset(self, check_visibility_scatter=True, max_scatter=1.2):
+    def to_pytorch_dataset(
+        self, check_visibility_scatter: bool = True, max_scatter: float = 1.2
+    ) -> GriddedDataset:
         r"""
-        Export gridded visibilities to a PyTorch dataset object.
+        Export gridded visibilities to a PyTorch :class:`mpol.datasets.GriddedDataset`
+        object.
 
         Args:
             check_visibility_scatter (bool): whether the routine should check the

From 386108cde46e8577958e04b92dcba813e37c3ef8 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 15:01:19 +0000
Subject: [PATCH 21/26] moved GriddedDataset export to from_numpy and tests
 pass.

---
 src/mpol/gridding.py | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)
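
Reviewer note: a small sketch of what `torch.from_numpy` does with arrays shaped like the gridded products, under the assumption that they are complex128 and boolean NumPy arrays produced by `np.fft.fftshift`. The array contents here are made up for illustration. The conversion keeps the NumPy dtype and shares memory with the source array instead of copying it.

```python
import numpy as np
import torch

# Illustrative stand-ins for the gridded visibilities and mask.
vis_gridded = np.fft.fftshift(np.ones((1, 4, 4), dtype=np.complex128), axes=(1, 2))
mask = np.fft.fftshift(np.ones((1, 4, 4), dtype=bool), axes=(1, 2))

vis_t = torch.from_numpy(vis_gridded)
mask_t = torch.from_numpy(mask)

# from_numpy shares memory, so an in-place edit of the array is
# visible through the tensor.
vis_gridded[0, 0, 0] = 2.0 + 0.0j
first_value = vis_t[0, 0, 0].item()
```

The shared memory is the main design consequence: mutating the averager's arrays after export would also change the dataset buffers.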

diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index 0c7498fd..4b157ea7 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -12,6 +12,8 @@
 import numpy.typing as npt
 from fast_histogram import histogram as fast_hist
 
+import torch
+
 from mpol.coordinates import GridCoords
 from mpol.exceptions import DataError, ThresholdExceededError, WrongDimensionError
 from mpol.datasets import GriddedDataset
@@ -657,9 +659,9 @@ def to_pytorch_dataset(
         return GriddedDataset(
             coords=self.coords,
             nchan=self.nchan,
-            vis_gridded=self.vis_gridded,
-            weight_gridded=self.weight_gridded,
-            mask=self.mask,
+            vis_gridded=torch.from_numpy(self.vis_gridded),
+            weight_gridded=torch.from_numpy(self.weight_gridded),
+            mask=torch.from_numpy(self.mask),
         )
 
 

From 6697a35f822b1d9fb1a0ddea33c0f778a31c1509 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 15:28:18 +0000
Subject: [PATCH 22/26] typed gridding.

---
 src/mpol/gridding.py | 82 +++++++++++++++++++++++++++-----------------
 1 file changed, 51 insertions(+), 31 deletions(-)
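
Reviewer note: the `taper_function` annotation added here describes a callable that takes `u` and `v` arrays and returns a prefactor array. Below is a hedged sketch of one function that fits that signature, a circular Gaussian taper. The names `make_gaussian_taper`, `sigma_klambda`, `FloatArray`, and `TaperFunction` are hypothetical and not part of the library; the docstring's convention that `u` and `v` arrive in kilolambda is assumed.

```python
from typing import Any, Callable

import numpy as np
import numpy.typing as npt

FloatArray = npt.NDArray[np.floating[Any]]
TaperFunction = Callable[[FloatArray, FloatArray], FloatArray]


def make_gaussian_taper(sigma_klambda: float) -> TaperFunction:
    """Build a taper f(u, v) whose values fall in (0, 1]."""

    def taper(u: FloatArray, v: FloatArray) -> FloatArray:
        # Squared baseline length in (klambda)^2.
        q_squared = u**2 + v**2
        prefactor: FloatArray = np.exp(-0.5 * q_squared / sigma_klambda**2)
        return prefactor

    return taper


taper = make_gaussian_taper(sigma_klambda=500.0)
uu = np.array([[0.0, 300.0, -1200.0]])
vv = np.array([[0.0, -400.0, 900.0]])
prefactor = taper(uu, vv)
```

A function like `taper` could then be passed as the `taper_function` argument of `get_dirty_image`; the factory form just keeps the width out of the two-argument signature.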

diff --git a/src/mpol/gridding.py b/src/mpol/gridding.py
index 4b157ea7..8d674bd5 100644
--- a/src/mpol/gridding.py
+++ b/src/mpol/gridding.py
@@ -6,7 +6,7 @@
 
 import warnings
 
-from typing import Any, Sequence
+from typing import Any, Callable, Sequence
 
 import numpy as np
 import numpy.typing as npt
@@ -724,13 +724,13 @@ class DirtyImager(GridderBase):
 
     def __init__(
         self,
-        coords=None,
-        uu=None,
-        vv=None,
-        weight=None,
-        data_re=None,
-        data_im=None,
-    ):
+        coords: GridCoords,
+        uu: npt.NDArray[np.floating[Any]],
+        vv: npt.NDArray[np.floating[Any]],
+        weight: npt.NDArray[np.floating[Any]],
+        data_re: npt.NDArray[np.floating[Any]],
+        data_im: npt.NDArray[np.floating[Any]],
+    ) -> None:
         # check everything should be 2d, expand if not
         # also checks data does not contain Hermitian pairs
         uu, vv, weight, data_re, data_im = _check_data_inputs_2d(
@@ -759,31 +759,35 @@ def __init__(
         self._create_cell_indices()
 
     @property
-    def uu(self):
+    def uu(self) -> npt.NDArray[np.floating[Any]]:
         return np.concatenate([self.uu_base, -self.uu_base], axis=1)
 
     @property
-    def vv(self):
+    def vv(self) -> npt.NDArray[np.floating[Any]]:
         return np.concatenate([self.vv_base, -self.vv_base], axis=1)
 
     @property
-    def weight(self):
+    def weight(self) -> npt.NDArray[np.floating[Any]]:
         return np.concatenate([self.weight_base, self.weight_base], axis=1)
 
     @property
-    def data_re(self):
+    def data_re(self) -> npt.NDArray[np.floating[Any]]:
         return np.concatenate([self.data_re_base, self.data_re_base], axis=1)
 
     @property
-    def data_im(self):
+    def data_im(self) -> npt.NDArray[np.floating[Any]]:
         return np.concatenate([self.data_im_base, -self.data_im_base], axis=1)
 
     def _grid_visibilities(
         self,
-        weighting="uniform",
-        robust=None,
-        taper_function=None,
-    ):
+        weighting: str = "uniform",
+        robust: float | None = None,
+        taper_function: Callable[
+            [npt.NDArray[np.floating[Any]], npt.NDArray[np.floating[Any]]],
+            npt.NDArray[np.floating[Any]],
+        ]
+        | None = None,
+    ) -> None:
         r"""
         Grid the loose data visibilities to the Fourier grid in preparation for imaging.
 
@@ -796,8 +800,9 @@ def _grid_visibilities(
                 weighting and ``robust=2`` approximately corresponds to natural
                 weighting.
             taper_function (function reference): a function assumed to be of the form
-                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]`
-                and premultiplies the visibility data. The function must assume that
+                :math:`f(u,v)` which calculates a prefactor in the range :math:`[0,1]`.
+                This prefactor is used to premultiply the visibility data as a taper.
+                The tapering function must assume that
                 :math:`u` and :math:`v` will be supplied in units of
                 :math:`\mathrm{k}\lambda`. By default no taper is applied.
         """
@@ -885,7 +890,11 @@ def _grid_visibilities(
         self.vis_gridded = self.data_re_gridded + self.data_im_gridded * 1.0j
         self.re_gridded_beam = np.fft.fftshift(re_gridded_beam, axes=(1, 2))
 
-    def _get_dirty_beam(self, C, re_gridded_beam):
+    def _get_dirty_beam(
+        self,
+        C: npt.NDArray[np.floating[Any]],
+        re_gridded_beam: npt.NDArray[np.floating[Any]],
+    ) -> npt.NDArray[np.floating[Any]]:
         """
         Compute the dirty beam corresponding to the gridded visibilities.
 
@@ -919,11 +928,13 @@ def _get_dirty_beam(self, C, re_gridded_beam):
                 " visibilities, otherwise raise a github issue."
             )
 
-        self.beam = beam.real
+        self.beam: npt.NDArray[np.floating[Any]] = beam.real
 
         return self.beam
 
-    def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
+    def _null_dirty_beam(
+        self, ntheta: int = 24, single_channel_estimate: bool = True
+    ) -> npt.NDArray[np.floating[Any]]:
         r"""Zero out (null) all pixels in the dirty beam exterior to the first null,
         for each channel.
 
@@ -996,7 +1007,9 @@ def _null_dirty_beam(self, ntheta=24, single_channel_estimate=True):
 
         return nulled_beam
 
-    def get_dirty_beam_area(self, ntheta=24, single_channel_estimate=True):
+    def get_dirty_beam_area(
+        self, ntheta: int = 24, single_channel_estimate: bool = True
+    ) -> npt.NDArray[np.floating[Any]]:
         r"""
         Compute the effective area of the dirty beam for each channel. Assumes that the
         beam has already been generated by running
@@ -1022,18 +1035,25 @@ def get_dirty_beam_area(self, ntheta=24, single_channel_estimate=True):
         nulled = self._null_dirty_beam(
             ntheta=ntheta, single_channel_estimate=single_channel_estimate
         )
-        return self.coords.cell_size**2 * np.sum(nulled, axis=(1, 2))  # arcsec^2
+        area: npt.NDArray[np.floating[Any]] = self.coords.cell_size**2 * np.sum(
+            nulled, axis=(1, 2)
+        )  # arcsec^2
+        return area
 
     def get_dirty_image(
         self,
-        weighting="uniform",
-        robust=None,
-        taper_function=None,
-        unit="Jy/beam",
-        check_visibility_scatter=True,
-        max_scatter=1.2,
+        weighting: str = "uniform",
+        robust: float | None = None,
+        taper_function: Callable[
+            [npt.NDArray[np.floating[Any]], npt.NDArray[np.floating[Any]]],
+            npt.NDArray[np.floating[Any]],
+        ]
+        | None = None,
+        unit: str = "Jy/beam",
+        check_visibility_scatter: bool = True,
+        max_scatter: float = 1.2,
         **beam_kwargs,
-    ):
+    ) -> tuple[npt.NDArray[np.floating[Any]], npt.NDArray[np.floating[Any]]]:
         r"""
         Calculate the dirty image.
 

From f847e0d74de8c32c5849951ab6a2255da0fa4769 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 21:37:16 +0000
Subject: [PATCH 23/26] Tests pass after removing passthrough.

---
 pyproject.toml          | 10 +++---
 src/mpol/fourier.py     |  2 +-
 src/mpol/images.py      | 79 +++++++++--------------------------------
 src/mpol/precomposed.py | 18 +++++-----
 test/conftest.py        | 14 ++++++--
 test/connectors_test.py |  4 +--
 test/fourier_test.py    |  9 +++--
 test/images_test.py     |  8 +++--
 test/losses_test.py     | 48 +++++++++----------------
 test/train_test_test.py |  2 +-
 10 files changed, 74 insertions(+), 120 deletions(-)
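
Reviewer note: a minimal sketch of the identity-layer pattern that `ImageCube` moves to in this patch, assuming a plain `cell_size` float in place of a `GridCoords` object. `IdentityCubeLayer` is a hypothetical name used only for illustration. The layer owns no trainable parameters; it stores the last cube it saw in a buffer so that convenience properties can read it.

```python
import torch
from torch import nn


class IdentityCubeLayer(nn.Module):
    """Hypothetical, simplified analogue of ImageCube after this patch."""

    def __init__(self, cell_size: float, nchan: int = 1) -> None:
        super().__init__()
        self.cell_size = cell_size  # arcsec per pixel (assumed unit)
        self.nchan = nchan
        # Empty buffer; it is filled on the first forward pass.
        self.register_buffer("cube", None)

    def forward(self, cube: torch.Tensor) -> torch.Tensor:
        # Identity operation that also remembers the cube.
        self.cube = cube
        return self.cube

    @property
    def flux(self) -> torch.Tensor:
        # Sum over the two spatial axes, scaled by the pixel area.
        return self.cell_size**2 * torch.sum(self.cube, dim=(1, 2))


layer = IdentityCubeLayer(cell_size=0.5, nchan=2)
cube = torch.ones((2, 4, 4), dtype=torch.double)
out = layer(cube)
flux = layer.flux
```

The trade-off is that properties such as `flux` are only meaningful after at least one forward pass, which is why the updated test fixtures call the layer once before using it.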

diff --git a/pyproject.toml b/pyproject.toml
index af5286e4..365a3bb1 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -91,10 +91,12 @@ ignore_missing_imports = true
 
 [[tool.mypy.overrides]]
 module = [
-    "MPol.constants",
-    "MPoL.losses",
+    "MPoL.constants",
+    "MPoL.coordinates",
+    "MPoL.datasets",
     # "MPoL.fourier", # once we remove get_vis_residuals
+    "MPoL.images",
+    "MPoL.losses"
     # "MPoL.utils", # once we fix check_baselines
-    "MPoL.datasets"
-]
+    ]
 disallow_untyped_defs = true
\ No newline at end of file
diff --git a/src/mpol/fourier.py b/src/mpol/fourier.py
index 706d3c67..ca1dbafd 100644
--- a/src/mpol/fourier.py
+++ b/src/mpol/fourier.py
@@ -852,7 +852,7 @@ def get_vis_residuals(model, u_true, v_true, V_true, return_Vmod=False, channel=
     """
     nufft = NuFFT(coords=model.coords, nchan=model.nchan)
 
-    vis_model = nufft(model.icube().to("cpu"), u_true, v_true)  # TODO: remove 'to' call
+    vis_model = nufft(model.icube.cube.to("cpu"), u_true, v_true)  # TODO: remove 'to' call
     # convert to numpy, select channel
     vis_model = vis_model.detach().numpy()[channel]
 
diff --git a/src/mpol/images.py b/src/mpol/images.py
index 18133aa6..628fd55f 100644
--- a/src/mpol/images.py
+++ b/src/mpol/images.py
@@ -61,9 +61,8 @@ def __init__(
         # The ``base_cube`` is already packed to make the Fourier transformation easier
         if base_cube is None:
             self.base_cube = nn.Parameter(
-                torch.full(
+                torch.zeros(
                     (self.nchan, self.coords.npix, self.coords.npix),
-                    fill_value=0.05,
                     requires_grad=True,
                     dtype=torch.double,
                 )
@@ -211,80 +210,36 @@ class ImageCube(nn.Module):
         the image `cell_size` and `npix`.
     nchan : int
         the number of channels in the base cube. Default = 1.
-    passthrough : bool
-        if `True`, assume ImageCube is just a layer as opposed
-        to parameter base.
-    cube : :class:torch.Tensor of :class:torch.double, of shape ``(nchan, npix, npix)``
-        a prepacked image cube to initialize the model with in units of
-        [:math:`\mathrm{Jy}\,\mathrm{arcsec}^{-2}`]. If None, assumes starting
-        ``cube`` is ``torch.zeros``. See :ref:`cube-orientation-label` for more
-        information on the expectations of the orientation of the input image.
     """
 
     def __init__(
         self,
         coords: GridCoords,
         nchan: int = 1,
-        passthrough: bool = False,
-        cube: torch.Tensor | None = None,
     ) -> None:
         super().__init__()
 
         self.coords = coords
         self.nchan = nchan
+        self.register_buffer("cube", None)
 
-        self.passthrough = passthrough
-
-        if not self.passthrough:
-            if cube is None:
-                self.cube : torch.nn.Parameter = nn.Parameter(
-                    torch.full(
-                        (self.nchan, self.coords.npix, self.coords.npix),
-                        fill_value=0.0,
-                        requires_grad=True,
-                        dtype=torch.double,
-                    )
-                )
-
-            else:
-                # We expect the user to supply a pre-packed base cube
-                # so that it's ready to go for the FFT
-                # We could apply this transformation for the user, but I think it will
-                # lead to less confusion if we make this transformation explicit
-                # for the user during the setup phase.
-                self.cube = nn.Parameter(cube)
-        else:
-            # ImageCube is working as a passthrough layer, so cube should
-            # only be provided as an arg to the forward method, not as
-            # an initialization argument
-            self.cube = None
-
-    def forward(self, cube: torch.Tensor | None = None) -> torch.Tensor:
+    def forward(self, cube: torch.Tensor) -> torch.Tensor:
         r"""
-        If the ImageCube object was initialized with ``passthrough=True``, the ``cube``
-        argument is required. ``forward`` essentially just passes this on as an identity
-        operation.
-
-        If the ImageCube object was initialized with ``passthrough=False``, the ``cube``
-        argument is not permitted, and ``forward`` passes on the stored
-        ``nn.Parameter`` cube as an identity operation.
-
-        Args:
-            cube (3D torch tensor of shape ``(nchan, npix, npix)``): only permitted if
-            the ImageCube object was initialized with ``passthrough=True``.
-
-        Returns: (3D torch.double tensor of shape ``(nchan, npix, npix)``) as identity
-        operation
+        Pass the cube through as an identity operation, storing the value in the
+        internal buffer. After the cube has been passed through, convenience instance
+        attributes like ``sky_cube`` and ``flux`` will reflect the updated cube.
+
+        Parameters
+        ----------
+        cube : :class:`torch.Tensor` of type :class:`torch.double`
+            3D torch tensor of shape ``(nchan, npix, npix)`` in 'packed' format
+
+        Returns
+        -------
+        :class:`torch.Tensor` of :class:`torch.double` type
+            tensor of shape ``(nchan, npix, npix)``, same as ``cube``
         """
-
-        if cube is not None:
-            assert (
-                self.passthrough
-            ), "ImageCube.passthrough must be True if supplying cube."
-            self.cube = cube
-
-        if not self.passthrough:
-            assert cube is None, "Do not supply cube if ImageCube.passthrough == False."
+        self.cube = cube
 
         return self.cube
 
diff --git a/src/mpol/precomposed.py b/src/mpol/precomposed.py
index 35c2f6f6..227b3e3d 100644
--- a/src/mpol/precomposed.py
+++ b/src/mpol/precomposed.py
@@ -13,10 +13,10 @@ class SimpleNet(torch.nn.Module):
     Args:
         cell_size (float): the width of a pixel [arcseconds]
         npix (int): the number of pixels per image side
-        coords (GridCoords): an object already instantiated from the GridCoords class. 
+        coords (GridCoords): an object already instantiated from the GridCoords class.
             If providing this, cannot provide ``cell_size`` or ``npix``.
         nchan (int): the number of channels in the base cube. Default = 1.
-        base_cube : a pre-packed base cube to initialize the model with. If 
+        base_cube : a pre-packed base cube to initialize the model with. If
             None, assumes ``torch.zeros``.
 
     After the object is initialized, instance variables can be accessed, for example
@@ -25,10 +25,10 @@ class SimpleNet(torch.nn.Module):
     :ivar icube: the :class:`~mpol.images.ImageCube` instance
     :ivar fcube: the :class:`~mpol.fourier.FourierCube` instance
 
-    For example, you'll likely want to access the ``self.icube.sky_model`` 
+    For example, you'll likely want to access the ``self.icube.sky_cube``
     at some point.
 
-    The idea is that :class:`~mpol.precomposed.SimpleNet` can save you some keystrokes 
+    The idea is that :class:`~mpol.precomposed.SimpleNet` can save you some keystrokes
     composing models by connecting the most commonly used layers together.
 
     .. mermaid:: _static/mmd/src/SimpleNet.mmd
@@ -52,16 +52,14 @@ def __init__(
 
         self.conv_layer = images.HannConvCube(nchan=self.nchan)
 
-        self.icube = images.ImageCube(
-            coords=self.coords, nchan=self.nchan, passthrough=True
-        )
+        self.icube = images.ImageCube(coords=self.coords, nchan=self.nchan)
         self.fcube = fourier.FourierCube(coords=self.coords)
 
     def forward(self):
         r"""
-        Feed forward to calculate the model visibilities. In this step, a 
-        :class:`~mpol.images.BaseCube` is fed to a :class:`~mpol.images.HannConvCube` 
-        is fed to a :class:`~mpol.images.ImageCube` is fed to a 
+        Feed forward to calculate the model visibilities. In this step, a
+        :class:`~mpol.images.BaseCube` is fed to a :class:`~mpol.images.HannConvCube`,
+        then to an :class:`~mpol.images.ImageCube`, and finally to a
         :class:`~mpol.fourier.FourierCube` to produce the visibility cube.
 
         Returns: 1D complex torch tensor of model visibilities.
diff --git a/test/conftest.py b/test/conftest.py
index d63d3377..b52b09cc 100644
--- a/test/conftest.py
+++ b/test/conftest.py
@@ -149,7 +149,10 @@ def mock_1d_image_model(mock_1d_archive):
     # pack the numpy image array into an ImageCube
     packed_cube = np.broadcast_to(i2dtrue, (1, coords.npix, coords.npix)).copy()
     packed_tensor = torch.from_numpy(packed_cube)
-    cube_true = images.ImageCube(coords=coords, nchan=1, cube=packed_tensor)
+    bcube = images.BaseCube(coords=coords, nchan=1, base_cube=packed_tensor, pixel_mapping=lambda x: x)
+    cube_true = images.ImageCube(coords=coords, nchan=1)
+    # register cube to buffer inside cube_true.cube
+    cube_true(bcube())
 
     return rtrue, itrue, cube_true, xmax, ymax, geom
 
@@ -176,8 +179,13 @@ def mock_1d_vis_model(mock_1d_archive):
     # pack the numpy image array into an ImageCube
     packed_cube = np.broadcast_to(i2dtrue, (1, coords.npix, coords.npix)).copy()
     packed_tensor = torch.from_numpy(packed_cube)
-    cube_true = images.ImageCube(coords=coords, nchan=1, cube=packed_tensor)
-    
+    bcube = images.BaseCube(coords=coords, nchan=1, base_cube=packed_tensor, pixel_mapping=lambda x: x)
+    cube_true = images.ImageCube(coords=coords, nchan=1)
+
+    # register image to buffer inside cube_true.cube
+    cube_true(bcube())
+
+
     # create a FourierCube
     fcube_true = fourier.FourierCube(coords=coords)    
 
diff --git a/test/connectors_test.py b/test/connectors_test.py
index 5dcb801d..e0303983 100644
--- a/test/connectors_test.py
+++ b/test/connectors_test.py
@@ -27,7 +27,7 @@ def test_index(coords, dataset):
     basecube = images.BaseCube(coords=coords, nchan=nchan, base_cube=base_cube)
 
     # try passing through ImageLayer
-    imagecube = images.ImageCube(coords=coords, nchan=nchan, passthrough=True)
+    imagecube = images.ImageCube(coords=coords, nchan=nchan)
 
     # produce dense model visibility cube
     modelVisibilityCube = flayer(imagecube(basecube()))
@@ -42,7 +42,7 @@ def test_connector_grad(coords, dataset):
     flayer = fourier.FourierCube(coords=coords)
     nchan = dataset.nchan
     basecube = images.BaseCube(coords=coords, nchan=nchan)
-    imagecube = images.ImageCube(coords=coords, nchan=nchan, passthrough=True)
+    imagecube = images.ImageCube(coords=coords, nchan=nchan)
 
     # produce model visibilities
     modelVisibilityCube = flayer(imagecube(basecube()))
diff --git a/test/fourier_test.py b/test/fourier_test.py
index 39cb1fbb..40e91969 100644
--- a/test/fourier_test.py
+++ b/test/fourier_test.py
@@ -142,7 +142,8 @@ def test_predict_vis_nufft(coords, mock_visibility_data_cont):
 
     nchan = 10
 
-    # instantiate an ImageCube layer filled with zeros
+    # instantiate a BaseCube layer filled with zeros
+    basecube = images.BaseCube(coords=coords, nchan=nchan, pixel_mapping=lambda x: x)
     imagecube = images.ImageCube(coords=coords, nchan=nchan)
 
     # we have a multi-channel cube, but only sent single-channel uu and vv
@@ -151,7 +152,7 @@ def test_predict_vis_nufft(coords, mock_visibility_data_cont):
     layer = fourier.NuFFT(coords=coords, nchan=nchan)
 
     # predict the values of the cube at the u,v locations
-    output = layer(imagecube(), uu, vv)
+    output = layer(imagecube(basecube()), uu, vv)
 
     # make sure we got back the number of visibilities we expected
     assert output.shape == (nchan, len(uu))
@@ -172,6 +173,8 @@ def test_predict_vis_nufft_cached(coords, mock_visibility_data_cont):
     nchan = 10
 
     # instantiate an ImageCube layer filled with zeros
+    # instantiate a BaseCube layer filled with zeros
+    basecube = images.BaseCube(coords=coords, nchan=nchan, pixel_mapping=lambda x: x)
     imagecube = images.ImageCube(coords=coords, nchan=nchan)
 
     # we have a multi-channel cube, but sent only single-channel uu and vv
@@ -180,7 +183,7 @@ def test_predict_vis_nufft_cached(coords, mock_visibility_data_cont):
     layer = fourier.NuFFTCached(coords=coords, nchan=nchan, uu=uu, vv=vv)
 
     # predict the values of the cube at the u,v locations
-    output = layer(imagecube())
+    output = layer(imagecube(basecube()))
 
     # make sure we got back the number of visibilities we expected
     assert output.shape == (nchan, len(uu))
diff --git a/test/images_test.py b/test/images_test.py
index eab22fca..b365c763 100644
--- a/test/images_test.py
+++ b/test/images_test.py
@@ -22,7 +22,7 @@ def test_basecube_grad():
 def test_imagecube_grad(coords):
     bcube = images.BaseCube(coords=coords)
     # try passing through ImageLayer
-    imagecube = images.ImageCube(coords=coords, passthrough=True)
+    imagecube = images.ImageCube(coords=coords)
 
     # send things through this layer
     loss = torch.sum(imagecube(bcube()))
@@ -36,7 +36,7 @@ def test_imagecube_tofits(coords, tmp_path):
     bcube = images.BaseCube(coords=coords)
 
     # try passing through ImageLayer
-    imagecube = images.ImageCube(coords=coords, passthrough=True)
+    imagecube = images.ImageCube(coords=coords)
 
     # sending the basecube through the imagecube
     imagecube(bcube())
@@ -88,7 +88,7 @@ def test_basecube_imagecube(coords, tmp_path):
     fig.savefig(tmp_path / "basecube_mapped.png", dpi=300)
 
     # try passing through ImageLayer
-    imagecube = images.ImageCube(coords=coords, nchan=nchan, passthrough=True)
+    imagecube = images.ImageCube(coords=coords, nchan=nchan)
 
     # send things through this layer
     imagecube(basecube())
@@ -173,5 +173,7 @@ def test_multi_chan_conv(coords, tmp_path):
 
 def test_image_flux(coords):
     nchan = 20
+    bcube = images.BaseCube(coords=coords, nchan=nchan)
     im = images.ImageCube(coords=coords, nchan=nchan)
+    im(bcube())
     assert im.flux.size()[0] == nchan
diff --git a/test/losses_test.py b/test/losses_test.py
index 908eb9e5..ca88a5c7 100644
--- a/test/losses_test.py
+++ b/test/losses_test.py
@@ -5,43 +5,29 @@
 from mpol import fourier, images, losses, utils
 
 
-# create a fixture that has an image and produces loose and gridded model visibilities
+# create a fixture that returns nchan and an image
 @pytest.fixture
-def image_cube(mock_visibility_data, coords):
-    # Gaussian parameters
-    kw = {
-        "a": 1,
-        "delta_x": 0.02,  # arcsec
-        "delta_y": -0.01,
-        "sigma_x": 0.02,
-        "sigma_y": 0.01,
-        "Omega": 20,  # degrees
-    }
-
+def nchan_cube(mock_visibility_data, coords):
     uu, vv, weight, data_re, data_im = mock_visibility_data
     nchan = len(uu)
 
-    # evaluate the Gaussian over the sky-plane, as np array
-    img_packed = utils.sky_gaussian_arcsec(
-        coords.packed_x_centers_2D, coords.packed_y_centers_2D, **kw
+    # create a mock base image
+    basecube = images.BaseCube(
+        coords=coords,
+        nchan=nchan,
     )
-
-    # broadcast to (nchan, npix, npix)
-    img_packed_cube = np.broadcast_to(
-        img_packed, (nchan, coords.npix, coords.npix)
-    ).copy()
-    # convert img_packed to pytorch tensor
-    img_packed_tensor = torch.from_numpy(img_packed_cube)
     # insert into ImageCube layer
+    imagecube = images.ImageCube(coords=coords, nchan=nchan)
+    packed_cube = imagecube(basecube())
 
-    return images.ImageCube(coords=coords, nchan=nchan, cube=img_packed_tensor)
+    return nchan, packed_cube
 
 
 @pytest.fixture
-def loose_visibilities(mock_visibility_data, image_cube):
+def loose_visibilities(mock_visibility_data, coords, nchan_cube):
     # use the NuFFT to produce model visibilities
 
-    nchan = image_cube.nchan
+    nchan, packed_cube = nchan_cube
 
     # use the coil broadcasting ability
     chan = 4
@@ -50,16 +36,16 @@ def loose_visibilities(mock_visibility_data, image_cube):
     uu_chan = uu[chan]
     vv_chan = vv[chan]
 
-    nufft = fourier.NuFFT(coords=image_cube.coords, nchan=nchan)
-    return nufft(image_cube(), uu_chan, vv_chan)
+    nufft = fourier.NuFFT(coords=coords, nchan=nchan)
+    return nufft(packed_cube, uu_chan, vv_chan)
 
 
 @pytest.fixture
-def gridded_visibilities(image_cube):
-
+def gridded_visibilities(coords, nchan_cube):
+    nchan, packed_cube = nchan_cube
     # use the FourierCube to produce model visibilities
-    flayer = fourier.FourierCube(coords=image_cube.coords)
-    return flayer(image_cube())
+    flayer = fourier.FourierCube(coords=coords)
+    return flayer(packed_cube)
 
 
 def test_chi_squared_evaluation(
diff --git a/test/train_test_test.py b/test/train_test_test.py
index 9826d9fc..04e6b66a 100644
--- a/test/train_test_test.py
+++ b/test/train_test_test.py
@@ -175,7 +175,7 @@ def test_train_to_dirty_image(coords, dataset, imager):
     train_to_dirty_image(model, imager, niter=10)
 
 
-def test_tensorboard(coords, dataset_cont, tmp_path):
+def test_tensorboard(coords, dataset_cont):
     # not using TrainTest class, 
     # set everything up to run on a single channel and then
     # test the writer function

From 4deab01c00cbec177fb92d5f0decab1834a98c77 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 22:09:14 +0000
Subject: [PATCH 24/26] nearly complete typing.

---
 pyproject.toml       |   2 +
 src/mpol/geometry.py | 167 +++++++++++++++++++++----------------------
 2 files changed, 83 insertions(+), 86 deletions(-)
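
Reviewer note: the two typed functions are meant to be inverses of each other. The sketch below restates the three rotations with plain Python floats and the standard `math` module instead of `torch.Tensor` inputs, so it is an illustration of the arithmetic under that assumption, not the library code itself. The sample angles are arbitrary.

```python
import math


def flat_to_observer(x, y, omega=0.0, incl=0.0, Omega=0.0):
    """Scalar stand-in for the forward transform (flat frame -> observer frame)."""
    # 1) rotate about the z0 axis by omega
    x1 = math.cos(omega) * x - math.sin(omega) * y
    y1 = math.sin(omega) * x + math.cos(omega) * y
    # 2) rotate about the x1 axis by -incl; z is neglected, so y is foreshortened
    x2 = x1
    y2 = math.cos(incl) * y1
    # 3) rotate about the z2 axis by Omega
    X = math.cos(Omega) * x2 - math.sin(Omega) * y2
    Y = math.sin(Omega) * x2 + math.cos(Omega) * y2
    return X, Y


def observer_to_flat(X, Y, omega=0.0, incl=0.0, Omega=0.0):
    """Scalar stand-in for the inverse transform (observer frame -> flat frame)."""
    # 1) inverse rotation about the Z axis by Omega
    x2 = math.cos(Omega) * X + math.sin(Omega) * Y
    y2 = -math.sin(Omega) * X + math.cos(Omega) * Y
    # 2) inverse rotation about the x2 axis: undo the foreshortening
    x1 = x2
    y1 = y2 / math.cos(incl)
    # 3) inverse rotation about the z1 axis by omega
    x = x1 * math.cos(omega) + y1 * math.sin(omega)
    y = -x1 * math.sin(omega) + y1 * math.cos(omega)
    return x, y


# round trip with arbitrary sample angles (radians)
angles = dict(omega=0.3, incl=0.6, Omega=1.1)
X, Y = flat_to_observer(1.0, 0.5, **angles)
x_back, y_back = observer_to_flat(X, Y, **angles)
```

With the new `0.0` defaults each rotation reduces to the identity, which is why the `None` branches could be dropped. The inverse divides by `cos(incl)`, so it becomes ill-conditioned as `incl` approaches pi/2 (an edge-on disk).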

diff --git a/pyproject.toml b/pyproject.toml
index 365a3bb1..9b2ba7c8 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -95,6 +95,8 @@ module = [
     "MPoL.coordinates",
     "MPoL.datasets",
     # "MPoL.fourier", # once we remove get_vis_residuals
+    "MPoL.geometry",
+    # "MPoL.gridding", # once we sort check_data_inputs_2d
     "MPoL.images",
     "MPoL.losses"
     # "MPoL.utils", # once we fix check_baselines
diff --git a/src/mpol/geometry.py b/src/mpol/geometry.py
index fadfd5fa..27241c63 100644
--- a/src/mpol/geometry.py
+++ b/src/mpol/geometry.py
@@ -5,12 +5,18 @@
 import torch
 
 
-def flat_to_observer(x, y, omega=None, incl=None, Omega=None):
-    """Rotate the frame to convert a point in the flat (x,y,z) frame to observer frame 
+def flat_to_observer(
+    x: torch.Tensor,
+    y: torch.Tensor,
+    omega: float = 0.0,
+    incl: float = 0.0,
+    Omega: float = 0.0,
+) -> tuple[torch.Tensor, torch.Tensor]:
+    """Rotate the frame to convert a point in the flat (x,y,z) frame to observer frame
     (X,Y,Z).
 
     It is assumed that the +Z axis points *towards* the observer. It is assumed that the
-      model is flat in the (x,y) frame (like a flat disk), and so the operations 
+      model is flat in the (x,y) frame (like a flat disk), and so the operations
       involving ``z`` are neglected. But the model lives in 3D Cartesian space.
 
     In order,
@@ -19,72 +25,68 @@ def flat_to_observer(x, y, omega=None, incl=None, Omega=None):
     2. rotate about the x1 axis by an amount -incl -> x2, y2, z2
     3. rotate about the z2 axis by an amount Omega -> x3, y3, z3 = X, Y, Z
 
-    Inspired by `exoplanet/keplerian.py 
+    Inspired by `exoplanet/keplerian.py
     <https://github.com/exoplanet-dev/exoplanet/blob/main/src/exoplanet/orbits/keplerian.py>`_
 
-    Args:
-        x (torch tensor): A tensor representing the x coordinate in the plane of 
-            the orbit.
-        y (torch tensor): A tensor representing the y coordinate in the plane of 
-            the orbit.
-        omega (torch float tensor): A tensor representing an argument of periastron 
-            [radians] Default 0.0.
-        incl (torch float tensor): A tensor representing an inclination value [radians].
-            Default 0.0.
-        Omega (torch float tensor): A tensor representing the position angle of the 
-            ascending node in [radians]. Default 0.0
-
-    Returns:
+    Parameters
+    ----------
+    x : :class:`torch.Tensor` of :class:`torch.double`
+        A tensor representing the x coordinate in the plane of the orbit.
+    y : :class:`torch.Tensor` of :class:`torch.double`
+        A tensor representing the y coordinate in the plane of the orbit.
+    omega : float
+        Argument of periastron [radians]. Default 0.0.
+    incl : float
+        Inclination value [radians]. Default 0.0.
+    Omega : float
+        Position angle of the ascending node [radians]. Default 0.0.
+
+    Returns
+    -------
+    2-tuple of :class:`torch.Tensor` of :class:`torch.double`
         Two tensors representing ``(X, Y)`` in the observer frame.
     """
-    # Rotation matrices result in a *clockwise* rotation of the axes, 
+    # Rotation matrices result in a *clockwise* rotation of the axes,
     # as defined using the righthand rule.
-    # For example, looking down the z-axis, 
+    # For example, looking down the z-axis,
     # a positive angle will rotate the x,y axes clockwise.
-    # A vector in the coordinate system will appear as though it has been 
+    # A vector in the coordinate system will appear as though it has been
     # rotated counter-clockwise.
 
     # 1) rotate about the z0 axis by omega
-    if omega is not None:
-        cos_omega = torch.cos(torch.as_tensor(omega))
-        sin_omega = torch.sin(torch.as_tensor(omega))
+    cos_omega = np.cos(omega)
+    sin_omega = np.sin(omega)
 
-        x1 = cos_omega * x - sin_omega * y
-        y1 = sin_omega * x + cos_omega * y
-    else:
-        x1 = x
-        y1 = y
+    x1 = cos_omega * x - sin_omega * y
+    y1 = sin_omega * x + cos_omega * y
 
     # 2) rotate about x1 axis by -incl
     x2 = x1
-
-    if incl is not None:
-        y2 = torch.cos(torch.as_tensor(incl)) * y1
-        # z3 = z2, subsequent rotation by Omega doesn't affect it
-        # Z = -torch.sin(incl) * y1
-    else:
-        y2 = y1
-        # Z = 0.0
+    y2 = np.cos(incl) * y1
+    # z3 = z2, subsequent rotation by Omega doesn't affect it
+    # Z = -np.sin(incl) * y1
 
     # 3) rotate about z2 axis by Omega
-    if Omega is not None:
-        cos_Omega = torch.cos(torch.as_tensor(Omega))
-        sin_Omega = torch.sin(torch.as_tensor(Omega))
+    cos_Omega = np.cos(Omega)
+    sin_Omega = np.sin(Omega)
 
-        X = cos_Omega * x2 - sin_Omega * y2
-        Y = sin_Omega * x2 + cos_Omega * y2
-    else:
-        X = x2
-        Y = y2
+    X = cos_Omega * x2 - sin_Omega * y2
+    Y = sin_Omega * x2 + cos_Omega * y2
 
     return X, Y
 
 
-def observer_to_flat(X, Y, omega=None, incl=None, Omega=None):
-    """Rotate the frame to convert a point in the observer frame (X,Y,Z) to the 
+def observer_to_flat(
+    X: torch.Tensor,
+    Y: torch.Tensor,
+    omega: float = 0.0,
+    incl: float = 0.0,
+    Omega: float = 0.0,
+) -> tuple[torch.Tensor, torch.Tensor]:
+    """Rotate the frame to convert a point in the observer frame (X,Y,Z) to the
     flat (x,y,z) frame.
 
-    It is assumed that the +Z axis points *towards* the observer. The rotation 
+    It is assumed that the +Z axis points *towards* the observer. The rotation
     operations are the inverse of the :func:`~mpol.geometry.flat_to_observer` operations.
 
     In order,
@@ -93,60 +95,53 @@ def observer_to_flat(X, Y, omega=None, incl=None, Omega=None):
     2. inverse rotation about the x2 axis by an amount -incl -> x1, y1, z1
     3. inverse rotation about the z1 axis by an amount omega -> x, y, z
 
-    Inspired by `exoplanet/keplerian.py 
+    Inspired by `exoplanet/keplerian.py
     <https://github.com/exoplanet-dev/exoplanet/blob/main/src/exoplanet/orbits/keplerian.py>`_
 
-    Args:
-        X (torch tensor): A tensor representing the x coodinate in the plane of 
-            the orbit.
-        Y (torch.tensor): A tensor representing the y coodinate in the plane of 
-            the orbit.
-        omega (torch float tensor): A tensor representing an argument of periastron 
-            [radians] Default 0.0.
-        incl (torch float tensor): A tensor representing an inclination value [radians].
-            Default 0.0.
-        Omega (torch float tensor): A tensor representing the position angle of the 
-            ascending node in [radians]. Default 0.0
-
-    Returns:
+    Parameters
+    ----------
+    X : :class:`torch.Tensor` of :class:`torch.double`
+        A tensor representing the X coordinate in the observer frame.
+    Y : :class:`torch.Tensor` of :class:`torch.double`
+        A tensor representing the Y coordinate in the observer frame.
+    omega : float
+        Argument of periastron [radians]. Default 0.0.
+    incl : float
+        Inclination value [radians]. Default 0.0.
+    Omega : float
+        Position angle of the ascending node [radians].
+        Default 0.0.
+
+    Returns
+    -------
+    2-tuple of :class:`torch.Tensor` of :class:`torch.double`
         Two tensors representing ``(x, y)`` in the flat frame.
     """
-    # Rotation matrices result in a *clockwise* rotation of the axes, 
+    # Rotation matrices result in a *clockwise* rotation of the axes,
     # as defined using the righthand rule.
-    # For example, looking down the z-axis, a positive angle will rotate the 
+    # For example, looking down the z-axis, a positive angle will rotate the
     # x,y axes clockwise.
-    # A vector in the coordinate system will appear as though it has been 
+    # A vector in the coordinate system will appear as though it has been
     # rotated counter-clockwise.
 
     # 1) inverse rotation about Z axis by Omega -> x2, y2, z2
-    if Omega is not None:
-        cos_Omega = torch.cos(torch.as_tensor(Omega))
-        sin_Omega = torch.sin(torch.as_tensor(Omega))
+    cos_Omega = np.cos(Omega)
+    sin_Omega = np.sin(Omega)
 
-        x2 = cos_Omega * X + sin_Omega * Y
-        y2 = -sin_Omega * X + cos_Omega * Y
-    else:
-        x2 = X
-        y2 = Y
+    x2 = cos_Omega * X + sin_Omega * Y
+    y2 = -sin_Omega * X + cos_Omega * Y
 
     # 2) inverse rotation about x2 axis by incl
     x1 = x2
     # we don't know Z, but we can solve some equations to find that
     # y = Y / cos(i), as expected by intuition
-    if incl is not None:
-        y1 = y2 / torch.cos(torch.as_tensor(incl))
-    else:
-        y1 = y2
-
+    y1 = y2 / np.cos(incl)
+
     # 3) inverse rotation about the z1 axis by an amount of omega
-    if omega is not None:
-        cos_omega = torch.cos(torch.as_tensor(omega))
-        sin_omega = torch.sin(torch.as_tensor(omega))
-
-        x = x1 * cos_omega + y1 * sin_omega
-        y = -x1 * sin_omega + y1 * cos_omega
-    else:
-        x = x1
-        y = y1
+    cos_omega = np.cos(omega)
+    sin_omega = np.sin(omega)
+
+    x = x1 * cos_omega + y1 * sin_omega
+    y = -x1 * sin_omega + y1 * cos_omega
 
     return x, y

From e33a7eec6f9d450b5beb0316ac22562b95bfc7b8 Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 22:23:29 +0000
Subject: [PATCH 25/26] updated changelog.

---
 docs/changelog.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
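
Reviewer note: the changelog entry below recommends `pixel_mapping=lambda x : x` when negative pixels must survive. The sketch illustrates why with plain Python lists. It assumes the default mapping is softplus-like, which this patch does not state, and `apply_mapping` is a hypothetical helper, not an MPoL function.

```python
import math


def softplus(value):
    """Smooth positive mapping, log(1 + exp(value)); assumed default-like behaviour."""
    return math.log1p(math.exp(value))


def identity(value):
    """One-to-one mapping, equivalent to ``lambda x: x``."""
    return value


def apply_mapping(base_values, pixel_mapping):
    """Hypothetical helper: map raw parameter values to pixel values."""
    return [pixel_mapping(v) for v in base_values]


base_values = [-2.0, 0.0, 3.0]
positive_pixels = apply_mapping(base_values, softplus)  # every entry is > 0
raw_pixels = apply_mapping(base_values, identity)  # negatives pass through
```

A positive mapping suits sky-brightness parameters, while the identity mapping suits fixtures that load a known image unchanged, as the updated tests in this series do.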

diff --git a/docs/changelog.md b/docs/changelog.md
index f4229f85..4d360b71 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -3,8 +3,8 @@
 # Changelog
 
 ## v0.2.1
-
-- *Placeholder* Planned changes described by Architecture GitHub Project.
+- Made some progress converting docstrings from "Google" style format to "NumPy" style format. Ian is now convinced that NumPy style format is more readable for the type of docstrings we write in MPoL. We usually require long type definitions and long argument descriptions, and the extra indentation required for Google makes these very scrunched.
+- Made the `passthrough` behaviour of :class:`mpol.images.ImageCube` the default and removed the parameter entirely. Previously, it was possible to have :class:`mpol.images.ImageCube` act as a layer with `nn.Parameter`s. This functionality has effectively been replaced since the introduction of :class:`mpol.images.BaseCube`, which provides a more useful way to parameterize pixel values. If a one-to-one mapping (including negative pixels) from `nn.Parameter`s to the output tensor is desired, then one can specify `pixel_mapping=lambda x : x` when instantiating :class:`mpol.images.BaseCube`. More details in [#246](https://github.com/MPoL-dev/MPoL/issues/246).
 - Removed convenience classmethods `from_image_properties` from across the code base. From [#233](https://github.com/MPoL-dev/MPoL/issues/233). The recommended workflow is to create a :class:`mpol.coordinates.GridCoords` object and pass that to instantiate these objects as needed, rather than passing `cell_size` and `npix` separately. For nearly all but trivially short workflows, this simplifies the number of variables the user needs to keep track and pass around revealing the central role of the :class:`mpol.coordinates.GridCoords` object and its useful attributes for image extent, visibility extent, etc. Most importantly, this significantly reduces the size of the codebase and the burden to maintain, test, and document multiple entry points to key `nn.modules`. We removed `from_image_properties` from
   - :class:`mpol.datasets.GriddedDataset`
   - :class:`mpol.datasets.Dartboard` 

From 10cb760464ffe8325d21c4ba30a5360524f821ea Mon Sep 17 00:00:00 2001
From: Ian Czekala <iancze@gmail.com>
Date: Wed, 27 Dec 2023 22:47:50 +0000
Subject: [PATCH 26/26] test on 3.10 and 3.11, min 3.8 for install.

---
 .github/workflows/pre-release.yml | 2 +-
 pyproject.toml                    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/pre-release.yml b/.github/workflows/pre-release.yml
index ed50359a..41185d95 100644
--- a/.github/workflows/pre-release.yml
+++ b/.github/workflows/pre-release.yml
@@ -13,7 +13,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: 3.8
+          python-version: "3.10"
       - name: Install package deps
         run: |
           pip install .[dev]
diff --git a/pyproject.toml b/pyproject.toml
index 9b2ba7c8..9a1b6097 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -23,7 +23,7 @@ description = "Regularized Maximum Likelihood Imaging for Radio Astronomy"
 dynamic = ["version"]
 name = "MPoL"
 readme = "README.md"
-requires-python = ">=3.10"
+requires-python = ">=3.8"
 
 [project.optional-dependencies]
 dev = [