Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split Gridder averaging and imaging capabilities into different classes #154

Closed
iancze opened this issue Feb 18, 2023 · 1 comment
Closed
Assignees

Comments

@iancze
Copy link
Collaborator

iancze commented Feb 18, 2023

This issue was originally raised in #126 but is now split into it's own, more focused issue.

... I think major redesigns to GriddedDataset and Gridder are necessary to address the Hermitian pairs issue raised in #100. To solve the Hermitian pairs issue, I think it makes sense to break apart the gridding->imaging and gridding->GriddedDataset capabilities currently contained in Gridder. Although these gridding functionalities are similar, I think trying to combine them all in a single class actually makes things more complicated than they need to be.

Originally, I wrote the Gridder class to mimic the "gridding" of individual visibility data points to a UV grid so that they could be imaged using the inverse FFT, creating the dirty image. This is very useful for visualization and debugging and I propose that we keep something like this (perhaps with a more descriptive name, that somehow indicates its ability to produce dirty images). As far as I can currently tell, all of the code pertaining to the gridding->imaging functionalities is straight numpy.

It turns out that gridding visibilities using uniform weighting is equivalent to taking a cell-average of the visibility value. You can do a similar operation to the visibility weights. Both of these are needed for gridding->GriddedDataset. This way you can take, say, ~100 individual visibility measurements (and their weights) that fall in a single cell, perform a weighted average, and then equivalently represent the same measurement by a single average visibility value and a new (higher) equivalent weight value. Then, any computation using these visibilities goes much faster. All of this code could be straight numpy too, with the exception of export to a Pytorch GriddedDataset.

Since the gridding operation itself is kind of tedious and involves a lot of cell indexing and hacking of the np.histogram routine, it seemed like it made sense to combine both of these gridding routines into a single pass. This leads to some weird branching of what kind of "gridding" to use where, like in to_pytorch_dset().

But the biggest drawback of combining these two use cases in a single object (which I've only realized now that we've flushed out #100), is that it means our gridded dataset used in RML and other forward modeling contains Hermitian pairs (when it shouldn't). It's not a big deal for any routine that is just interested in the MLE or MAP estimates, but it will result in erroneous uncertainties on any routine that explores parameter uncertainties.

We should flesh this out a bit more, but my proposal for reorganization to the Gridder is to make two separate classes. The first will be purely for (uniform) averaging of visibilities to create a GriddedDataset. In its __init__ function it will not conjugate the visibilities to produce Hermitian pairs. This class will need to keep routines like _sum_cell_values_channel, _sum_cell_values_cube, _grid_weights, _check_scatter_error, _estimate_cell_standard_deviation, and a version of _grid_visibilities that only does uniform weighting.

Then, the second class is for imaging. I think it could reasonably inherit from the first class. It would need to modify __init__ such that visibilities are conjugated to produce Hermitian pairs, and it would introduce all of the routines for gridding with Briggs weighting, calculating beam areas, etc.

I think this would really help simplify this area of the codebase and prevent many errors pertaining to inference.

@iancze
Copy link
Collaborator Author

iancze commented Feb 21, 2023

Closed by #156

@iancze iancze closed this as completed Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant