Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: LDO use filled data #540

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

e-koch
Copy link
Contributor

@e-koch e-koch commented Mar 5, 2019

@keflavich -- We have some odd behaviour in the LDO classes from inheriting from u.Quantity and the mix-in classes that used to only be for cubes. For example, the cubes will use cube.filled_data[:] for most operations and a Projection uses proj.value.

Other operations like changing the fill_value were also failing (and weren't being tested) because the mask was always being initialized to all be True. With these changes, the mask can now be updated but changing the fill value only changes filled_data and not the Quantity array.

A bunch of tests are still missing. Before adding them, I figured we should decide on how to handle these behaviours. One option would be to stop inheriting from Quantity and implement the numpy function backend used for the cubes. It seems like any solution here could get messy.

@keflavich
Copy link
Contributor

The numpy function backend is kind of awkward for non-cubes; it doesn't add much.

Can you highlight the problem with code snippet? Is it that the LDOs, when treated as quantities, are not np-maskedarray-like objects? That's probably how they should be treated.

Worse case, we might need to replace the current functionality (cube[:,0,0] is an array) with something more awkward (cube[0,:,:].quantity is an array with units, cube[:,0,0] is an array without units).

@e-koch
Copy link
Contributor Author

e-koch commented Mar 5, 2019

For a projection with nans:

proj_fill = proj.with_fill_value(0.)
np.isnan(proj_fill).any()

will return True because proj_fill.value is not changed. Only proj_fill.filled_data[:] will contain the fill value:

np.isnan(proj_fill.filled_data[:]),any()  # False

So the numpy functions will default to the Quantity array, which is not affected by our masking routines. proj.filled_data[:] should be used throughout to have the same behaviour as the spectral-cube operations.

We would need to update the .value in the LDO when the mask is applied or is changed. Maybe this isn't too bad if we can utilize the np-maskedarray operations and tie that into the SC mask classes?

@keflavich
Copy link
Contributor

Yeah... this is tricky. I think we do want to utilize the np masked array tools, which involves (afaik) setting the .mask attribute, but it also means that fill_value is only going to be meaningful for a limited subset of operations and perhaps should not be a global LDO attribute. fill_value is useful for io, probably nothing else?

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.02%) to 77.861% when pulling e88fcac on e-koch:ldo_use_filled_data into 7523dc8 on radio-astro-tools:master.

@e-koch
Copy link
Contributor Author

e-koch commented Mar 5, 2019

Alright, I'll start exploring that. It may be possible to just not use the cube mask classes and assume we won't need things like lazy masking for LDOs (seems reasonable to me).

I think it is useful for a number of operations (I need it for convolving heavily masked maps). It's probably more important to have the same behaviour as the cube operations, though.

@keflavich
Copy link
Contributor

Right, it's probably important to determine which cases care about the distinction between masked-value-is-nan, masked-value-is-np-masked, masked-value-is-number, and masked-value-isn't-masked (the last of which we want to avoid). Convolution should be happy with either of the first two.

@keflavich
Copy link
Contributor

Can we continue outlining how this will work in practice? I think the most important thing about this PR is documentation.

Some examples:

>>> cube, data = cube_and_raw('adv.fits')
>>> cube.with_mask(cube>0.5*u.K)[:,:,0]
<Slice [[0.54671028, 0.96958463, 0.93949894],
        [0.59789998,        nan,        nan],
        [       nan, 0.82873751,        nan],
        [       nan,        nan, 0.77224477]] K>
>>> cube.with_mask(cube>0.5*u.K).with_fill_value(-1)[:,:,0]
<Slice [[ 0.54671028,  0.96958463,  0.93949894],
        [ 0.59789998, -1.        , -1.        ],
        [-1.        ,  0.82873751, -1.        ],
        [-1.        , -1.        ,  0.77224477]] K>
>>> slc = cube.with_mask(cube>0.5*u.K)[:,:,0]
>>> slc
<Slice [[0.54671028, 0.96958463, 0.93949894],
        [0.59789998,        nan,        nan],
        [       nan, 0.82873751,        nan],
        [       nan,        nan, 0.77224477]] K>
>>> slc.with_fill_value(0)
<Slice [[0.54671028, 0.96958463, 0.93949894],
        [0.59789998,        nan,        nan],
        [       nan, 0.82873751,        nan],
        [       nan,        nan, 0.77224477]] K>

The last bit represents the behavior that we want to change; with_fill_value should change those nans to -1s, but that's not obviously possible.

Similarly, we presently have no slc.unmasked_data view, but that's probably what we want as a default. Maybe we can have the repr always access slc.filled_data[:]? Then we need to be careful about how we treat .quantity and .value: those should, imo, default to slc.filled_data[:], but perhaps be accompanied by a deprecation warning saying "you should either use slc.filled_quantity or slc.unmasked_quantity or the value equivalents. How does that sound? What cases am I missing here?

@keflavich
Copy link
Contributor

A nice thing with the above approach is that arithmetic operations should in general return the same data with preserved masks, e.g. slc * 2 is a valid operation that returns a slice. But I'm sure this introduces interesting corners...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants