Correct Unexpected floats when reading LI L2 LFL #2998

Open
wants to merge 6 commits into base: main
13 changes: 4 additions & 9 deletions satpy/readers/li_base_nc.py
@@ -306,7 +306,6 @@ def get_projection_config(self):
"""Retrieve the projection configuration details."""
# We retrieve the projection variable name directly from our swath settings:
proj_var = self.swath_coordinates["projection"]

geos_proj = self.get_measured_variable(proj_var, fill_value=None)
# cast projection attributes to float/str:
major_axis = float(geos_proj.attrs["semi_major_axis"])
@@ -355,9 +354,9 @@ def generate_coords_from_scan_angles(self):
# Finally, we should store those arrays as internal variables for later retrieval as
# standard datasets:
self.internal_variables[lon_name] = xr.DataArray(
da.asarray(lon), dims=["y"], attrs={"standard_name": "longitude"})
da.asarray(lon), dims=["y"], attrs={"standard_name": "longitude"}).astype(np.float32)
self.internal_variables[lat_name] = xr.DataArray(
da.asarray(lat), dims=["y"], attrs={"standard_name": "latitude"})
da.asarray(lat), dims=["y"], attrs={"standard_name": "latitude"}).astype(np.float32)

def inverse_projection(self, azimuth, elevation, proj_dict):
"""Compute inverse projection."""
@@ -439,12 +438,11 @@ def get_measured_variable(self, var_paths, fill_value=np.nan):
# Also handle fill value here (but only if it is not None, so that we can still bypass this
# step if needed)
arr = self.apply_fill_value(arr, fill_value)

return arr

def apply_fill_value(self, arr, fill_value):
"""Apply fill values, unless it is None."""
if fill_value is not None:
"""Apply fill values, unless it is None and when _FillValue is provided in the array attributes."""
if fill_value is not None and arr.attrs.get("_FillValue") is not None:
if np.isnan(fill_value):
fill_value = np.float32(np.nan)
arr = arr.where(arr != arr.attrs.get("_FillValue"), fill_value)
@@ -597,9 +595,7 @@ def apply_use_rescaling(self, data_array, ds_info=None):
# TODO remove scaling_factor fallback after issue in NetCDF is fixed
scale_factor = attribs.setdefault("scale_factor", attribs.get("scaling_factor", 1))
add_offset = attribs.setdefault("add_offset", 0)

data_array = (data_array * scale_factor) + add_offset

# rescale the valid range accordingly
if "valid_range" in attribs.keys():
attribs["valid_range"] = attribs["valid_range"] * scale_factor + add_offset
@@ -742,7 +738,6 @@ def get_dataset(self, dataset_id, ds_info=None):
# Retrieve default infos if missing:
if ds_info is None:
ds_info = self.get_dataset_infos(dataset_id["name"])

# check for potential error:
if ds_info is None:
raise KeyError(f"No dataset registered for {dataset_id}")
4 changes: 1 addition & 3 deletions satpy/readers/li_l2_nc.py
@@ -158,11 +158,9 @@ def get_array_on_fci_grid(self, data_array: xr.DataArray):
data_2d = da.map_blocks(_np_add_at_wrapper, data_2d, (rows, cols), data_array,
dtype=data_array.dtype,
chunks=(LI_GRID_SHAPE[0], LI_GRID_SHAPE[1]))
data_2d = da.where(data_2d > 0, data_2d, np.nan)

data_2d = da.where(data_2d > 0, data_2d, np.nan).astype(np.float32)
Member

This would prevent one unnecessary upcasting:

Suggested change
data_2d = da.where(data_2d > 0, data_2d, np.nan).astype(np.float32)
data_2d = da.where(data_2d > 0, data_2d, np.float32(np.nan))

Contributor Author

@pnuu casting a NaN that is already a float32 is not enough to convert all the arrays to float32. For example, if data_2d is an int32, the where method will promote it to float64. To prevent that I have used astype(np.float32).
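
A minimal sketch (not part of the PR, arrays made up for illustration) of the promotion behaviour described above:

import numpy as np
import dask.array as da

# int32 data combined with a float32 NaN still promotes to float64:
data_2d = da.from_array(np.arange(6, dtype=np.int32).reshape(2, 3))
print(da.where(data_2d > 0, data_2d, np.float32(np.nan)).dtype)  # float64

# an explicit astype brings the final result back to float32 (via an intermediate float64):
print(da.where(data_2d > 0, data_2d, np.nan).astype(np.float32).dtype)  # float32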

Member

Ok, that makes sense I guess.

But for integer data the values would still be converted to floats, which seemed to be the original problem reported in #2854, right? So should there be separate handling for the integer data?

Member

For this specific case, handling the accumulated LI 2-d data arrays, we need floats since we need to support NaN values. So I'm ok with the solution here that avoids float64.

The problem in the original issue was that some integer variables that do not have a _FillValue attribute were still being cast unnecessarily to float; that problem is fixed by the other modification of this PR, in apply_fill_value: https://github.com/pytroll/satpy/pull/2998/files#diff-3b2bff08b4001ec6f72cca67791cc4322b38e0db97e68f2109791093e56e6052R445
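
A small sketch of that guarded behaviour (adapted from the apply_fill_value hunk above; the sample arrays are made up):

import numpy as np
import xarray as xr

def apply_fill_value(arr, fill_value):
    # Substitute only when a fill value is requested AND the variable declares _FillValue.
    if fill_value is not None and arr.attrs.get("_FillValue") is not None:
        if np.isnan(fill_value):
            fill_value = np.float32(np.nan)
        arr = arr.where(arr != arr.attrs.get("_FillValue"), fill_value)
    return arr

# An integer variable without a _FillValue attribute is now left untouched:
counts = xr.DataArray(np.array([1, 2, 3], dtype=np.int32))
print(apply_fill_value(counts, np.nan).dtype)  # int32, no cast to float

# A variable that does declare _FillValue still gets its fill values replaced:
vals = xr.DataArray(np.array([1, 2, 255], dtype=np.int32), attrs={"_FillValue": 255})
print(apply_fill_value(vals, np.nan).values)  # the fill value 255 becomes NaN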

Member

Oh, ok.

The current version will cast the data to float64 (when setting the 64-bit NaN) before recasting it to float32. So, to stay in the float32 world, I think the data_2d array should be converted to float32 before applying the NaN:

data_2d = data_2d.astype(np.float32)
data_2d = da.where(data_2d > 0, data_2d, np.nan)

Here the np.nan respects the array's float32 dtype and avoids an intermediate up-cast of the integer data.
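
A quick check of that ordering (a sketch with a made-up array, not part of the PR):

import numpy as np
import dask.array as da

data_2d = da.from_array(np.arange(6, dtype=np.int32).reshape(2, 3))

# cast once to float32, then mask; the Python-float NaN does not promote the result to float64
data_2d = data_2d.astype(np.float32)
data_2d = da.where(data_2d > 0, data_2d, np.nan)
print(data_2d.dtype)  # float32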

Contributor Author

Modification applied

xarr = xr.DataArray(da.asarray(data_2d, CHUNK_SIZE), dims=("y", "x"))
xarr.attrs = attrs

return xarr
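
For context, a user-level sketch of the intended effect; the reader name, dataset name, and file name below are illustrative assumptions, not taken from this PR:

from satpy import Scene

# Load an LI L2 LFL granule and check the dtype of a loaded dataset.
# After this change the data stays in float32 instead of being promoted to
# float64, and integer variables without a _FillValue attribute keep their
# integer dtype.
scn = Scene(filenames=["LI-2-LFL-example.nc"], reader="li_l2_nc")
scn.load(["radiance"])
print(scn["radiance"].dtype)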

