Make it possible to filter out all NaN values #65

sferics · 2023-08-09T17:30:56Z

Is your feature request related to a problem? Please describe.

I tried to use the "filters" argument of the read_bufr function to filter out NaN values.
My filter was a very simple lambda function: filter = lambda x: pandas.notna(x)

When I used it to get rid of the missing data of a single parameter, it worked fine. But when I applied it to many parameters at once, the returned pandas DataFrame shrank and no longer contained the desired data, or was even empty.

I suspect that this is due to the nature of the filter conditions. In the documentation, you mention that they are connected with logical AND: https://pdbufr.readthedocs.io/en/latest/read_bufr.html#combining-conditions

The problem for me is that without filtering I get quite a large DataFrame with many missing values, which I have to get rid of afterwards. I've noticed that a lot of columns contain nothing but NaN values.

Describe the solution you'd like

It would be nice to have the option to connect conditions with logical OR instead. Maybe that could already solve my problem.
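To illustrate the difference, here is a minimal sketch in plain Python. It does not call pdbufr; it only models how a "not missing" condition per parameter behaves when the conditions are combined with AND versus OR. The records, the parameter names, and the helper not_missing are hypothetical and chosen for illustration only.

```python
import math


def not_missing(value):
    """Return True if value is neither None nor a float NaN (hypothetical helper)."""
    return value is not None and not (isinstance(value, float) and math.isnan(value))


nan = float("nan")

# Hypothetical decoded records; the parameter names are for illustration only.
records = [
    {"airTemperature": 280.1, "windSpeed": nan},
    {"airTemperature": nan, "windSpeed": 4.2},
    {"airTemperature": 281.3, "windSpeed": 3.0},
    {"airTemperature": nan, "windSpeed": nan},
]
keys = ["airTemperature", "windSpeed"]

# AND: a record survives only if every parameter is present.
kept_and = [r for r in records if all(not_missing(r[k]) for k in keys)]

# OR: a record survives if at least one parameter is present.
kept_or = [r for r in records if any(not_missing(r[k]) for k in keys)]

print(len(kept_and), len(kept_or))
```

With AND, every additional parameter can only remove more records, which would explain why the result shrinks or ends up empty as more parameters are filtered. With OR, only records that are missing everything are removed.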

Describe alternatives you've considered

Another solution I can imagine is an option that applies the equivalent of "df.loc[:, parameter].notna().any()" to each column (parameter) before the DataFrame is returned. If this condition returns False for a column, i.e., the column consists only of missing values, the column gets dropped.

Ideally, this would be done before the DataFrame is created internally.

Additional context

My solution for now is to call df.dropna(how="all") along both axes after I've created the DataFrame. But this is not a very efficient way to do it, especially for large amounts of data.
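For reference, a minimal sketch of that workaround using pandas only. The small DataFrame below is a hypothetical stand-in for the output of read_bufr, and the column names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by read_bufr.
df = pd.DataFrame(
    {
        "airTemperature": [280.1, np.nan, 281.3, np.nan],
        "windSpeed": [np.nan, 4.2, 3.0, np.nan],
        "dewpointTemperature": [np.nan, np.nan, np.nan, np.nan],
    }
)

# Drop rows in which every value is missing, then columns in which every value is missing.
cleaned = df.dropna(axis="index", how="all").dropna(axis="columns", how="all")

# The per-column check mentioned above: True where a column holds at least one value.
has_data = df.notna().any()

print(cleaned)
print(has_data)
```

Here the all-NaN row and the all-NaN column are removed, while rows that are only partially missing are kept. The cleanup still runs after the full DataFrame has been built, so it does not avoid the cost of decoding the empty columns in the first place.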

Organisation

Meteo Service weather research

sandorkertesz · 2023-08-09T17:52:31Z

Please see #58

sferics · 2023-08-09T18:05:35Z

Oh, thanks! I overlooked that... Yes, that is exactly what I meant. I would be really happy to see such a feature in this great piece of software in the future.
Keep up the good work! Best regards

sferics added the enhancement label (New feature or request) on Aug 9, 2023
