In my IPCC analyses I had to remove many simulations (model-variable, model-scenario-ensemble, or other combinations). Even for the cleaned 'new-generation' repos, mesmer has to remove some of the simulations. For IPCC I did that in the data processing loop, but I think it would be better done in the 'find the simulations' part (i.e. in filefinder). So we should add an ExcludeFilter (better names always welcome). We'd need to think about how the metadata for the excluded simulations is passed.
For IPCC I have a function which identifies matching metadata:
But maybe we could also use pandas machinery, e.g. isin():
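# NOTE: untested
import pandas as pd

conditions = [
    # remove AWI ocean data: has an unstructured grid
    {
        "table": ["Oday", "Ofx", "Omon", "SIday", "SImon"],
        "model": ["AWI-CM-1-1-MR", "AWI-ESM-1-1-LR"],
    },
    # tasmax and tasmin are wrong for CESM
    {
        "table": ["day"],  # isin() needs a list-like, not a bare string
        "varn": ["tasmax", "tasmin"],
        "model": ["CESM2", "CESM2-WACCM"],
    },
    # ...
]

# keep every row by default; drop a row if it matches ALL keys of any condition
to_keep = pd.Series(True, index=df.index)
for condition in conditions:
    matches = pd.Series(True, index=df.index)
    for key, values in condition.items():
        matches &= df[key].isin(values)
    to_keep &= ~matches

# boolean masks go through .loc (or df[to_keep]), not .iloc
df = df.loc[to_keep]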
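To make the idea concrete, here is a minimal sketch of the same logic wrapped in a helper and run on a toy table. The name `exclude_matching` and the example data frame are made up for illustration; nothing here is existing filefinder API, and the real ExcludeFilter might look quite different.

```python
import pandas as pd


def exclude_matching(df, conditions):
    """Drop rows of df that match *all* keys of any one condition.

    Hypothetical helper for illustration, not part of filefinder.
    """
    to_keep = pd.Series(True, index=df.index)
    for condition in conditions:
        matches = pd.Series(True, index=df.index)
        for key, values in condition.items():
            # allow a single string as shorthand for a one-element list
            if isinstance(values, str):
                values = [values]
            matches &= df[key].isin(values)
        to_keep &= ~matches
    return df.loc[to_keep]


# toy metadata table, invented for this example
df = pd.DataFrame(
    {
        "model": ["CESM2", "CESM2", "AWI-CM-1-1-MR", "AWI-CM-1-1-MR"],
        "table": ["day", "day", "Omon", "Amon"],
        "varn": ["tasmax", "tas", "tos", "tas"],
    }
)

conditions = [
    # ocean table of the AWI model
    {"table": ["Omon"], "model": ["AWI-CM-1-1-MR"]},
    # daily tasmax/tasmin of CESM2
    {"table": "day", "varn": ["tasmax", "tasmin"], "model": ["CESM2"]},
]

kept = exclude_matching(df, conditions)
print(kept)
```

Passing the conditions as a plain list of dicts keeps the excluded-simulation metadata declarative, so it could live next to the other search criteria. One caveat of this sketch: an empty condition dict would match (and drop) every row, so a real implementation would probably want to reject it.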