Add and validate GEV preprocessing step #31

Open
perrette opened this issue Oct 14, 2024 · 6 comments
Labels
preprocessing Processing data to feed into rime(X) rimeX

Comments

@perrette
Collaborator

After the NGFS workshop preparation call, and preliminary discussion with @NiklasSchwind and Carl, we'll want to do GEV fitting inside the 21-year window instead of calculating the climatic average. This would require some validation and possible variations such as detrending the data inside the window, to avoid artificially distorting the GEV distribution.
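A minimal sketch of what fitting a GEV inside one 21-year window could look like, assuming annual maxima as input and using `scipy.stats.genextreme`. The function name `fit_gev_in_window`, the synthetic data, and the parameter values are illustrative assumptions, not rimeX code. Note that scipy's shape parameter `c` has the opposite sign to the shape `xi` commonly used in the climate literature.

```python
import numpy as np
from scipy.stats import genextreme


def fit_gev_in_window(annual_max, years, center_year, half_width=10):
    """Fit a GEV to the values in a (2 * half_width + 1)-year window (hypothetical helper)."""
    mask = np.abs(years - center_year) <= half_width
    sample = annual_max[mask]
    # scipy's shape parameter c equals -xi in the usual climate convention
    c, loc, scale = genextreme.fit(sample)
    return {"n": int(mask.sum()), "xi": -c, "loc": loc, "scale": scale}


# Synthetic annual maxima standing in for model output (assumption)
rng = np.random.default_rng(0)
years = np.arange(1950, 2101)
annual_max = genextreme.rvs(-0.1, loc=50.0, scale=8.0, size=years.size, random_state=rng)

params = fit_gev_in_window(annual_max, years, center_year=2000)
print(params)
```

With only 21 values per window the fitted shape parameter is typically quite uncertain, which is one reason the validation mentioned above matters.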

@perrette perrette added rimeX preprocessing Processing data to feed into rime(X) labels Oct 14, 2024
@perrette perrette self-assigned this Oct 14, 2024
@NiklasSchwind
Collaborator

Good idea! However, as the GEV median and the mean won't exactly match, and this introduces another assumption, I would keep it as an optional preprocessing step :)
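To illustrate one reading of this point (that the median and the mean of a skewed GEV differ), here is a small sketch using the standard closed-form expressions. The parameter values are arbitrary assumptions, and `xi` follows the convention where `xi > 0` means a heavy upper tail.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gev_mean(mu, sigma, xi):
    """Mean of GEV(mu, sigma, xi); only finite for xi < 1."""
    if xi >= 1:
        raise ValueError("the GEV mean is not finite for xi >= 1")
    if xi == 0:
        return mu + sigma * EULER_GAMMA  # Gumbel limit
    return mu + sigma * (math.gamma(1.0 - xi) - 1.0) / xi


def gev_median(mu, sigma, xi):
    """Median of GEV(mu, sigma, xi)."""
    if xi == 0:
        return mu - sigma * math.log(math.log(2.0))  # Gumbel limit
    return mu + sigma * (math.log(2.0) ** (-xi) - 1.0) / xi


mean = gev_mean(50.0, 8.0, 0.1)
median = gev_median(50.0, 8.0, 0.1)
print(f"mean={mean:.2f}, median={median:.2f}, difference={mean - median:.2f}")
```

For these illustrative parameters the mean sits about 2.5 units above the median, so an emulator trained on one is not interchangeable with one trained on the other.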

@perrette
Collaborator Author

perrette commented Oct 14, 2024

Does it come to mind because of the importance of skewness in GEV fitting? On the other hand, precisely because of its importance, we might want to correct for it. Anyway, we can do sensitivity tests, see how strong the impact is, and decide then what the default should be. On a more general note, the processing of climate model data is not an exact science...

@perrette
Collaborator Author

But I do agree detrending comes with tradeoffs, and it needs careful assessment of whether the cure is better than the ill. E.g. it is probably not a good idea to detrend selectively on 21 years' worth of local data. If anything, we'd need to model the trend in a robust manner that does not add to the variability. A reasonably elegant (because self-consistent) idea would be to use our model for the 21-year mean, perhaps (but not necessarily) calibrated on a per-model basis, to remove the mean.

@NiklasSchwind
Collaborator

NiklasSchwind commented Oct 15, 2024

I wouldn't make it the default, as I think that would limit the applicability of the emulator to GEV-distributed extreme indicators. E.g. rx1day follows a GEV distribution, but tas is probably normally distributed, and flood depth probably has a completely different underlying distribution. So using the GEV by default would limit the applicability of our emulator to rx1day in this case, while keeping it optional adds to the applicability.

I wouldn't even count the GEV fit/mean as part of the emulator per se, but as part of the definition of the underlying indicator. (So we emulate indicators like e.g. "1-in-20 year event of rx1day" or "21-year-mean precipitation".)
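As a sketch of what an indicator like "1-in-20 year event" means once GEV parameters are available: the T-year return level is the value exceeded with probability 1/T in any given year. The function name and parameter values below are illustrative assumptions, with `xi > 0` denoting a heavy upper tail.

```python
import math


def gev_return_level(T, mu, sigma, xi):
    """Value exceeded on average once every T years under GEV(mu, sigma, xi)."""
    if T <= 1:
        raise ValueError("the return period T must be greater than 1 year")
    y = -math.log(1.0 - 1.0 / T)  # reduced variate
    if xi == 0:
        return mu - sigma * math.log(y)  # Gumbel limit
    return mu + sigma * (y ** (-xi) - 1.0) / xi


level_20 = gev_return_level(20, mu=50.0, sigma=8.0, xi=0.1)
level_100 = gev_return_level(100, mu=50.0, sigma=8.0, xi=0.1)
print(f"1-in-20: {level_20:.2f}, 1-in-100: {level_100:.2f}")
```

Seen this way, the return period is part of the indicator definition, and the emulator only sees the resulting value.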

@NiklasSchwind
Collaborator

Note about an idea by Carl (will elaborate further tomorrow):
Instead of fitting one GEV on every 20-year window, one could try to fit, for each simulation, a (linear?) function that predicts the parameters of the GEV distribution from GMT.
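One possible reading of this idea is a non-stationary GEV fitted by maximum likelihood over the whole simulation. The sketch below assumes the location is linear in GMT, the log of the scale is linear in GMT (a log link, chosen here only to keep the scale positive), and the shape is constant. None of these choices come from the discussion itself; the data are synthetic and the names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme


def neg_log_lik(theta, x, gmt):
    """Negative log-likelihood of a GEV with GMT-dependent location and scale."""
    mu0, mu1, log_s0, s1, xi = theta
    loc = mu0 + mu1 * gmt
    scale = np.exp(log_s0 + s1 * gmt)  # log link keeps the scale positive (assumption)
    # scipy parameterises the shape as c = -xi
    logpdf = genextreme.logpdf(x, -xi, loc=loc, scale=scale)
    if not np.all(np.isfinite(logpdf)):
        return 1e10  # penalise parameters that put data outside the support
    return -np.sum(logpdf)


# Synthetic stand-in for one simulation: location rises by 5 units per degree of GMT
rng = np.random.default_rng(1)
gmt = np.linspace(0.0, 4.0, 150)
x = genextreme.rvs(-0.1, loc=50.0 + 5.0 * gmt, scale=8.0, size=gmt.size, random_state=rng)

# Start from an ordinary least-squares line so the optimiser begins near the data
slope, intercept = np.polyfit(gmt, x, 1)
resid_std = np.std(x - (intercept + slope * gmt))
theta0 = np.array([intercept, slope, np.log(resid_std), 0.0, 0.1])

result = minimize(neg_log_lik, theta0, args=(x, gmt), method="Nelder-Mead")
mu0, mu1, log_s0, s1, xi = result.x
print(f"location trend: {mu1:.2f} per degree, shape xi: {xi:.2f}")
```

Compared with window-by-window fits, this uses all years of a simulation at once, at the cost of assuming a functional form for how the parameters depend on GMT.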

@perrette
Collaborator Author

perrette commented Oct 15, 2024

Here I am only talking about those variables that we assume follow a GEV distribution, like 1-in-X year events.
The reason I propose dedicated processing for these variables, instead of treating them as regular variables, is that they are already a statistic over time. There would be no justification for taking another 21-year "climatological" mean of these variables, because they are already a climatological indicator. The question remains whether they need detrending, and whether that should be done by default. I tend to think they do, and I stand by the suggestion above: we could use our classical emulator for the "mean" climatological variable as the trend, and do the GEV fitting on the daily variable minus the trend (I'd probably use a smoothed GMT value to compute the trend).

I don't see exactly what you find problematic here, nor the difference with predicting the parameters of the GEV. To predict the parameters of the GEV, you need to fit the GEV first, right? So, according to my understanding, in every 21-year period you'd do a GEV fit on the (probably detrended) time series and derive the 1-in-X year event values, thus obtaining a new time series for each of the desired return periods. Later on, this would not be fundamentally different from the other indicators, except that you wouldn't take the 21-year mean of these values; you'd just use them directly. We can discuss that and other ideas on a call perhaps, also with Carl if needed.
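The workflow described in this comment could be sketched roughly as follows. Several things here are assumptions made for illustration only: the trend is a plain linear regression on smoothed GMT (a stand-in for the emulator of the mean), the input is a series of annual maxima, the trend value at the window centre is added back to the return levels, and windows are evaluated every 10 years to keep the example fast. `rolling_return_levels` is a hypothetical name.

```python
import numpy as np
from scipy.stats import genextreme


def rolling_return_levels(x, gmt_smooth, years, return_periods=(10, 20, 50),
                          half_width=10, step=10):
    """Detrend, fit a GEV per window, and return one series per return period."""
    # Stand-in trend model (assumption): linear regression of x on smoothed GMT
    slope, intercept = np.polyfit(gmt_smooth, x, 1)
    trend = intercept + slope * gmt_smooth
    anomalies = x - trend

    centers = years[half_width:len(years) - half_width:step]
    probs = 1.0 / np.asarray(return_periods, dtype=float)
    levels = np.empty((len(centers), len(return_periods)))
    for i, center in enumerate(centers):
        mask = np.abs(years - center) <= half_width
        c, loc, scale = genextreme.fit(anomalies[mask])
        # isf(1/T) is the value exceeded with probability 1/T in any given year
        anomaly_levels = genextreme.isf(probs, c, loc=loc, scale=scale)
        # Add back the trend at the window centre (assumption)
        levels[i] = anomaly_levels + trend[years == center][0]
    return centers, levels


# Synthetic stand-in data
rng = np.random.default_rng(2)
years = np.arange(1950, 2101)
gmt_smooth = np.linspace(0.5, 4.0, years.size)
x = genextreme.rvs(-0.1, loc=50.0 + 5.0 * gmt_smooth, scale=8.0,
                   size=years.size, random_state=rng)

centers, levels = rolling_return_levels(x, gmt_smooth, years)
print(centers)
print(levels.round(1))
```

Each column of `levels` is then a time series for one return period that can be used directly, without taking a further 21-year mean.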
