
Combined statistical-dynamical downscaling of wind components over the British Isles and surrounding waters

This file provides the installation and run details needed to perform multivariate linear regression, ridge regression, lasso regression, and bilinear interpolation on low-resolution ERA5 reanalysis data, in order to generate high-resolution data.

Task

The aim of this project was to use statistical learning methods to downscale wind components from 30x30 km resolution to 3x3 km resolution; see the plot below. The left panel shows the low-resolution wind data taken from the ERA5 land reanalysis; the right panel shows the data at higher resolution, produced by a mesoscale atmospheric model.

Results

Below is a comparison of the linear regression method against the commonly used baseline, bilinear interpolation. A clear improvement over the baseline can be seen, especially in regions of complex terrain, highlighting the ability of the MLR method to accurately capture the topographic features of the ground truth data.

Installation

To set up the environment, run the following command with Anaconda:

conda env create -f environment.yml

The environment is quite large, so this may take a few minutes.
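Once created, activate the environment before running any of the scripts. The environment name is whatever is defined in environment.yml; wind-downscaling below is just a placeholder:

conda activate wind-downscaling  # replace with the name from environment.yml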

Data acquisition

To run the models you'll need to download the ERA5 data. This can be done using the scripts inside ERA_data_source, but first you'll need to make an account at this link.

Once you have a login, you'll need to install the CDS API key; see this link for more details.

Now you're all set up to make requests to download ERA5 data. To get the data used for this study, just run the .py files inside ERA_data_source. Each of these downloads will take several hours.
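For reference, a minimal CDS API request looks roughly like the sketch below. The dataset name, variables, area, dates, and output filename here are illustrative assumptions, not the exact requests made by the scripts in ERA_data_source:

# Minimal sketch of a CDS API request for ERA5 100 m wind components.
import cdsapi

client = cdsapi.Client()  # reads the API key from ~/.cdsapirc

client.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "reanalysis",
        "variable": ["100m_u_component_of_wind", "100m_v_component_of_wind"],
        "year": "2001",
        "month": "01",
        "day": "01",
        "time": ["00:00", "06:00", "12:00", "18:00"],
        "area": [61, -11, 49, 2],  # N, W, S, E -- roughly the British Isles
        "format": "netcdf",
    },
    "era5_wind_sample.nc",
)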

Running the interpolation model

All you need to do is run inter_day.py; this will generate 4 years of interpolated data for the 4 regions considered in the study.

It's currently set to interpolate the U component, but this can be changed to V by changing the parameters passed to get_period_interp_region. You need to change the following (a sketch of the bilinear baseline itself is shown after this list):

  • component_str = 'u100' to component_str = 'v100'
  • hr_component_str = 'U' to hr_component_str = 'V'
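For context, the bilinear baseline can be reproduced with standard tooling. The sketch below is a minimal stand-alone example using scipy, not the actual implementation in inter_day.py; the grid extents, spacings, and variable names are assumptions:

# Bilinear interpolation of a coarse wind field onto a finer grid.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Low-resolution grid (~30 km spacing) with a dummy field standing in for u100.
lat_lr = np.linspace(49.0, 61.0, 40)
lon_lr = np.linspace(-11.0, 2.0, 44)
u_lr = np.random.rand(lat_lr.size, lon_lr.size)

# Bilinear interpolator over the coarse grid.
interp = RegularGridInterpolator((lat_lr, lon_lr), u_lr, method="linear")

# High-resolution target grid (~3 km spacing).
lat_hr = np.linspace(49.0, 61.0, 400)
lon_hr = np.linspace(-11.0, 2.0, 440)
lat_g, lon_g = np.meshgrid(lat_hr, lon_hr, indexing="ij")
points = np.stack([lat_g.ravel(), lon_g.ravel()], axis=-1)
u_hr = interp(points).reshape(lat_g.shape)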

However, you'll have to email me if you want to test the interpolation model, because the required ground truth data is too large to include (>10GB).

Running the MLR model

This can be done with eff_lin_regress.py. There are many model parameters, outlined below; you can change these in the script itself, for example to run lasso or ridge regression. The code trains and validates the models and writes the final results (for the years of data you specify) to a file, whose name you can also specify. It also outputs spatial and temporal error plots to new directories created when you run the file. A minimal sketch of how the penalty choice maps onto standard regression models is shown after the parameter list.

#here we define which met variables to use and their associated component strings

  • component_files = [u100_file,v100_file,vort_file,temp_file,pres_file]
  • component_strs = ['u100','v100','vo','t','sp']

#define the target variable string

  • target_str = 'U' ## can be this or 'V'

#define the penalty - None implies unregularised MLR, other options are 'L1' or 'L2'

  • model_penalty = None

#define the locations to perform analysis.

  • locs = ['fort_augustus'] # I've only provided enough data for this region; you'll have to contact me if you want the ground truth for the other regions.

#define training year

  • train_year = (1,5) #this means years 2001,2002,2003,2004 for training

#define validation years

  • validation_years = [(6,7),(7,8)] # this means 2006-2008
  • val_year = (validation_years[0][0],validation_years[-1][1])

#define number of previous time points

  • taus = [0]

#define number of nearest neighbours

  • nn = 4

#define region size

  • region_size = (50,50)

#define whether we should track the weights and save them

  • track_coeffs = False

#define if we should optimise and validate our choice of lambda

  • val_lambda = False
  • opt_lambda = False

#partition defines the number of vars you want in the subset.

  • partition = None #Only define this if you want to do subset analysis with the most important variables. This must be an int.
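To make the penalty options concrete, here is a minimal sketch of how None / 'L1' / 'L2' map onto unregularised MLR, lasso, and ridge regression using scikit-learn. This illustrates the technique only, not the code in eff_lin_regress.py; the feature matrix X and target y below are dummy stand-ins for the predictors built from the low-resolution fields:

import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

def make_model(model_penalty, lam=1.0):
    # None -> plain multivariate linear regression,
    # 'L1' -> lasso, 'L2' -> ridge (lam is the regularisation strength).
    if model_penalty is None:
        return LinearRegression()
    if model_penalty == "L1":
        return Lasso(alpha=lam)
    if model_penalty == "L2":
        return Ridge(alpha=lam)
    raise ValueError(f"unknown penalty: {model_penalty}")

# Dummy data: nn=4 nearest neighbours x 5 met variables per sample,
# target component at one high-resolution grid point.
X = np.random.rand(1000, 4 * 5)
y = np.random.rand(1000)

model = make_model(None)  # model_penalty as defined above
model.fit(X, y)
y_pred = model.predict(X)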

Notes

There are various other .py files which are mainly utility functions. I've omitted many of the .py files I used for analysing model results etc. If you would like these, please get in touch.

Authors

  • Ian Goddard
