Outflows from Lake Ontario are regulated at the Moses-Saunders Dam, which is located downstream on the St. Lawrence River near Cornwall, Ontario and Massena, New York. The Moses-Saunders Dam spans the border of the United States and Canada and is jointly managed by the two countries through the International Lake Ontario - St. Lawrence River (LOSLR) Board, at the direction of the International Joint Commission. The current flow regulation plan for the LOSLR system is Plan 2014, the first control policy in the system to use forecasts to guide release decisions. This repository contains a framework to identify alternative forecast-informed reservoir operating (FIRO) policies using a simulation and optimization approach.
There are two major components to the workflow in this repository: the simulation and optimization of alternative outflow control policies, and the visualization and data analysis of optimized alternatives. This repository contains code to:
- Optimize control policies
- Simulate plan-prescribed outflows and water levels
- Assess policy performance for system objectives
- Explore results in an interactive dashboard
- Perform a time-varying sensitivity analysis (and other analyses) on policy alternatives
For more information on running the policy simulation and optimization framework, see Getting Started.
Disclaimer: While the Bv7 rule curve and flow limit functions have been verified using the Great Lakes Regulation and Routing Model, this repository is not intended to simulate official Plan 2014 prescribed outflows or the resulting water levels.
There are seven major components required to simulate and optimize alternative control policies, which are described in more detail below:
- Configuration File
- Optimization Algorithm
- Input Data File
- Release Function
- Flow Limit Function
- Routing Function
- Objective Functions
This repository makes a key assumption that a control policy is made up of two modules: a release function and flow limits (or operational adjustments). The release function calculates a preliminary flow, which is then checked against a series of flow limits and modified if it falls outside those limits. Flow limits are a major component of flow regulation in the Plan 2014 control policy (e.g., the I-limit, L-limit, and F-limit). See Annex B, Section B2 in the Plan 2014 Compendium Report for more details. The Bv7 ruleset is included in this repository; however, this repository does not contain code to simulate Board deviation decisions under the H14 criteria.
The optimization requires several hyperparameters, decision variables, and simulation modules. These fields are specified in a user-generated configuration file. Configuration files are written using the toml file format. A template and examples of a configuration file can be found here. The variables that must be specified in a configuration file are described below.
[experimentalDesign]
These parameters specify the input files and functions used to guide policy simulation. Each variable should be a `str`.
[experimentalDesign]
# file name of the release function (without .py extension)
releaseFunction = "ruleCurve" # type:str
# whether to impose the September Rule (R+) regime on the release function release ["on" or "off"]
septemberRule = "off" # type:str
# file name of the flow limit function (without .py extension)
limitType = "Bv7" # type:str
# file name of the routing function (without .py extension)
stlawRouting = "stlaw" # type:str
# folder name of the hydrologic trace that contains input data that is being optimized over
trace = "historic" # type:str
# path and file name of the input data that is being optimized over
inputFile = "1900_2020/12month_sqAR" # type:str
[optimizationParameters]
These are the parameters needed to run the many-objective evolutionary algorithm, Borg. Each variable should be an `int`.
[optimizationParameters]
# number of decision variables to optimize
numDV = 10 # type: int
# number of objectives
numObj = 7 # type: int
# number of constraints
numCon = 0 # type: int
# number of function evaluations
nfe = 200000 # type: int
# initial population size
popSize = 100 # type: int
# frequency of function evaluations to report metrics
metFreq = 100 # type: int
[decisionVariables]
These parameters specify information about the decision variables. Each variable type is specified below.
[decisionVariables]
# list of decision variables names - list of `str` of length of numDV
dvName = []
# list of lower bounds of decision variable ranges - list of `float` of length of numDV
lowerBounds = []
# list of upper bounds of decision variable ranges - list of `float` of length of numDV
upperBounds = []
# whether the decision variables are normalized before input to the simulation model ["True" or "False"]
normalized = ""
# if normalized is True, specify the normalization range
normalizedRange = [int, int]
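When `normalized` is `"True"`, the optimizer works with values in `normalizedRange`, which must be rescaled to each decision variable's physical bounds before simulation. A minimal sketch of that rescaling (the function name is illustrative, not part of the repository):

```python
def denormalize(value, normalizedRange, lowerBound, upperBound):
    # map a value from the optimizer's normalizedRange back onto the
    # decision variable's physical [lowerBound, upperBound] range
    loN, hiN = normalizedRange
    frac = (value - loN) / (hiN - loN)
    return lowerBound + frac * (upperBound - lowerBound)
```

For example, with `normalizedRange = [0, 1]` and bounds of 100 and 200, a normalized value of 0.5 maps to 150.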
[releaseFunction]
This section contains specific inputs needed for the user-specified release function. These inputs depend entirely on the release function specified in `experimentalDesign`.
[releaseFunction]
releaseFunctionVariable1 = ""
releaseFunctionVariable2 = ""
[performanceIndicators]
These parameters specify information about the performance indicators (i.e. objective functions). Each variable type is specified below.
[performanceIndicators]
# file name of the objective function formulation to use. should correspond to a directory in `objectiveFunctions/`
objectiveFunction = "" # type: str
# aggregate metric to return to optimization algorithm. `str`
metricWeighting = ""
# list of the script names of each of the PI models to run. files should be in the in `objectiveFunctions/[formulation]/functions/` directory - list of `str` of length numObj
objectiveModels = []
# list of performance indicator names - list of `str` of length numObj
piName = []
# list of thresholds of *meaningful* improvements/reductions in performance for each objective - list of `float` of length numObj
epsilonValue = []
# list of the direction of improvement for each objective - list of "min" or "max" of length numObj
direction = []
A many-objective evolutionary algorithm (MOEA) is used to optimize control policies for flow regulation. The optimization algorithm used in this repository is the Borg MOEA. Before any runs, you will need to download and compile Borg. A two-part tutorial on setup (with an example) is available here from the Reed Lab at Cornell University. Once you have compiled Borg, you can introduce new simulation and evaluation problems. You will need to move `borg.c`, `borg.py`, and `libborg.so` to the directory with your wrapper script.
Input hydrologic files are provided for the historic supply data from 1900 - 2020 (`input/historic/hydro`). The required inputs for policy simulation and optimization are described here.
Release function scripts should contain three functions: `formatDecisionVariables` to format decision variables, `getReleaseFunctionInputs` to extract the release function inputs from the dictionary of input data, and `releaseFunction` to prescribe a preliminary flow based on the inputs. Each function's inputs and outputs are described below.
formatDecisionVariables()
# format raw decision variables from optimization algorithm
def formatDecisionVariables(vars, **args):
# INPUTS
# vars: list of decision variable values returned from the Borg MOEA
# args: dict of optional release function inputs from the configuration file in "releaseFunction" section
# OUTPUTS
# pars: dict with key value pairs of decision variable names and values
# code to format decision variables values for `releaseFunction` ...
return pars
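As one illustration (not the repository's implementation), a release function whose decision variables map one-to-one to named parameters could format them by zipping the configured `dvName` list with the raw values from Borg:

```python
def formatDecisionVariables(vars, **args):
    # pair the dvName list from the [decisionVariables] section with the raw
    # values Borg returns, yielding a named parameter dictionary
    return dict(zip(args["dvName"], vars))

# hypothetical call with two decision variables
pars = formatDecisionVariables([0.5, 1200.0], dvName=["slope", "baseFlow"])
```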
getReleaseFunctionInputs()
# extracts timestep inputs for `releaseFunction`
def getReleaseFunctionInputs(data, t, **args):
# INPUTS
# data: dictionary of input time series from main simulation function
# keys are variable names and values are np.arrays of the time series of the variable values
# t: timestep being simulated in for loop
# args: dict of optional release function inputs from the configuration file in "releaseFunction" section
# OUTPUTS
# x: dictionary with named key value pairs of hydrologic inputs at the
# timestep of interest, calculated or formatted as needed
# code to extract, calculate, or format needed inputs for `releaseFunction`....
return x
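A minimal sketch of the input-extraction step, assuming the release function only needs the current level and a supply forecast (the variable names are placeholders, not the repository's actual keys):

```python
import numpy as np

def getReleaseFunctionInputs(data, t, **args):
    # pull the current timestep's value from each series the release
    # function needs; key names here are assumptions for this sketch
    return {
        "ontLevel": data["ontLevel"][t],
        "ntsForecast": data["ntsForecast"][t],
    }

# hypothetical two-timestep input dictionary
data = {
    "ontLevel": np.array([74.5, 74.8]),
    "ntsForecast": np.array([7000.0, 7200.0]),
}
x = getReleaseFunctionInputs(data, 1)
```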
releaseFunction()
# takes in output from formatDecisionVariables and getInputs, outputs release and flow regime
def releaseFunction(x, pars, **args):
# INPUTS
# x: dict that is output from `getReleaseFunctionInputs`
# pars: dict that is output from `formatDecisionVariables`
# args: dict of optional release function inputs from the configuration file in "releaseFunction" section
# OUTPUTS
# dictionary with named key value pairs:
# "rfFlow": prescribed outflow
# "rfRegime": regime that prescribed outflow follows
# "pprFlow": preproj flow or np.nan
# "rfOutput": output from release function (could be the same as ontFlow)
# code for release function here....
# return all the relevant outputs to save in dataframe
outputs = dict()
outputs["rfFlow"] = ontFlow
outputs["rfRegime"] = ontRegime
outputs["pprFlow"] = pprFlow # or np.nan
outputs["rfOutput"] = rfOutput
return outputs
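A toy release function following this interface might prescribe a flow linearly from the current level; every key and parameter name below is an assumption for this sketch, not the Plan 2014 rule curve:

```python
import numpy as np

def releaseFunction(x, pars, **args):
    # illustrative linear rule: the release deviates from a base flow in
    # proportion to the departure of the current level from a target level
    ontFlow = pars["baseFlow"] + pars["slope"] * (x["ontLevel"] - pars["targetLevel"])
    return {
        "rfFlow": ontFlow,
        "rfRegime": "RF",
        "pprFlow": np.nan,   # no preproject flow in this sketch
        "rfOutput": ontFlow,
    }

out = releaseFunction(
    {"ontLevel": 75.0},
    {"baseFlow": 7000.0, "slope": 1000.0, "targetLevel": 74.8},
)
```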
Examples for the Plan 2014 rule curve and the ANN policy approximator release functions can be found here. A blank template of a release function can be found here.
Flow limit function scripts should contain two functions: `getPlanLimitsInputs` to extract the limit function inputs from the dictionary of input data, and `planLimits` to check the preliminary flow against the flow limits and modify it if needed. Each function's inputs and outputs are described below.
getPlanLimitsInputs()
# extracts timestep inputs for `planLimits`
def getPlanLimitsInputs(data, t):
# INPUTS
# data: dictionary of input time series from main simulation function
# keys are variable names and values are np.arrays of the time series of
# the variable values
# t: timestep from simulation for loop
# OUTPUTS
# x: dictionary with named key value pairs of hydrologic inputs at the
# timestep of interest
# code to extract, calculate, or format needed inputs for `planLimits`....
return x
planLimits()
# function to check (and modify) preliminary flow from release function
def planLimits(
qm,
prelimLevel,
prelimFlow,
prelimRegime,
x,
septemberRule,
):
# INPUTS
# qm: quarter-month from simulation for loop
# prelimLevel: release function calculated preliminary water level
# prelimFlow: release function calculated preliminary flow
# prelimRegime: release function calculated preliminary regime
# x: dict that is output from `getPlanLimitsInputs`
# septemberRule: "off" or the septemberRule function
# OUTPUTS
# dictionary with named key value pairs
# "ontFlow": checked outflow (could be release function or a limit flow)
# "ontRegime": regime that outflow follows (could be "RF" or other)
# code to check against flow limits here....
return {"ontFlow": ontFlow, "ontRegime": ontRegime}
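A toy limit check following this interface could simply clamp the preliminary flow to a band; the `minFlow`/`maxFlow` keys and the regime codes below are assumptions for this sketch, not the Bv7 limits:

```python
def planLimits(qm, prelimLevel, prelimFlow, prelimRegime, x, septemberRule):
    # illustrative check: clamp the preliminary flow to [minFlow, maxFlow]
    # and relabel the regime when a limit governs
    ontFlow, ontRegime = prelimFlow, prelimRegime
    if ontFlow > x["maxFlow"]:
        ontFlow, ontRegime = x["maxFlow"], "MAX"
    elif ontFlow < x["minFlow"]:
        ontFlow, ontRegime = x["minFlow"], "MIN"
    return {"ontFlow": ontFlow, "ontRegime": ontRegime}
```

When the preliminary flow already falls inside the band, the release function's flow and regime pass through unchanged.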
Examples for the Bv7 and Phase 2 updated Bv7 flow limit functions can be found here. A blank template of a flow limit function can be found here.
Routing function scripts should contain two functions: `getStLawrenceRoutingInputs` to extract the routing function inputs from the dictionary of input data, and `stLawrenceRouting` to route the outflow through the system and determine water levels along Lake Ontario and the St. Lawrence River. Each function's inputs and outputs are described below.
getStLawrenceRoutingInputs()
# extracts timestep inputs for `stLawrenceRouting`
def getStLawrenceRoutingInputs(data, t):
# INPUTS
# data: dictionary of input time series from main simulation function
# keys are variable names and values are np.arrays of the time series of
# the variable values
# t: timestep from simulation for loop
# OUTPUTS
# x: dictionary with named key value pairs of hydrologic inputs at the
# timestep of interest
# code to extract, calculate, or format needed inputs for `stLawrenceRouting`....
return x
stLawrenceRouting()
# routes final outflow through system to calculate levels/flows along the St. Lawrence River
def stLawrenceRouting(ontLevel, ontFlow, x):
# INPUTS
# ontLevel: water level calculated with observed NTS and final release
# ontFlow: final outflow for timestep
# x: dictionary output from `getStLawrenceRoutingInputs`
# OUTPUTS
# dictionary with named key value pairs for relevant locations along the st. lawrence river (see below for all locations)
# code to calculate st. lawrence levels and flows ....
# save timestep in dictionary
levels = dict()
levels["stlouisFlow"] = stlouisFlow
levels["kingstonLevel"] = kingstonLevel_rounded
levels["alexbayLevel"] = alexbayLevel
levels["brockvilleLevel"] = brockvilleLevel
levels["ogdensburgLevel"] = ogdensburgLevel
levels["cardinalLevel"] = cardinalLevel
levels["iroquoishwLevel"] = iroquoishwLevel
levels["iroquoistwLevel"] = iroquoistwLevel
levels["morrisburgLevel"] = morrisburgLevel
levels["longsaultLevel"] = longsaultLevel
levels["saundershwLevel"] = saundershwLevel
levels["saunderstwLevel"] = saunderstwLevel
levels["cornwallLevel"] = cornwallLevel
levels["summerstownLevel"] = summerstownLevel
levels["lerybeauharnoisLevel"] = lerybeauharnoisLevel
levels["ptclaireLevel"] = ptclaireLevel
levels["jetty1Level"] = jetty1Level
levels["stlambertLevel"] = stlambertLevel
levels["varennesLevel"] = varennesLevel
levels["sorelLevel"] = sorelLevel
levels["lacstpierreLevel"] = lacstpierreLevel
levels["maskinongeLevel"] = maskinongeLevel
levels["troisrivieresLevel"] = troisrivieresLevel
levels["batiscanLevel"] = batiscanLevel
return levels
An example routing function using SLON flows can be found here. A blank template of a routing function can be found here.
Objective functions are simulated over the user-specified time period. Each objective is aggregated to a net annual average value, and that metric is returned to Borg to drive the optimization. More information on the objective function formulations and required inputs can be found here.
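The net-annual-average aggregation can be sketched as follows, assuming a quarter-monthly timestep (48 quarter-months per year); the function name and array shape are assumptions for this sketch:

```python
import numpy as np

def netAnnualAverage(piSeries, qmPerYear=48):
    # sum the quarter-monthly performance-indicator series within each year,
    # then average those annual totals across all simulated years
    years = len(piSeries) // qmPerYear
    annual = piSeries[: years * qmPerYear].reshape(years, qmPerYear).sum(axis=1)
    return annual.mean()
```

For example, a series of two years of all-ones quarter-monthly values aggregates to a net annual average of 48.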
The raw output from the Borg MOEA is stored in `output/data/**/raw`, where `**` is the folder of the experiment of interest. There will be a `pareto_front_S*.txt` and a `runtime_S*.txt` file for each seed the optimization ran. To visualize results in the dashboard, the raw data needs to be formatted and the MOEA performance metrics need to be calculated using the MOEA Framework. A shell script is provided to format the raw data and run the MOEA Framework. Run the `dataPrep.sh` script from the terminal, specifying the following:
./dataPrep.sh ${loc} ${folderName} ${nseeds} ${nobjs}
- `${loc}`: the absolute path of the user's home directory (i.e., where the code repository is located)
- `${folderName}`: the name of the folder that contains the raw Borg results (in the `output/` directory)
- `${nseeds}`: the number of random seeds used in the optimization
- `${nobjs}`: the number of objectives for which the optimization was run
To format the raw output for the dashboard and other post-analyses, run the following command, specifying:
python output/postScripts/dataFormat.py ${loc} ${folderName} ${nseeds}
- `${loc}`: the absolute path of the user's home directory (i.e., where the code repository is located)
- `${folderName}`: the name of the folder that contains the raw Borg results (in the `output/` directory)
- `${nseeds}`: the number of random seeds used in the optimization
The script will create and populate the `clean/` and `moeaFramework/` directories in `output/data/**`.
To run the MOEA Framework, download the demo application and move the `.jar` file to the `moeaFramework/` directory. You may need to update the version number of the `.jar` called in the `find_metrics.sh` script:
# ensure the version number [2.12] matches your download or update `JAVA_ARGS` accordingly
JAVA_ARGS="-cp MOEAFramework-2.12-Demo.jar"
We recommend using the `wfg2` method to calculate the hypervolume metric. Follow the directions here to compile the executable for the `wfg2` hypervolume calculation and move `wfg2` to the `moeaFramework/` directory. The `global.properties` file in the `moeaFramework/` directory tells the MOEA Framework to use the `wfg2` calculation for the hypervolume metric.
Once everything is compiled and set up, ensure there are four files in the `moeaFramework/` directory: `wfg2`, `MOEAFramework-#-Demo.jar`, `global.properties`, and `find_metrics.sh`.
The `dataPrep.sh` script calls the `find_metrics.sh` script in the `moeaFramework/` directory. The `find_metrics.sh` script calculates the hypervolume, generational distance, inverted generational distance, spacing, epsilon indicator, and maximum Pareto front error for each of the seeds. Convergence plots are saved in the `output/data/**/moeaFramework/plots` directory.
Results from the optimization and simulation of candidate plans for objective functions and a posteriori performance indicators are displayed in an interactive dashboard (an R Shiny app). First, compile the results by running the following command and specifying the following:
You can then run the dashboard by navigating to the `dashboard/` directory, activating the conda environment, and launching R:
conda activate r-shiny # activate shiny specific conda environment
cd dashboard/ # navigate to folder where dashboard script is stored
r # launch r
shiny::runApp() # launch app
More detail on running the dashboard is available here.
Once you launch the dashboard, you can select experiments to load and visualize. Click the `Export Table` button to save a `satisficingPolicies.csv` file for policy simulation.
Borg returns the decision variable values of policies that fall on the Pareto front. However, the time series of water levels and objective performance is not returned. To run the simulation model and return the time series of hydrologic attributes and performance indicators, run the `policySimulation.py` script for a specified hydrologic trace:
python output/postScripts/policySimulation.py ${loc} ${folderName} ${inputTrace} ${inputData}
- `${loc}`: the absolute path of the user's home directory (i.e., where the code repository is located)
- `${folderName}`: the name of the folder that contains the raw Borg results (in the `output/` directory)
- `${inputTrace}`: the directory name that contains the input data (e.g., historic, stochastic, climate change)
- `${inputData}`: the input data file name without extension; may be an empty string to simulate all traces in the specified folder
This repository contains code to perform a time-varying sensitivity analysis on candidate policies. Users can add additional analysis scripts to the `output/` folder and call them from the `postAnalysis.sh` script.
There are two scripts that drive the policy simulation and optimization: `optimizationWrapper.py` and `optimizationSimulation.py`. The `optimizationWrapper.py` script calls and interacts with the Borg MOEA. The wrapper script reads the user-generated configuration file and sets up the optimization experiment. `optimizationWrapper.py` then calls `optimizationSimulation.py` to simulate the time series of outflows, water levels, and system performance that result from the decision variables returned by Borg in each function evaluation.
You can run Borg on your local machine from the command line or on an HPC (see the example bash or SLURM script). Both scripts call the `optimizationWrapper.py` script, which requires three user-specified arguments:
python optimizationWrapper.py ${loc} ${config} ${S}
- `${loc}`: the absolute path of the user's home directory (i.e., where the code repository is located)
- `${config}`: the relative path to the configuration file from the home directory
- `${S}`: the random seed to use in the optimization
To ensure convergence toward the Pareto front and avoid falling into local minima/maxima, it is advisable to run multiple seeds per experiment. The `runOptimization_Local.sh` and `runOptimization_HPC.sh` shell scripts are set up to take the number of seeds to run per experiment rather than a specific random seed. To run the shell script locally (or submit a SLURM job on an HPC), users must specify three arguments:
./runOptimization_Local.sh ${loc} ${config} ${nseeds}
- `${loc}`: the absolute path of the user's home directory (i.e., where the code repository is located)
- `${config}`: the relative path to the configuration file from the home directory
- `${nseeds}`: the number of random seeds to use in the optimization