
Commit

ICLR release.
brendenpetersen committed Jan 14, 2021
0 parents commit 5777861
Showing 40 changed files with 5,955 additions and 0 deletions.
13 changes: 13 additions & 0 deletions .gitignore
@@ -0,0 +1,13 @@
*.DS_Store
*.pyc
*.egg*
venv*
dsr/dsr/summary*
*log_*
.gitignore
.ipynb_checkpoints
~$*
*.vscode/
dsr/build
dsr/dsr/cyfunc*
**/log/
30 changes: 30 additions & 0 deletions LICENSE
@@ -0,0 +1,30 @@
BSD 3-Clause License

Copyright (c) 2018, Lawrence Livermore National Security, LLC
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

21 changes: 21 additions & 0 deletions NOTICE
@@ -0,0 +1,21 @@
This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.

This work was prepared as an account of work sponsored by an agency of
the United States Government. Neither the United States Government nor
Lawrence Livermore National Security, LLC, nor any of their employees
makes any warranty, expressed or implied, or assumes any legal liability
or responsibility for the accuracy, completeness, or usefulness of any
information, apparatus, product, or process disclosed, or represents that
its use would not infringe privately owned rights.

Reference herein to any specific commercial product, process, or service
by trade name, trademark, manufacturer, or otherwise does not necessarily
constitute or imply its endorsement, recommendation, or favoring by the
United States Government or Lawrence Livermore National Security, LLC.

The views and opinions of authors expressed herein do not necessarily
state or reflect those of the United States Government or Lawrence
Livermore National Security, LLC, and shall not be used for advertising
or product endorsement purposes.
134 changes: 134 additions & 0 deletions README.md
@@ -0,0 +1,134 @@
# Deep symbolic regression

Deep symbolic regression (DSR) is a deep learning algorithm for symbolic regression: the task of recovering tractable mathematical expressions from an input dataset. The package `dsr` contains the code for DSR, including a single-point, parallelized launch script (`dsr/run.py`), a baseline genetic programming-based symbolic regression algorithm, and an sklearn-like interface for use with your own data.

This code supports the ICLR 2021 paper [Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients](https://openreview.net/forum?id=m5Qsh0kBQG).

# Installation

Installation is straightforward in a Python 3 virtual environment using Pip. From the repository root:

```
python3 -m venv venv3 # Create a Python 3 virtual environment
source venv3/bin/activate # Activate the virtual environment
pip install -r requirements.txt # Install Python dependencies
export CFLAGS="-I $(python -c "import numpy; print(numpy.get_include())") $CFLAGS" # Needed on Mac to prevent fatal error: 'numpy/arrayobject.h' file not found
pip install -e ./dsr # Install DSR package
```

To perform experiments involving the GP baseline, you will need the additional package `deap`.
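
For example, it can be installed into the active virtual environment with:

```
pip install deap
```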

# Example usage

To try out DSR, use the following command from the repository root:

```
python -m dsr.run ./dsr/dsr/config.json --b=Nguyen-6
```

This should solve in around 50 training steps (~30 seconds on a laptop).

# Getting started

## Configuring runs

DSR uses JSON files to configure training.

Top-level key "task" specifies details of the benchmark expression for DSR or GP. See docs in `regression.py` for details.

Top-level key "training" specifies the training hyperparameters for DSR. See docs in `train.py` for details.

Top-level key "controller" specifies the RNN controller hyperparameters for DSR. See docs in `controller.py` for details.

Top-level key "gp" specifies the hyperparameters for GP if using the GP baseline. See docs for `dsr.baselines.gpsr.GP` for details.
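
The snippet below sketches this layout only; it is not a complete config. Values mirror `dsr/dsr/config.json`, which contains the full set of options.

```
{
  "task" : {"task_type" : "regression", "name" : "Nguyen-1"},
  "training" : {"n_samples" : 2000000, "batch_size" : 1000},
  "controller" : {"learning_rate" : 0.0005, "entropy_weight" : 0.005},
  "gp" : {"population_size" : 1000, "generations" : null}
}
```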

## Launching runs

After configuring a run, launching it is simple:

```
python -m dsr.run [PATH_TO_CONFIG] [--OPTIONS]
```

## Sklearn interface

DSR also provides an [sklearn-like regressor interface](https://scikit-learn.org/stable/modules/generated/sklearn.base.RegressorMixin.html). Example usage:

```
from dsr import DeepSymbolicRegressor
import numpy as np
# Generate some data
np.random.seed(0)
X = np.random.random((10, 2))
y = np.sin(X[:,0]) + X[:,1] ** 2
# Create the model
model = DeepSymbolicRegressor("config.json")
# Fit the model
model.fit(X, y) # Should solve in ~10 seconds
# View the best expression
print(model.program_.pretty())
# Make predictions
model.predict(2 * X)
```

## Using an external dataset

To use your own dataset, simply provide the path to your data under the `"dataset"` key in the config, and give your task an arbitrary name.

```
"task": {
"task_type": "regression",
"name": "my_task",
"dataset": "./path/to/my_dataset.csv",
...
}
```

Then run DSR:

```
python -m dsr.run path/to/config.json
```

Note that the `--b` flag matches the name of the CSV file (minus the `.csv` extension).
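
As a sketch, a compatible CSV can be written with NumPy. This assumes the conventional layout of one sample per row, with the input features first and the target value in the last column, and no header row:

```
import numpy as np

# Hypothetical data: two input features and one target
X = np.random.random((100, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2

# Assumed layout: features followed by the target, one row per sample, no header
data = np.hstack([X, y.reshape(-1, 1)])
np.savetxt("my_dataset.csv", data, delimiter=",")
```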

## Command-line examples

Show command-line help and quit

```
python -m dsr.run --help
```

Train 2 independent runs of DSR on the Nguyen-1 benchmark using 2 cores

```
python -m dsr.run config.json --b=Nguyen-1 --mc=2 --num_cores=2
```

Train DSR on all 12 Nguyen benchmarks using 12 cores

```
python -m dsr.run config.json --b=Nguyen --num_cores=12
```

Train 2 independent runs of GP on Nguyen-1

```
python -m dsr.run config.json --method=gp --b=Nguyen-1 --mc=2 --num_cores=2
```

Train DSR on Nguyen-1 and Nguyen-4

```
python -m dsr.run config.json --b=Nguyen-1 --b=Nguyen-4
```

# Release

LLNL-CODE-647188
3 changes: 3 additions & 0 deletions dsr/dsr/__init__.py
@@ -0,0 +1,3 @@
from dsr.core import DeepSymbolicOptimizer
from dsr.task.regression.sklearn import DeepSymbolicRegressor

Empty file added dsr/dsr/baselines/__init__.py
Empty file.
128 changes: 128 additions & 0 deletions dsr/dsr/baselines/constraints.py
@@ -0,0 +1,128 @@
"""Defines constraints for GP individuals, to be used as decorators for
evolutionary operations."""

from dsr.functions import UNARY_TOKENS, BINARY_TOKENS

TRIG_TOKENS = ["sin", "cos", "tan", "csc", "sec", "cot"]

# Define inverse tokens
INVERSE_TOKENS = {
"exp" : "log",
"neg" : "neg",
"inv" : "inv",
"sqrt" : "n2"
}

# Add inverse trig functions
INVERSE_TOKENS.update({
t : "arc" + t for t in TRIG_TOKENS
})

# Add reverse
INVERSE_TOKENS.update({
v : k for k, v in INVERSE_TOKENS.items()
})

DEBUG = False


def check_inv(ind):
"""Returns True if two sequential tokens are inverse unary operators."""

names = [node.name for node in ind]
for i, name in enumerate(names[:-1]):
if name in INVERSE_TOKENS and names[i+1] == INVERSE_TOKENS[name]:
if DEBUG:
print("Constrained inverse:", ind)
return True
return False


def check_const(ind):
"""Returns True if children of a parent are all const tokens."""

names = [node.name for node in ind]
for i, name in enumerate(names):
if name in UNARY_TOKENS and names[i+1] == "const":
if DEBUG:
print("Constrained const (unary)", ind)
return True
if name in BINARY_TOKENS and names[i+1] == "const" and names[i+2] == "const":
if DEBUG:
print("Constrained const (binary)", ind)
return True
return False


def check_trig(ind):
"""Returns True if a descendant of a trig operator is another trig
operator."""

names = [node.name for node in ind]
trig_descendant = False # True when current node is a descendant of a trig operator
trig_dangling = None # Number of unselected nodes in trig subtree
for i, name in enumerate(names):
if name in TRIG_TOKENS:
if trig_descendant:
if DEBUG:
print("Constrained trig:", ind)
return True
trig_descendant = True
trig_dangling = 1
elif trig_descendant:
if name in BINARY_TOKENS:
trig_dangling += 1
elif name not in UNARY_TOKENS:
trig_dangling -= 1
if trig_dangling == 0:
trig_descendant = False
return False


def make_check_min_len(min_length):
"""Creates closure for minimum length constraint"""

def check_min_len(ind):
"""Returns True if individual is less than minimum length"""

if len(ind) < min_length:
if DEBUG:
print("Constrained min len: {} (length {})".format(ind, len(ind)))
return True

return False

return check_min_len


def make_check_max_len(max_length):
"""Creates closure for maximum length constraint"""

def check_max_len(ind):
"""Returns True if individual is greater than maximum length"""

if len(ind) > max_length:
if DEBUG:
print("Constrained max len: {} (length {})".format(ind, len(ind)))
return True

return False

return check_max_len


def make_check_num_const(max_const):
"""Creates closure for maximum number of constants constraint"""

def check_num_const(ind):
"""Returns True if individual has more than max_const const tokens"""

num_const = len([t for t in ind if t.name == "const"])
if num_const > max_const:
if DEBUG:
print("Constrained max const: {} ({} consts)".format(ind, num_const))
return True

return False

return check_num_const
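A brief usage sketch of the checks above; the `Node` class is a hypothetical stand-in for a DEAP primitive node, included only to make the example self-contained.

```
from dsr.baselines import constraints

class Node:
    """Hypothetical stand-in for a DEAP node (only a .name attribute is needed here)."""
    def __init__(self, name):
        self.name = name

# Each factory returns a predicate that flags constraint violations
check_min_len = constraints.make_check_min_len(4)
check_max_len = constraints.make_check_max_len(30)

ind = [Node("add"), Node("x1"), Node("x1")]  # 3 tokens
print(check_min_len(ind))  # True: shorter than the minimum length of 4
print(check_max_len(ind))  # False: not longer than 30
print(constraints.check_inv([Node("exp"), Node("log"), Node("x1")]))  # True: exp followed by its inverse log
```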
297 changes: 297 additions & 0 deletions dsr/dsr/baselines/gpsr.py
@@ -0,0 +1,297 @@
import random
import operator
import importlib
from functools import partial

import numpy as np

from dsr.functions import function_map
from dsr.const import make_const_optimizer

from . import constraints


GP_MOD = "deap"
OBJECTS = ["base", "gp", "creator", "tools", "algorithms"]
gp = importlib.import_module(GP_MOD + ".gp")
base = importlib.import_module(GP_MOD + ".base")
creator = importlib.import_module(GP_MOD + ".creator")
tools = importlib.import_module(GP_MOD + ".tools")
algorithms = importlib.import_module(GP_MOD + ".algorithms")


class GP():
"""Genetic-programming based symbolic regression class"""

def __init__(self, dataset, metric="nmse", population_size=1000,
generations=1000, n_samples=None, tournament_size=3,
p_crossover=0.5, p_mutate=0.1,
const_range=[-1, 1], const_optimizer="scipy",
const_params=None, seed=0, early_stopping=False,
threshold=1e-12, verbose=True, protected=True,
pareto_front=False,
# Constraint hyperparameters
constrain_const=True,
constrain_trig=True,
constrain_inv=True,
constrain_min_len=True,
constrain_max_len=True,
constrain_num_const=True,
min_length=4,
max_length=30,
max_const=3):

self.dataset = dataset
self.fitted = False

assert n_samples is None or generations is None, "At least one of 'n_samples' or 'generations' must be None."
if generations is None:
generations = int(n_samples / population_size)

# Set hyperparameters
self.population_size = population_size
self.generations = generations
self.tournament_size = tournament_size
self.p_mutate = p_mutate
self.p_crossover = p_crossover
self.seed = seed
self.early_stopping = early_stopping
self.threshold = threshold
self.verbose = verbose
self.pareto_front = pareto_front

# Fitness function used during training
# Includes closure for fitness function metric and training data
fitness = partial(self.make_fitness(metric), y=dataset.y_train, var_y=np.var(dataset.y_train)) # Function of y_hat
self.fitness = partial(self.compute_fitness, optimize=True, fitness=fitness, X=dataset.X_train.T) # Function of individual

# Test NMSE, used as final performance metric
# Includes closure for test data
nmse_test = partial(self.make_fitness("nmse"), y=dataset.y_test, var_y=np.var(dataset.y_test)) # Function of y_hat
self.nmse_test = partial(self.compute_fitness, optimize=False, fitness=nmse_test, X=dataset.X_test.T) # Function of individual

# Noiseless test NMSE, only used to determine success for final performance
# Includes closure for noiseless test data
nmse_test_noiseless = partial(self.make_fitness("nmse"), y=dataset.y_test_noiseless, var_y=np.var(dataset.y_test_noiseless)) # Function of y_hat
self.nmse_test_noiseless = partial(self.compute_fitness, optimize=False, fitness=nmse_test_noiseless, X=dataset.X_test.T) # Function of individual
self.success = lambda ind : self.nmse_test_noiseless(ind)[0] < self.threshold # Function of individual

# Create the primitive set
pset = gp.PrimitiveSet("MAIN", dataset.X_train.shape[1])

# Add input variables
rename_kwargs = {"ARG{}".format(i) : "x{}".format(i + 1) for i in range(dataset.n_input_var)}
pset.renameArguments(**rename_kwargs)

# Add primitives
for op_name in dataset.function_set:
if op_name == "const":
continue
assert op_name in function_map, "Operation {} not recognized.".format(op_name)

# Prepend available protected operators with "protected_"
if protected and not op_name.startswith("protected_"):
protected_op_name = "protected_{}".format(op_name)
if protected_op_name in function_map:
op_name = protected_op_name

op = function_map[op_name]
pset.addPrimitive(op.function, op.arity, name=op.name)

# # Add constant
# if "const" in dataset.function_set:
# pset.addEphemeralConstant("const", lambda : random.uniform(const_range[0], const_range[1]))

# Add constant
const = "const" in dataset.function_set
if const:
const_params = const_params if const_params is not None else {}
self.const_opt = make_const_optimizer(const_optimizer, **const_params)
pset.addTerminal(1.0, name="const")

# Create custom fitness and individual classes
if self.pareto_front:
# Fitness is compared lexicographically, so the second dimension
# (complexity) is only used in selection if first dimension (error)
# is the same.
creator.create("FitnessMin", base.Fitness, weights=(-1.0, -1.0))
else:
creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMin)

# Define the evolutionary operators
self.toolbox = base.Toolbox()
self.toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=1, max_=2)
self.toolbox.register("individual", tools.initIterate, creator.Individual, self.toolbox.expr)
self.toolbox.register("population", tools.initRepeat, list, self.toolbox.individual)
self.toolbox.register("compile", gp.compile, pset=pset)
self.toolbox.register("evaluate", self.fitness)
self.toolbox.register("select", tools.selTournament, tournsize=tournament_size)
self.toolbox.register("mate", gp.cxOnePoint)
self.toolbox.register("expr_mut", gp.genFull, min_=0, max_=2)
self.toolbox.register('mutate', gp.mutUniform, expr=self.toolbox.expr_mut, pset=pset)

# Define constraints, each defined by a func : gp.Individual -> bool.
# We decorate mutation/crossover operators with constrain, which
# replaces a child with a random parent if func(ind) is True.
constrain = partial(gp.staticLimit, max_value=0) # Constraint decorator
funcs = []
if constrain_min_len:
funcs.append(constraints.make_check_min_len(min_length)) # Minimum length
if constrain_max_len:
funcs.append(constraints.make_check_max_len(max_length)) # Maximum length
if constrain_inv:
funcs.append(constraints.check_inv) # Subsequence inverse unary operators
if constrain_trig:
funcs.append(constraints.check_trig) # Nested trig operators
if constrain_const and const:
funcs.append(constraints.check_const) # All children are constants
if constrain_num_const and const:
funcs.append(constraints.make_check_num_const(max_const)) # Number of constants
for func in funcs:
for variation in ["mate", "mutate"]:
self.toolbox.decorate(variation, constrain(func))

# Create the training function
self.algorithm = algorithms.eaSimple


def compute_fitness(self, individual, fitness, X, optimize=False):
"""Compute the given fitness function on an individual using X."""

if optimize:
# Retrieve symbolic constants
const_idxs = [i for i, node in enumerate(individual) if node.name == "const"]

# Check if best individual (or any individual in Pareto front) has success=True
# (i.e. NMSE below threshold on noiseless test set)
if self.early_stopping and any([self.success(ind) for ind in self.hof]):
return (999,)

if optimize and len(const_idxs) > 0:

# Objective function for evaluating constants
def obj(consts):
for i, const in zip(const_idxs, consts):
individual[i] = gp.Terminal(const, False, object)
individual[i].name = "const" # For good measure
f = self.toolbox.compile(expr=individual)
y_hat = f(*X)
y = self.dataset.y_train
if np.isfinite(y_hat).all():
# Squash error to prevent consts from becoming inf
return -1/(1 + np.mean((y - y_hat)**2))
else:
return 0

# Do the optimization and set the optimized constants
x0 = np.ones(len(const_idxs))
optimized_consts = self.const_opt(obj, x0)
for i, const in zip(const_idxs, optimized_consts):
individual[i] = gp.Terminal(const, False, object)
individual[i].name = "const" # This is necessary to ensure the constant is re-optimized in the next generation

# Execute the individual
f = self.toolbox.compile(expr=individual)
with np.errstate(all="ignore"):
y_hat = f(*X)

# Check for validity
if np.isfinite(y_hat).all():
fitness = (fitness(y_hat=y_hat),)
else:
fitness = (np.inf,)

# Compute complexity (only if using Pareto front)
if self.pareto_front:
complexity = sum([function_map[prim.name].complexity \
if prim.name in function_map \
else 1 for prim in individual])
fitness += (complexity,)

return fitness


def train(self):
"""Train the GP"""

if self.fitted:
raise RuntimeError("This GP has already been fitted!")

random.seed(self.seed)

pop = self.toolbox.population(n=self.population_size)
if self.pareto_front:
self.hof = tools.ParetoFront()
else:
self.hof = tools.HallOfFame(maxsize=1)

stats_fit = tools.Statistics(lambda p : p.fitness.values[0])
stats_fit.register("avg", np.mean)
stats_fit.register("min", np.min)
stats_size = tools.Statistics(len)
stats_size.register("avg", np.mean)
mstats = tools.MultiStatistics(fitness=stats_fit, size=stats_size)

pop, logbook = self.algorithm(population=pop,
toolbox=self.toolbox,
cxpb=self.p_crossover,
mutpb=self.p_mutate,
ngen=self.generations,
stats=mstats,
halloffame=self.hof,
verbose=self.verbose)

self.fitted = True

# Delete custom classes
del creator.FitnessMin
del creator.Individual
if "const" in dir(gp):
del gp.const

# The best individual is the first one in self.hof with success=True,
# otherwise the highest reward. This mimics DSR's train.py.
ind_best = None
for ind in self.hof:
if self.success(ind):
ind_best = ind
break
ind_best = ind_best if ind_best is not None else self.hof[0] # first element in self.hof is the fittest

if self.verbose:
print("Printing {}:".format("Pareto front" if self.pareto_front else "hall of fame"))
print("Fitness | Individual")
for ind in self.hof:
print(ind.fitness, [token.name for token in ind])

return ind_best, logbook


def make_fitness(self, metric):
"""Generates a fitness function by name"""

if metric == "mse":
fitness = lambda y, y_hat, var_y : np.mean((y - y_hat)**2)

elif metric == "rmse":
fitness = lambda y, y_hat, var_y : np.sqrt(np.mean((y - y_hat)**2))

elif metric == "nmse":
fitness = lambda y, y_hat, var_y : np.mean((y - y_hat)**2 / var_y)

elif metric == "nrmse":
fitness = lambda y, y_hat, var_y : np.sqrt(np.mean((y - y_hat)**2 / var_y))

# Complementary inverse NMSE
elif metric == "cinv_nmse":
fitness = lambda y, y_hat, var_y : 1 - 1/(1 + np.mean((y - y_hat)**2 / var_y))

# Complementary inverse NRMSE
elif metric == "cinv_nrmse":
fitness = lambda y, y_hat, var_y : 1 - 1/(1 + np.sqrt(np.mean((y - y_hat)**2 / var_y)))

else:
raise ValueError("Metric not recognized.")

return fitness
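A usage sketch for the GP baseline above, mirroring how `run.py` constructs it. The dataset kwargs follow `config.json`; the GP hyperparameters here are illustrative, and `deap` must be installed.

```
from dsr.task.regression.dataset import BenchmarkDataset
from dsr.baselines import gpsr

# Load a benchmark dataset as run.py does (kwargs mirror config.json)
dataset = BenchmarkDataset(name="Nguyen-1", noise=None, dataset_size_multiplier=1.0)

# Small, illustrative hyperparameters; GP.__init__ defaults are larger
model = gpsr.GP(dataset=dataset, population_size=100, generations=20, verbose=False)
ind_best, logbook = model.train()
print(model.nmse_test(ind_best)[0])  # Test NMSE of the best individual
```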
99 changes: 99 additions & 0 deletions dsr/dsr/config.json
@@ -0,0 +1,99 @@
{
"task": {
"task_type" : "regression",
"name" : "Nguyen-1",
"function_set": null,
"dataset" : {
"name" : null,
"noise": null,
"dataset_size_multiplier": 1.0
},
"metric" : "inv_nrmse",
"metric_params" : [1.0],
"threshold" : 1e-12,
"protected" : false,
"reward_noise" : 0.0
},
"prior": {
"length" : {"min_" : 4, "max_" : 30},
"repeat" : {"tokens" : "const", "max_" : 3},
"inverse" : {},
"trig" : {},
"const" : {}
},
"training": {
"logdir": "./log",
"n_epochs": null,
"n_samples": 2000000,
"batch_size": 1000,
"complexity": "length",
"complexity_weight": 0.0,
"const_optimizer": "scipy",
"const_params": {},
"alpha": 0.5,
"epsilon": 0.05,
"verbose": true,
"baseline": "R_e",
"b_jumpstart": false,
"n_cores_batch": 1,
"summary": false,
"debug": 0,
"output_file": null,
"save_all_r": false,
"early_stopping": true,
"pareto_front": false,
"hof": 100
},
"controller": {
"cell": "lstm",
"num_layers": 1,
"num_units": 32,
"initializer": "zeros",
"embedding": false,
"embedding_size": 8,
"optimizer": "adam",
"learning_rate": 0.0005,
"observe_action": false,
"observe_parent": true,
"observe_sibling": true,
"entropy_weight": 0.005,
"ppo": false,
"ppo_clip_ratio": 0.2,
"ppo_n_iters": 10,
"ppo_n_mb": 4,
"pqt": false,
"pqt_k": 10,
"pqt_batch_size": 1,
"pqt_weight": 200.0,
"pqt_use_pg": false,
"max_length": 30
},
"gp": {
"population_size": 1000,
"generations": null,
"n_samples" : 2000000,
"tournament_size": 2,
"metric": "nmse",
"const_range": [
-1.0,
1.0
],
"p_crossover": 0.95,
"p_mutate": 0.03,
"seed": 0,
"early_stopping": true,
"pareto_front": false,
"threshold": 1e-12,
"verbose": false,
"protected": true,
"constrain_const": true,
"constrain_trig": true,
"constrain_inv": true,
"constrain_min_len": true,
"constrain_max_len": true,
"constrain_num_const": true,
"min_length": 4,
"max_length": 30,
"max_const" : 3
}
}
74 changes: 74 additions & 0 deletions dsr/dsr/const.py
@@ -0,0 +1,74 @@
"""Constant optimizer used for deep symbolic regression."""

from functools import partial

import numpy as np
from scipy.optimize import minimize


def make_const_optimizer(name, **kwargs):
"""Returns a ConstOptimizer given a name and keyword arguments"""

const_optimizers = {
None : Dummy,
"dummy" : Dummy,
"scipy" : ScipyMinimize,
}

return const_optimizers[name](**kwargs)


class ConstOptimizer(object):
"""Base class for constant optimizer"""

def __init__(self, **kwargs):
self.kwargs = kwargs


def __call__(self, f, x0):
"""
Optimizes an objective function from an initial guess.
The objective function is the negative of the base reward (reward
without penalty) used for training. Optimization excludes any penalties
because they are constant w.r.t. the constants being optimized.
Parameters
----------
f : function mapping np.ndarray to float
Objective function (negative base reward).
x0 : np.ndarray
Initial guess for constant placeholders.
Returns
-------
x : np.ndarray
Vector of optimized constants.
"""
raise NotImplementedError


class Dummy(ConstOptimizer):
"""Dummy class that selects the initial guess for each constant"""

def __init__(self, **kwargs):
super(Dummy, self).__init__(**kwargs)


def __call__(self, f, x0):
return x0


class ScipyMinimize(ConstOptimizer):
"""SciPy's non-linear optimizer"""

def __init__(self, **kwargs):
super(ScipyMinimize, self).__init__(**kwargs)


def __call__(self, f, x0):
with np.errstate(divide='ignore'):
opt_result = partial(minimize, **self.kwargs)(f, x0)
x = opt_result['x']
return x
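A minimal sketch of the constant-optimizer interface defined above; the quadratic objective is illustrative.

```
import numpy as np
from dsr.const import make_const_optimizer

# Objective: plays the role of the negative base reward as a function of the constants
f = lambda consts: float(np.sum((consts - np.array([2.0, -1.0])) ** 2))

const_opt = make_const_optimizer("scipy")  # Extra kwargs would be passed through to scipy.optimize.minimize
x0 = np.ones(2)                            # Initial guess for the constant placeholders
x = const_opt(f, x0)
print(x)                                   # Approximately [2, -1]
```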
666 changes: 666 additions & 0 deletions dsr/dsr/controller.py

Large diffs are not rendered by default.

126 changes: 126 additions & 0 deletions dsr/dsr/core.py
@@ -0,0 +1,126 @@
"""Core deep symbolic optimizer construct."""

import json
import zlib
from collections import defaultdict
from multiprocessing import Pool

import tensorflow as tf

from dsr.task import set_task
from dsr.controller import Controller
from dsr.train import learn
from dsr.prior import make_prior
from dsr.program import Program


class DeepSymbolicOptimizer():
"""
Deep symbolic optimization model. Includes model hyperparameters and
training configuration.
Parameters
----------
config : dict or str
Config dictionary or path to JSON. See dsr/dsr/config.json for template.
Attributes
----------
config : dict
Configuration parameters for training.
Methods
-------
train
Builds and trains the model according to config.
"""

def __init__(self, config=None):
self.update_config(config)
self.sess = None

def setup(self, seed=0):

# Clear the cache, reset the compute graph, and set the seed
Program.clear_cache()
tf.reset_default_graph()
self.seed(seed) # Must be called _after_ resetting graph

self.pool = self.make_pool()
self.sess = tf.Session()
self.prior = self.make_prior()
self.controller = self.make_controller()

def train(self, seed=0):

# Setup the model
self.setup(seed)

# Train the model
result = learn(self.sess,
self.controller,
self.pool,
**self.config_training)
return result

def update_config(self, config):
if config is None:
config = {}
elif isinstance(config, str):
with open(config, 'rb') as f:
config = json.load(f)

self.config = defaultdict(dict, config)
self.config_task = self.config["task"]
self.config_prior = self.config["prior"]
self.config_training = self.config["training"]
self.config_controller = self.config["controller"]

def seed(self, seed_=0):
"""Set the tensorflow seed, which will be offset by a checksum on the
task name to ensure seeds differ across different tasks."""

if "name" in self.config_task:
task_name = self.config_task["name"]
else:
task_name = ""
seed_ += zlib.adler32(task_name.encode("utf-8"))
tf.set_random_seed(seed_)

return seed_

def make_prior(self):
prior = make_prior(Program.library, self.config_prior)
return prior

def make_controller(self):
controller = Controller(self.sess,
self.prior,
**self.config_controller)
return controller

def make_pool(self):
# Create the pool and set the Task for each worker
pool = None
n_cores_batch = self.config_training.get("n_cores_batch")
if n_cores_batch is not None and n_cores_batch > 1:
pool = Pool(n_cores_batch,
initializer=set_task,
initargs=(self.config_task,))

# Set the Task for the parent process
set_task(self.config_task)

return pool

def save(self, save_path):

saver = tf.train.Saver()
saver.save(self.sess, save_path)

def load(self, load_path):

if self.sess is None:
self.setup()
saver = tf.train.Saver()
saver.restore(self.sess, load_path)
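A minimal usage sketch of the class above. The config path is illustrative; run from the repository root so the relative path resolves.

```
from dsr import DeepSymbolicOptimizer

model = DeepSymbolicOptimizer("dsr/dsr/config.json")  # Any config following config.json's layout
result = model.train(seed=0)                          # Dict of results returned by dsr.train.learn
print(result)
```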
90 changes: 90 additions & 0 deletions dsr/dsr/cyfunc.pyx
@@ -0,0 +1,90 @@
'''
# cython: linetrace=True
# distutils: define_macros=CYTHON_TRACE_NOGIL=1
'''
# Uncomment the above lines for cProfile

import numpy as np
import array

# Cython specific C imports
cimport numpy as np
from cpython cimport array
cimport cython
from libc.stdlib cimport malloc, free
from cpython.ref cimport PyObject

# Static inits
cdef list apply_stack = [[None for i in range(25)] for i in range(1024)]
cdef int *stack_count = <int *> malloc(1024 * sizeof(int))

@cython.boundscheck(False) # turn off bounds-checking for entire function
@cython.wraparound(False) # turn off negative index wrapping for entire function
def execute(np.ndarray X, int len_traversal, list traversal, int[:] is_input_var):

"""Executes the program according to X.
Parameters
----------
X : array-like, shape = [n_samples, n_features]
Training vectors, where n_samples is the number of samples and
n_features is the number of features.
Returns
-------
y_hats : array-like, shape = [n_samples]
The result of executing the program on X.
"""
#sp = 0 # allow a dummy first row, requires a none type function with arity of -1

# Init some ints
cdef int sp = -1 # Stack pointer
cdef int Xs = X.shape[0]

# Give cdef hints for object types
cdef int i
cdef int n
cdef int arity
cdef np.ndarray intermediate_result
cdef list stack_end
cdef object stack_end_function

for i in range(len_traversal):

if not is_input_var[i]:
sp += 1
# Move this to the front with a memset call
stack_count[sp] = 0
# Store the reference to stack_count[sp] rather than keep calling
apply_stack[sp][stack_count[sp]] = traversal[i]
stack_end = apply_stack[sp]
# The first element is the function itself
stack_end_function = stack_end[0]
arity = stack_end_function.arity
else:
# Not a function, so lazily evaluate later
stack_count[sp] += 1
stack_end[stack_count[sp]] = X[:, traversal[i].input_var]

# Keep on doing this so long as arity matches up, we can
# add in numbers above and complete the arity later.
while stack_count[sp] == arity:
intermediate_result = stack_end_function(*stack_end[1:(stack_count[sp] + 1)]) # 85% of overhead

# I think we can get rid of this line, but will require a major rewrite.
if sp == 0:
return intermediate_result

sp -= 1
# Adjust pointer at the end of the stack
stack_end = apply_stack[sp]
stack_count[sp] += 1
stack_end[stack_count[sp]] = intermediate_result

# The first element is the function itself
stack_end_function = stack_end[0]
arity = stack_end_function.arity

# We should never get here
assert False, "Function should never get here!"
return None
195 changes: 195 additions & 0 deletions dsr/dsr/functions.py
@@ -0,0 +1,195 @@
"""Common Tokens used for executable Programs."""

import numpy as np
from fractions import Fraction

from dsr.library import Token, PlaceholderConstant

GAMMA = 0.57721566490153286060651209008240243104215933593992


"""Define custom unprotected operators"""
def logabs(x1):
"""Closure of log for non-positive arguments."""
return np.log(np.abs(x1))

def expneg(x1):
return np.exp(-x1)

def n3(x1):
return np.power(x1, 3)

def n4(x1):
return np.power(x1, 4)

def sigmoid(x1):
return 1 / (1 + np.exp(-x1))

def harmonic(x1):
if all(val.is_integer() for val in x1):
return np.array([sum(Fraction(1, d) for d in range(1, int(val)+1)) for val in x1], dtype=np.float32)
else:
return GAMMA + np.log(x1) + 0.5/x1 - 1./(12*x1**2) + 1./(120*x1**4)


# Annotate unprotected ops
unprotected_ops = [
# Binary operators
Token(np.add, "add", arity=2, complexity=1),
Token(np.subtract, "sub", arity=2, complexity=1),
Token(np.multiply, "mul", arity=2, complexity=1),
Token(np.divide, "div", arity=2, complexity=2),

# Built-in unary operators
Token(np.sin, "sin", arity=1, complexity=3),
Token(np.cos, "cos", arity=1, complexity=3),
Token(np.tan, "tan", arity=1, complexity=4),
Token(np.exp, "exp", arity=1, complexity=4),
Token(np.log, "log", arity=1, complexity=4),
Token(np.sqrt, "sqrt", arity=1, complexity=4),
Token(np.square, "n2", arity=1, complexity=2),
Token(np.negative, "neg", arity=1, complexity=1),
Token(np.abs, "abs", arity=1, complexity=2),
Token(np.maximum, "max", arity=1, complexity=4),
Token(np.minimum, "min", arity=1, complexity=4),
Token(np.tanh, "tanh", arity=1, complexity=4),
Token(np.reciprocal, "inv", arity=1, complexity=2),

# Custom unary operators
Token(logabs, "logabs", arity=1, complexity=4),
Token(expneg, "expneg", arity=1, complexity=4),
Token(n3, "n3", arity=1, complexity=3),
Token(n4, "n4", arity=1, complexity=3),
Token(sigmoid, "sigmoid", arity=1, complexity=4),
Token(harmonic, "harmonic", arity=1, complexity=4)
]


"""Define custom protected operators"""
def protected_div(x1, x2):
with np.errstate(divide='ignore', invalid='ignore', over='ignore'):
return np.where(np.abs(x2) > 0.001, np.divide(x1, x2), 1.)

def protected_exp(x1):
with np.errstate(over='ignore'):
return np.where(x1 < 100, np.exp(x1), 0.0)

def protected_log(x1):
"""Closure of log for non-positive arguments."""
with np.errstate(divide='ignore', invalid='ignore'):
return np.where(np.abs(x1) > 0.001, np.log(np.abs(x1)), 0.)

def protected_sqrt(x1):
"""Closure of sqrt for negative arguments."""
return np.sqrt(np.abs(x1))

def protected_inv(x1):
"""Closure of inverse for zero arguments."""
with np.errstate(divide='ignore', invalid='ignore'):
return np.where(np.abs(x1) > 0.001, 1. / x1, 0.)

def protected_expneg(x1):
with np.errstate(over='ignore'):
return np.where(x1 > -100, np.exp(-x1), 0.0)

def protected_n2(x1):
with np.errstate(over='ignore'):
return np.where(np.abs(x1) < 1e6, np.square(x1), 0.0)

def protected_n3(x1):
with np.errstate(over='ignore'):
return np.where(np.abs(x1) < 1e6, np.power(x1, 3), 0.0)

def protected_n4(x1):
with np.errstate(over='ignore'):
return np.where(np.abs(x1) < 1e6, np.power(x1, 4), 0.0)

def protected_sigmoid(x1):
return 1 / (1 + protected_expneg(x1))

# Annotate protected ops
protected_ops = [
# Protected binary operators
Token(protected_div, "div", arity=2, complexity=2),

# Protected unary operators

Token(protected_exp, "exp", arity=1, complexity=4),
Token(protected_log, "log", arity=1, complexity=4),
Token(protected_log, "logabs", arity=1, complexity=4), # Protected logabs is supported, but redundant
Token(protected_sqrt, "sqrt", arity=1, complexity=4),
Token(protected_inv, "inv", arity=1, complexity=2),
Token(protected_expneg, "expneg", arity=1, complexity=4),
Token(protected_n2, "n2", arity=1, complexity=2),
Token(protected_n3, "n3", arity=1, complexity=3),
Token(protected_n4, "n4", arity=1, complexity=3),
Token(protected_sigmoid, "sigmoid", arity=1, complexity=4)
]

# Add unprotected ops to function map
function_map = {
op.name : op for op in unprotected_ops
}

# Add protected ops to function map
function_map.update({
"protected_{}".format(op.name) : op for op in protected_ops
})

UNARY_TOKENS = set([op.name for op in function_map.values() if op.arity == 1])
BINARY_TOKENS = set([op.name for op in function_map.values() if op.arity == 2])


def create_tokens(n_input_var, function_set, protected):
"""
Helper function to create Tokens.
Parameters
----------
n_input_var : int
Number of input variable Tokens.
function_set : list
Names of registered Tokens, or floats that will create new Tokens.
protected : bool
Whether to use protected versions of registered Tokens.
"""

tokens = []

# Create input variable Tokens
for i in range(n_input_var):
token = Token(name="x{}".format(i + 1), arity=0, complexity=1,
function=None, input_var=i)
tokens.append(token)

for op in function_set:

# Registered Token
if op in function_map:
# Overwrite available protected operators
if protected and not op.startswith("protected_"):
protected_op = "protected_{}".format(op)
if protected_op in function_map:
op = protected_op

token = function_map[op]

# Hard-coded floating-point constant
elif isinstance(op, float) or isinstance(op, int):
name = str(op)
value = np.atleast_1d(np.float32(op))
function = lambda : value
token = Token(name=name, arity=0, complexity=1, function=function)

# Constant placeholder (to-be-optimized)
elif op == "const":
token = PlaceholderConstant()

else:
raise ValueError("Operation {} not recognized.".format(op))

tokens.append(token)

return tokens
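A short sketch of `create_tokens` together with the `Library` class from `library.py`; the function set here uses names registered in `function_map` above.

```
from dsr.functions import create_tokens
from dsr.library import Library

# Two input variables, a few operators, a placeholder constant, and a hard-coded constant
tokens = create_tokens(n_input_var=2,
                       function_set=["add", "mul", "sin", "const", 1.0],
                       protected=False)
library = Library(tokens)
print(library.names)  # ['x1', 'x2', 'add', 'mul', 'sin', 'const', '1.0']
```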
196 changes: 196 additions & 0 deletions dsr/dsr/library.py
@@ -0,0 +1,196 @@
"""Classes for Token and Library"""

from collections import defaultdict

import numpy as np


class Token():
"""
An arbitrary token or "building block" of a Program object.
Attributes
----------
name : str
Name of token.
arity : int
Arity (number of arguments) of token.
complexity : float
Complexity of token.
function : callable
Function associated with the token; used for executable Programs.
input_var : int or None
Index of input if this Token is an input variable, otherwise None.
Methods
-------
__call__(input)
Call the Token's function according to input.
"""

def __init__(self, function, name, arity, complexity, input_var=None):
self.function = function
self.name = name
self.arity = arity
self.complexity = complexity
self.input_var = input_var

if input_var is not None:
assert function is None, "Input variables should not have functions."
assert arity == 0, "Input variables should have arity zero."

def __call__(self, *args):
assert self.function is not None, \
"Token {} is not callable.".format(self.name)

return self.function(*args)

def __repr__(self):
return self.name


class PlaceholderConstant(Token):
"""
A Token for placeholder constants that will be optimized with respect to
the reward function. The function simply returns the "value" attribute.
Parameters
----------
value : float or None
Current value of the constant, or None if not yet set.
"""

def __init__(self, value=None):
if value is not None:
value = np.atleast_1d(value)
self.value = value

def function():
assert self.value is not None, \
"Constant is not callable with value None."
return self.value

super().__init__(function=function, name="const", arity=0, complexity=1)

def __repr__(self):
if self.value is None:
return self.name
return str(self.value[0])


class Library():
"""
Library of Tokens. We use a list of Tokens (instead of set or dict) since
we so often index by integers given by the Controller.
Attributes
----------
tokens : list of Token
List of available Tokens in the library.
names : list of str
Names corresponding to Tokens in the library.
arities : list of int
Arities corresponding to Tokens in the library.
"""

def __init__(self, tokens):

self.tokens = tokens
self.L = len(tokens)
self.names = [t.name for t in tokens]
self.arities = np.array([t.arity for t in tokens], dtype=np.int32)

self.input_tokens = np.array(
[i for i, t in enumerate(self.tokens) if t.input_var is not None],
dtype=np.int32)

def get_tokens_of_arity(arity):
_tokens = [i for i in range(self.L) if self.arities[i] == arity]
return np.array(_tokens, dtype=np.int32)

self.tokens_of_arity = defaultdict(lambda : np.array([], dtype=np.int32))
for arity in self.arities:
self.tokens_of_arity[arity] = get_tokens_of_arity(arity)
self.terminal_tokens = self.tokens_of_arity[0]
self.unary_tokens = self.tokens_of_arity[1]
self.binary_tokens = self.tokens_of_arity[2]

try:
self.const_token = self.names.index("const")
except ValueError:
self.const_token = None
self.parent_adjust = np.full_like(self.arities, -1)
count = 0
for i in range(len(self.arities)):
if self.arities[i] > 0:
self.parent_adjust[i] = count
count += 1

trig_names = ["sin", "cos", "tan", "csc", "sec", "cot"]
trig_names += ["arc" + name for name in trig_names]

self.float_tokens = np.array(
[i for i, t in enumerate(self.tokens) if t.arity == 0 and t.input_var is None],
dtype=np.int32)
self.trig_tokens = np.array(
[i for i, t in enumerate(self.tokens) if t.name in trig_names],
dtype=np.int32)

inverse_tokens = {
"inv" : "inv",
"neg" : "neg",
"exp" : "log",
"log" : "exp",
"sqrt" : "n2",
"n2" : "sqrt"
}
token_from_name = {t.name : i for i, t in enumerate(self.tokens)}
self.inverse_tokens = {token_from_name[k] : token_from_name[v] for k, v in inverse_tokens.items() if k in token_from_name and v in token_from_name}

def __getitem__(self, val):
"""Shortcut to get Token by name or index."""

if isinstance(val, str):
try:
i = self.names.index(val)
except ValueError:
raise TokenNotFoundError("Token {} does not exist.".format(val))
elif isinstance(val, (int, np.integer)):
i = val
else:
raise TokenNotFoundError("Library must be indexed by str or int, not {}.".format(type(val)))

try:
token = self.tokens[i]
except IndexError:
raise TokenNotFoundError("Token index {} does not exist".format(i))
return token

def tokenize(self, inputs):
"""Convert inputs to list of Tokens."""

if isinstance(inputs, str):
inputs = inputs.split(',')
elif not isinstance(inputs, list) and not isinstance(inputs, np.ndarray):
inputs = [inputs]
tokens = [input_ if isinstance(input_, Token) else self[input_] for input_ in inputs]
return tokens

def actionize(self, inputs):
"""Convert inputs to array of 'actions', i.e. ints corresponding to
Tokens in the Library."""

tokens = self.tokenize(inputs)
actions = np.array([self.tokens.index(t) for t in tokens],
dtype=np.int32)
return actions


class TokenNotFoundError(Exception):
pass
358 changes: 358 additions & 0 deletions dsr/dsr/memory.py
@@ -0,0 +1,358 @@
"""Classes for memory buffers, priority queues, and quantile estimation."""

import heapq
from collections import namedtuple

import numpy as np


Batch = namedtuple(
"Batch", ["actions", "obs", "priors", "lengths", "rewards"])


def make_queue(controller=None, priority=False, capacity=np.inf, seed=0):
"""Factory function for various Queues.
Parameters
----------
controller : dsr.controller.Controller
Reference to the Controller, used to compute probabilities of items in
the Queue.
priority : bool
If True, returns an object inheriting UniquePriorityQueue. Otherwise,
returns an object inheriting from UniqueQueue.
capacity : int
Maximum queue length.
seed : int
RNG seed used for random sampling.
Returns
-------
queue : ProgramQueue
Dynamic class inheriting from ProgramQueueMixin and a Queue subclass.
"""

if priority:
Base = UniquePriorityQueue
else:
Base = UniqueQueue

class ProgramQueue(ProgramQueueMixin, Base):
def __init__(self, controller, capacity, seed):
ProgramQueueMixin.__init__(self, controller)
Base.__init__(self, capacity, seed)

queue = ProgramQueue(controller, capacity, seed)
return queue


def get_samples(batch, key):
"""
Returns a sub-Batch with samples from the given indices.
Parameters
----------
key : int or slice
Indices of samples to return.
Returns
-------
batch : Batch
Sub-Batch with samples from the given indices.
"""

batch = Batch(
actions=batch.actions[key],
obs=tuple(o[key] for o in batch.obs),
priors=batch.priors[key],
lengths=batch.lengths[key],
rewards=batch.rewards[key])
return batch


# Adapted from https://github.com/tensorflow/models/blob/1af55e018eebce03fb61bba9959a04672536107d/research/brain_coder/common/utils.py
class ItemContainer(object):
"""Class for holding an item with its score.
Defines a comparison function for use in the heap-queue.
"""

def __init__(self, score, item, extra_data):
self.item = item
self.score = score
self.extra_data = extra_data

def __lt__(self, other):
assert isinstance(other, type(self))
return self.score < other.score

def __eq__(self, other):
assert isinstance(other, type(self))
return self.item == other.item

def __iter__(self):
"""Allows unpacking like a tuple."""
yield self.score
yield self.item
yield self.extra_data

def __repr__(self):
"""String representation of this item.
`extra_data` is not included in the representation. We are assuming that
`extra_data` is not easily interpreted by a human (if it was, it should be
hashable, like a string or tuple).
Returns:
String representation of `self`.
"""
return str((self.score, self.item))

def __str__(self):
return repr(self)


class Queue(object):
"""Abstract class for queue that must define a push and pop routine"""

def __init__(self, capacity, seed=0):
self.capacity = capacity
self.rng = np.random.RandomState(seed)
self.heap = []
self.unique_items = set()

def push(self, score, item, extra_data):
raise NotImplementedError

def pop(self):
raise NotImplementedError

def random_sample(self, sample_size):
"""Uniform randomly select items from the queue.
Args:
sample_size: Number of random samples to draw. The same item can be
sampled multiple times.
Returns:
List of sampled items (of length `sample_size`). Each element in the list
is a tuple: (item, extra_data).
"""
idx = self.rng.choice(len(self.heap), sample_size)
return [(self.heap[i].item, self.heap[i].extra_data) for i in idx]

def __len__(self):
return len(self.heap)

def __iter__(self):
for _, item, _ in self.heap:
yield item

def __repr__(self):
return '[' + ', '.join(repr(c) for c in self.heap) + ']'

def __str__(self):
return repr(self)


class UniqueQueue(Queue):
"""A queue in which duplicates are not allowed. Instead, adding a duplicate
moves that item to the back of the queue."""

def push(self, score, item, extra_data=None):
"""Push an item onto the queue, or move it to the back if already
present.
Score is unused but included as an argument to follow the interface.
"""

container = ItemContainer(None, item, extra_data)

# If the item is already in the queue, move it to the back of the queue
# and return
if item in self.unique_items:
self.heap.remove(container)
self.heap.append(container)
return

# If the queue is at capacity, first pop the front of the queue
if len(self.heap) >= self.capacity:
self.pop()

# Add the item
self.heap.append(container)
self.unique_items.add(item)

def pop(self):
"""Pop the front of the queue (the oldest item)."""

if not self.heap:
return ()
score, item, extra_data = self.heap.pop(0)
self.unique_items.remove(item)
return (score, item, extra_data)


# Adapted from https://github.com/tensorflow/models/blob/1af55e018eebce03fb61bba9959a04672536107d/research/brain_coder/common/utils.py
class UniquePriorityQueue(Queue):
"""A priority queue where duplicates are not added.
The top items by score remain in the queue. When the capacity is reached,
the lowest scored item in the queue will be dropped.
"""

def push(self, score, item, extra_data=None):
"""Push an item onto the queue.
If the queue is at capacity, the item with the smallest score will be
dropped. Note that it is assumed each item has exactly one score. The same
item with a different score will still be dropped.
Args:
score: Number used to prioritize items in the queue. Largest scores are
kept in the queue.
item: A hashable item to be stored. Duplicates of this item will not be
added to the queue.
extra_data: Extra (possibly non-hashable) data to store with the item.
"""
if item in self.unique_items:
return
if len(self.heap) >= self.capacity:
_, popped_item, _ = heapq.heappushpop(
self.heap, ItemContainer(score, item, extra_data))
self.unique_items.add(item)
self.unique_items.remove(popped_item)
else:
heapq.heappush(self.heap, ItemContainer(score, item, extra_data))
self.unique_items.add(item)

def pop(self):
"""Pop the item with the lowest score.
Returns:
score: Item's score.
item: The item that was popped.
extra_data: Any extra data stored with the item.
"""
if not self.heap:
return ()
score, item, extra_data = heapq.heappop(self.heap)
self.unique_items.remove(item)
return score, item, extra_data

def get_max(self):
"""Peek at the item with the highest score.
Returns:
Same as `pop`.
"""
if not self.heap:
return ()
score, item, extra_data = heapq.nlargest(1, self.heap)[0]
return score, item, extra_data

def get_min(self):
"""Peek at the item with the lowest score.
Returns:
Same as `pop`.
"""
if not self.heap:
return ()
score, item, extra_data = heapq.nsmallest(1, self.heap)[0]
return score, item, extra_data

def iter_in_order(self):
"""Iterate over items in the queue from largest score to smallest.
Yields:
item: Hashable item.
extra_data: Extra data stored with the item.
"""
for _, item, extra_data in heapq.nlargest(len(self.heap), self.heap):
yield item, extra_data


class ProgramQueueMixin():
"""A mixin for Queues with additional utilities specific to Batch and
Program."""

def __init__(self, controller=None):
self.controller = controller

def push_sample(self, sample, program):
"""
Push a single sample corresponding to Program to the queue.
Parameters
----------
sample : Batch
A Batch comprising a single sample.
program : Program
Program corresponding to the sample.
"""

id_ = program.str
score = sample.rewards
self.push(score, id_, sample)

def push_batch(self, batch, programs):
"""Push a Batch corresponding to Programs to the queue."""

for i, program in enumerate(programs):
sample = get_samples(batch, i)
self.push_sample(sample, program)

def push_best(self, batch, programs):
"""Push the single best sample from a Batch"""

i = np.argmax(batch.rewards)
sample = get_samples(batch, i)
program = programs[i]
self.push_sample(sample, program)

def sample_batch(self, sample_size):
"""Randomly select items from the queue and return them as a Batch."""

assert len(self.heap) > 0, "Cannot sample from an empty queue."
samples = [sample for (id_, sample) in self.random_sample(sample_size)]
batch = self._make_batch(samples)
return batch

def _make_batch(self, samples):
"""Turns a list of samples into a Batch."""

actions = np.stack([s.actions for s in samples], axis=0)
obs = tuple([np.stack([s.obs[i] for s in samples], axis=0) for i in range(3)])
priors = np.stack([s.priors for s in samples], axis=0)
lengths = np.array([s.lengths for s in samples], dtype=np.int32)
rewards = np.array([s.rewards for s in samples], dtype=np.float32)
batch = Batch(actions=actions, obs=obs, priors=priors,
lengths=lengths, rewards=rewards)
return batch

def to_batch(self):
"""Return the entire queue as a Batch."""

samples = [container.extra_data for container in self.heap]
batch = self._make_batch(samples)
return batch

def compute_probs(self):
"""Computes the probabilities of items in the queue according to the
Controller."""

if self.controller is None:
raise RuntimeError("Cannot compute probabilities. This Queue does \
not have a Controller.")
return self.controller.compute_probs(self.to_batch())

def get_rewards(self):
"""Returns the rewards"""

r = [container.extra_data.rewards for container in self.heap]
return r
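A small sketch of the queue factory above, used here without a Controller; the scores and items are illustrative.

```
from dsr.memory import make_queue

# Priority queue that keeps the top-10 unique items by score
queue = make_queue(priority=True, capacity=10)
queue.push(score=0.9, item="program_a")
queue.push(score=0.5, item="program_b")
print(queue.get_max())  # (0.9, 'program_a', None)
```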
527 changes: 527 additions & 0 deletions dsr/dsr/prior.py

Large diffs are not rendered by default.

640 changes: 640 additions & 0 deletions dsr/dsr/program.py

Large diffs are not rendered by default.

224 changes: 224 additions & 0 deletions dsr/dsr/run.py
@@ -0,0 +1,224 @@
"""Parallelized, single-point launch script to run DSR or GP on a set of benchmarks."""

import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings('ignore', category=FutureWarning)

import os
import sys
import json
import time
from datetime import datetime
import multiprocessing
from functools import partial
from pkg_resources import resource_filename
import zlib

import click
import numpy as np
import pandas as pd
from sympy.parsing.sympy_parser import parse_expr
from sympy import srepr

from dsr import DeepSymbolicOptimizer
from dsr.program import Program
from dsr.task.regression.dataset import BenchmarkDataset
from dsr.baselines import gpsr


def train_dsr(name_and_seed, config):
"""Trains DSR and returns dict of reward, expression, and traversal"""

# Override the benchmark name and output file
name, seed = name_and_seed
config["task"]["name"] = name
config["training"]["output_file"] = "dsr_{}_{}.csv".format(name, seed)

# Try importing TensorFlow (with suppressed warnings), Controller, and learn
# When parallelizing across tasks, these will already be imported, hence try/except
try:
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
from dsr.controller import Controller
from dsr.train import learn
except ModuleNotFoundError: # Specific subclass of ImportError for when module is not found, probably needs to be excepted first
print("One or more libraries not found")
raise ModuleNotFoundError
except ImportError:
# Have we already imported tf? If so, this is the error we want to dodge.
if 'tf' in globals():
pass
else:
raise ImportError

# Train the model
model = DeepSymbolicOptimizer(config)
start = time.time()
result = {"name" : name, "seed" : seed} # Name and seed are listed first
result.update(model.train(seed=seed))
result["t"] = time.time() - start
result.pop("program")

return result


def train_gp(name_and_seed, logdir, config_task, config_gp):
"""Trains GP and returns dict of reward, expression, and program"""

name, seed = name_and_seed
config_gp["seed"] = seed + zlib.adler32(name.encode("utf-8"))

start = time.time()

# Load the dataset
config_dataset = config_task["dataset"]
config_dataset["name"] = name
dataset = BenchmarkDataset(**config_dataset)

# Fit the GP
gp = gpsr.GP(dataset=dataset, **config_gp)
p, logbook = gp.train()

# Retrieve results
r = base_r = p.fitness.values[0]
str_p = str(p)
nmse_test = gp.nmse_test(p)[0]
nmse_test_noiseless = gp.nmse_test_noiseless(p)[0]
success = gp.success(p)

# Many failure cases right now for converting to SymPy expression
try:
expression = repr(parse_expr(str_p.replace("X", "x").replace("add", "Add").replace("mul", "Mul")))
except:
expression = "N/A"

# Save run details
drop = ["gen", "nevals"]
df_fitness = pd.DataFrame(logbook.chapters["fitness"]).drop(drop, axis=1)
df_fitness = df_fitness.rename({"avg" : "fit_avg", "min" : "fit_min"}, axis=1)
df_fitness["fit_best"] = df_fitness["fit_min"].cummin()
df_len = pd.DataFrame(logbook.chapters["size"]).drop(drop, axis=1)
df_len = df_len.rename({"avg" : "l_avg"}, axis=1)
df = pd.concat([df_fitness, df_len], axis=1, sort=False)
df.to_csv(os.path.join(logdir, "gp_{}_{}.csv".format(name, seed)), index=False)

result = {
"name" : name,
"seed" : seed,
"r" : r,
"base_r" : base_r,
"nmse_test" : nmse_test,
"nmse_test_noiseless" : nmse_test_noiseless,
"success" : success,
"expression" : expression,
"traversal" : str_p,
"t" : time.time() - start
}

return result


@click.command()
@click.argument('config_template', default="config.json")
@click.option('--method', default="dsr", type=click.Choice(["dsr", "gp"]), help="Symbolic regression method")
@click.option('--mc', default=1, type=int, help="Number of Monte Carlo trials for each benchmark")
@click.option('--output_filename', default=None, help="Filename to write results")
@click.option('--n_cores_task', '--n', default=1, help="Number of cores to spread out across tasks")
@click.option('--seed_shift', default=0, type=int, help="Integer to add to each seed (i.e. to combine multiple runs)")
@click.option('--b', multiple=True, type=str, help="Name of benchmark or benchmark prefix")
def main(config_template, method, mc, output_filename, n_cores_task, seed_shift, b):
"""Runs DSR or GP on multiple benchmarks using multiprocessing."""

# Load the config file
with open(config_template, encoding='utf-8') as f:
config = json.load(f)

# Required configs
config_task = config["task"] # Task specification parameters
config_training = config["training"] # Training hyperparameters

# Optional configs
config_controller = config.get("controller") # Controller hyperparameters
config_language_model_prior = config.get("language_model_prior") # Language model hyperparameters
config_gp = config.get("gp") # GP hyperparameters

# Create output directories
if output_filename is None:
output_filename = "benchmark_{}.csv".format(method)
config_training["logdir"] = os.path.join(
config_training["logdir"],
"log_{}".format(datetime.now().strftime("%Y-%m-%d-%H%M%S")))
logdir = config_training["logdir"]
if "dataset" in config_task and "backup" in config_task["dataset"] and config_task["dataset"]["backup"]:
config_task["dataset"]["logdir"] = logdir
os.makedirs(logdir, exist_ok=True)
output_filename = os.path.join(logdir, output_filename)
# Use benchmark name from config if not specified as command-line arg
if len(b) == 0:
if isinstance(config_task["name"], str):
b = (config_task["name"],)
elif isinstance(config_task["name"], list):
b = tuple(config_task["name"])

# Shortcut to run all Nguyen benchmarks
benchmarks = list(b)
if "Nguyen" in benchmarks:
benchmarks.remove("Nguyen")
benchmarks += ["Nguyen-{}".format(i+1) for i in range(12)]

# Generate benchmark-seed pairs for each MC. When passed to the TF RNG,
# seeds will be added to checksums on the benchmark names
unique_benchmarks = benchmarks.copy()
benchmarks *= mc
seeds = (np.arange(mc) + seed_shift).repeat(len(unique_benchmarks)).tolist()
names_and_seeds = list(zip(benchmarks, seeds))

# Edit n_cores_task and/or n_cores_batch
if n_cores_task == -1:
n_cores_task = multiprocessing.cpu_count()
if n_cores_task > len(benchmarks):
print("Setting 'n_cores_task' to {} for batch because there are only {} benchmarks.".format(len(benchmarks), len(benchmarks)))
n_cores_task = len(benchmarks)
if method == "dsr":
if config_training["verbose"] and n_cores_task > 1:
print("Setting 'verbose' to False for parallelized run.")
config_training["verbose"] = False
if config_training["n_cores_batch"] != 1 and n_cores_task > 1:
print("Setting 'n_cores_batch' to 1 to avoid nested child processes.")
config_training["n_cores_batch"] = 1
print("Running {} for n={} on benchmarks {}".format(method, mc, unique_benchmarks))

# Write terminal command and config.json into log directory
cmd_filename = os.path.join(logdir, "cmd.out")
with open(cmd_filename, 'w') as f:
print(" ".join(sys.argv), file=f)
config_filename = os.path.join(logdir, "config.json")
with open(config_filename, 'w') as f:
json.dump(config, f, indent=4)

# Define the work
if method == "dsr":
work = partial(train_dsr, config=config)
elif method == "gp":
work = partial(train_gp, logdir=logdir, config_task=config_task, config_gp=config_gp)

# Farm out the work
write_header = True
if n_cores_task > 1:
pool = multiprocessing.Pool(n_cores_task)
for result in pool.imap_unordered(work, names_and_seeds):
pd.DataFrame(result, index=[0]).to_csv(output_filename, header=write_header, mode='a', index=False)
print("Completed {} ({} of {}) in {:.0f} s".format(result["name"], result["seed"]+1-seed_shift, mc, result["t"]))
write_header = False
else:
for name_and_seed in names_and_seeds:
result = work(name_and_seed)
pd.DataFrame(result, index=[0]).to_csv(output_filename, header=write_header, mode='a', index=False)
write_header = False

print("Results saved to: {}".format(output_filename))


if __name__ == "__main__":
main()
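For reference, the benchmark/seed pairing above interleaves every benchmark within each Monte Carlo trial. A minimal standalone sketch of the same logic, using assumed example values:

import numpy as np

benchmarks = ["Nguyen-1", "Nguyen-2"]   # e.g. from --b or the config's task name
mc, seed_shift = 3, 0
unique_benchmarks = benchmarks.copy()
benchmarks = benchmarks * mc            # [b1, b2, b1, b2, b1, b2]
seeds = (np.arange(mc) + seed_shift).repeat(len(unique_benchmarks)).tolist()
print(list(zip(benchmarks, seeds)))
# [('Nguyen-1', 0), ('Nguyen-2', 0), ('Nguyen-1', 1), ('Nguyen-2', 1), ('Nguyen-1', 2), ('Nguyen-2', 2)]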
120 changes: 120 additions & 0 deletions dsr/dsr/subroutines.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,120 @@
"""Numba-compiled subroutines used for deep symbolic optimization."""

from numba import jit, prange
import numpy as np


@jit(nopython=True, parallel=True)
def parents_siblings(tokens, arities, parent_adjust):
"""
Given a batch of action sequences, computes and returns the parents and
siblings of the next element of the sequence.
The batch has shape (N, L), where N is the number of sequences (i.e. batch
size) and L is the length of each sequence. Some expressions in the batch may
already be complete; for those, this function effectively sees the start of a
new expression. Their return values do not matter, because their gradients are
masked out via sequence_length.
Parameters
----------
tokens : np.ndarray, shape=(N, L), dtype=np.int32
Batch of action sequences. Values correspond to library indices.
arities : np.ndarray, dtype=np.int32
Array of arities corresponding to library indices.
parent_adjust : np.ndarray, dtype=np.int32
Array of parent sub-library index corresponding to library indices.
Returns
-------
adj_parents : np.ndarray, shape=(N,), dtype=np.int32
Adjusted parents of the next element of each action sequence.
siblings : np.ndarray, shape=(N,), dtype=np.int32
Siblings of the next element of each action sequence.
"""
N, L = tokens.shape

empty_parent = np.max(parent_adjust) + 1 # Empty token is after all non-empty tokens
empty_sibling = len(arities) # Empty token is after all non-empty tokens
adj_parents = np.full(shape=(N,), fill_value=empty_parent, dtype=np.int32)
siblings = np.full(shape=(N,), fill_value=empty_sibling, dtype=np.int32)
# Parallelized loop over action sequences
for r in prange(N):
arity = arities[tokens[r, -1]]
if arity > 0: # Parent is the previous element; no sibling
adj_parents[r] = parent_adjust[tokens[r, -1]]
continue
dangling = 0
# Loop over elements in an action sequence
for c in range(L):
arity = arities[tokens[r, L - c - 1]]
dangling += arity - 1
if dangling == 0: # Parent is L-c-1, sibling is the next
adj_parents[r] = parent_adjust[tokens[r, L - c - 1]]
siblings[r] = tokens[r, L - c]
break
return adj_parents, siblings
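A minimal usage sketch for parents_siblings, using a hypothetical three-token library (0 = add with arity 2, 1 = sin with arity 1, 2 = x1 with arity 0); the library values below are assumptions for illustration only:

import numpy as np
from dsr.subroutines import parents_siblings

arities = np.array([2, 1, 0], dtype=np.int32)          # arity per library index
parent_adjust = np.array([0, 1, -1], dtype=np.int32)   # parent sub-library index; -1 is a placeholder (terminals are never parents)
tokens = np.array([[0, 2]], dtype=np.int32)            # one partial sequence: [add, x1]
adj_parents, siblings = parents_siblings(tokens, arities, parent_adjust)
print(adj_parents, siblings)  # [0] [2]: the next token fills add's second slot, so its parent is add and its sibling is x1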


@jit(nopython=True, parallel=True)
def ancestors(actions, arities, ancestor_tokens):
"""
Given a batch of action sequences, determines whether the next element of
the sequence has an ancestor in ancestor_tokens.
The batch has shape (N, L), where N is the number of sequences (i.e. batch
size) and L is the length of each sequence. Some expressions in the batch may
already be complete; for those, this function effectively sees the start of a
new expression. Their return values do not matter, because their gradients are
masked out via sequence_length.
Parameters
----------
actions : np.ndarray, shape=(N, L), dtype=np.int32
Batch of action sequences. Values correspond to library indices.
arities : np.ndarray, dtype=np.int32
Array of arities corresponding to library indices.
ancestor_tokens : np.ndarray, dtype=np.int32
Array of ancestor library indices to check.
Returns
-------
mask : np.ndarray, shape=(N,), dtype=np.bool_
Mask of whether the next element of each sequence has an ancestor in
ancestor_tokens.
"""

N, L = actions.shape
mask = np.zeros(shape=(N,), dtype=np.bool_)
# Parallelized loop over action sequences
for r in prange(N):
dangling = 0
threshold = None # If None, the current branch does not have an ancestor in ancestor_tokens
for c in range(L):
arity = arities[actions[r, c]]
dangling += arity - 1
# Turn "on" if a trig function is found
# Remain "on" until branch completes
if threshold is None:
for trig_token in ancestor_tokens:
if actions[r, c] == trig_token:
threshold = dangling - 1
break
# Turn "off" once the branch completes
else:
if dangling == threshold:
threshold = None
# If the sequence ended "on", the next token has an ancestor in ancestor_tokens
if threshold is not None:
mask[r] = True
return mask
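Similarly, a minimal sketch for ancestors with the same hypothetical library, checking whether the next token would have sin as an ancestor:

import numpy as np
from dsr.subroutines import ancestors

arities = np.array([2, 1, 0], dtype=np.int32)
ancestor_tokens = np.array([1], dtype=np.int32)  # treat sin as the ancestor of interest
print(ancestors(np.array([[1, 0, 2]], dtype=np.int32), arities, ancestor_tokens))     # [ True]: sin(add(x1, .)) is still open
print(ancestors(np.array([[1, 0, 2, 2]], dtype=np.int32), arities, ancestor_tokens))  # [False]: sin(add(x1, x1)) is complete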
1 change: 1 addition & 0 deletions dsr/dsr/task/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
from dsr.task.task import make_task, set_task, Task
Empty file.
38 changes: 38 additions & 0 deletions dsr/dsr/task/regression/benchmarks.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
name,variables,expression,train_spec,test_spec,function_set
Nguyen-1,1,"pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-2,1,"pow(x1,4)+pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-3,1,"pow(x1,5)+pow(x1,4)+pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-4,1,"pow(x1,6)+pow(x1,5)+pow(x1,4)+pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-5,1,"sin(pow(x1,2))*cos(x1)-1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-6,1,"sin(x1)+sin(x1+pow(x1,2))","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-7,1,"log(x1+1)+log(pow(x1,2)+1)","{""all"":{""U"":[0,2,20]}}",None,Koza
Nguyen-8,1,sqrt(x1),"{""all"":{""U"":[0,4,20]}}",None,Koza
Nguyen-9,2,"sin(x1)+sin(pow(x2,2))","{""all"":{""U"":[0,1,20]}}",None,Koza
Nguyen-10,2,2*sin(x1)*cos(x2),"{""all"":{""U"":[0,1,20]}}",None,Koza
Nguyen-11,2,"pow(x1,x2)","{""all"":{""U"":[0,1,20]}}",None,Koza
Nguyen-12,2,"pow(x1,4)-pow(x1,3)+div(pow(x2,2),2)-x2","{""all"":{""U"":[0,1,20]}}",None,Koza
Nguyen-2a,1,"4*pow(x1,4)+3*pow(x1,3)+2*pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-5a,1,"sin(pow(x1,2))*cos(x1)-2","{""all"":{""U"":[-1,1,20]}}",None,Koza
Nguyen-8a,1,"pow(x1,1/3)","{""all"":{""U"":[0,4,20]}}",None,Koza
Nguyen-8aa,1,"pow(x1,2/3)","{""all"":{""U"":[0,4,20]}}",None,Koza
Nguyen-1c,1,"3.39*pow(x1,3)+2.12*pow(x1,2)+1.78*x1","{""all"":{""U"":[-1,1,20]}}",None,CKoza
Nguyen-5c,1,"sin(pow(x1,2))*cos(x1)-0.75","{""all"":{""U"":[-1,1,20]}}",None,CKoza
Nguyen-7c,1,"log(x1+1.4)+log(pow(x1,2)+1.3)","{""all"":{""U"":[0,2,20]}}",None,CKoza
Nguyen-8c,1,sqrt(1.23*x1),"{""all"":{""U"":[0,4,20]}}",None,CKoza
Nguyen-10c,2,sin(1.5*x1)*cos(0.5*x2),"{""all"":{""U"":[0,1,20]}}",None,CKoza
GrammarVAE-1,1,"1./3+x1+sin(pow(x1,2))","{""all"":{""E"":[-10,10,1000]}}",None,GrammarVAE
Jin-1,2,"2.5*pow(x1,4)-1.3*pow(x1,3)+0.5*pow(x2,2)-1.7*x2","{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Jin-2,2,"8.0*pow(x1,2)+8.0*pow(x2,3)-15.0","{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Jin-3,2,"0.2*pow(x1,3)+0.5*pow(x2,3)-1.2*x2-0.5*x1","{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Jin-4,2,1.5*exp(x1)+5.0*cos(x2),"{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Jin-5,2,6.0*sin(x1)*cos(x2),"{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Jin-6,2,1.35*x1*x2+5.5*sin((x1-1.0)*(x2-1.0)),"{""all"":{""U"":[-3.0,3.0,100]}}","{""all"":{""U"":[-3.0,3.0,30]}}",Jin
Neat-1,1,"pow(x1,4)+pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,KozaPlus1
Neat-2,1,"pow(x1,5)+pow(x1,4)+pow(x1,3)+pow(x1,2)+x1","{""all"":{""U"":[-1,1,20]}}",None,KozaPlus1
Neat-3,1,"sin(pow(x1,2))*cos(x1)-1","{""all"":{""U"":[-1,1,20]}}",None,KozaPlus1
Neat-4,1,"log(x1+1)+log(pow(x1,2)+1)","{""all"":{""U"":[0,2,20]}}",None,KozaPlus1
Neat-5,2,2*sin(x1)*cos(x2),"{""all"":{""U"":[-1,1,100]}}",None,Koza
Neat-6,1,harmonic(x1),"{""all"":{""E"":[1,50,1]}}","{""all"":{""E"":[1,120,1]}}",KeijzerPlus1
Neat-7,2,2-2.1*cos(9.8*x1)*sin(1.3*x2),"{""all"":{""U"":[-50,50,10000]}}",None,Korns
Neat-8,2,"div(exp(-pow(x1-1,2)),(1.2+pow((x2-2.5),2)))","{""all"":{""U"":[0.3,4,100]}}",None,Vladislavleva-B
Neat-9,2,"div(1,(1+pow(x1,-4)))+div(1,(1+pow(x2,-4)))","{""all"":{""E"":[-5,5,0.4]}}",None,Koza
274 changes: 274 additions & 0 deletions dsr/dsr/task/regression/dataset.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
"""Class for deterministically generating a benchmark dataset from benchmark specifications."""

import os
import ast
import itertools
from pkg_resources import resource_filename
import zlib

import click
import pandas as pd
import numpy as np

from dsr.functions import function_map


class BenchmarkDataset(object):
"""
Class used to generate (X, y) data from a named benchmark expression.
Parameters
----------
name : str
Name of benchmark expression.
benchmark_source : str, optional
Filename of CSV describing benchmark expressions.
root : str, optional
Directory containing benchmark_source and function_sets.csv.
noise : float, optional
If not None, Gaussian noise is added to the y values with standard
deviation = noise * RMS of the noiseless y training values.
dataset_size_multiplier : float, optional
Multiplier for size of the dataset.
seed : int, optional
Random number seed used to generate data. Checksum on name is added to
seed.
logdir : str, optional
Directory where experiment logfiles are saved.
backup : bool, optional
Save generated dataset in logdir if logdir is provided.
"""

def __init__(self, name, benchmark_source="benchmarks.csv", root=None, noise=0.0,
dataset_size_multiplier=1.0, seed=0, logdir=None,
backup=False):
# Set class variables
self.name = name
self.seed = seed
self.noise = noise if noise is not None else 0.0
self.dataset_size_multiplier = dataset_size_multiplier if dataset_size_multiplier is not None else 1.0

# Set random number generator used for sampling X values
seed += zlib.adler32(name.encode("utf-8")) # Different seed for each name, otherwise two benchmarks with the same domain will always have the same X values
self.rng = np.random.RandomState(seed)

# Load benchmark data
if root is None:
root = resource_filename("dsr.task", "regression")
benchmark_path = os.path.join(root, benchmark_source)
benchmark_df = pd.read_csv(benchmark_path, index_col=0, encoding="ISO-8859-1")
row = benchmark_df.loc[name]
self.n_input_var = row["variables"]

# Create symbolic expression
self.numpy_expr = self.make_numpy_expr(row["expression"])

# Create X values
train_spec = ast.literal_eval(row["train_spec"])
test_spec = ast.literal_eval(row["test_spec"])
if test_spec is None:
test_spec = train_spec
self.X_train = self.make_X(train_spec)
self.X_test = self.make_X(test_spec)
self.train_spec = train_spec
self.test_spec = test_spec

# Compute y values
self.y_train = self.numpy_expr(self.X_train)
self.y_test = self.numpy_expr(self.X_test)
self.y_train_noiseless = self.y_train.copy()
self.y_test_noiseless = self.y_test.copy()

# Add Gaussian noise
if self.noise > 0:
y_rms = np.sqrt(np.mean(self.y_train**2))
scale = self.noise * y_rms
self.y_train += self.rng.normal(loc=0, scale=scale, size=self.y_train.shape)
self.y_test += self.rng.normal(loc=0, scale=scale, size=self.y_test.shape)
elif self.noise < 0:
print('WARNING: Ignoring negative noise value: {}'.format(self.noise))

# Load default function set
function_set_path = os.path.join(root, "function_sets.csv")
function_set_df = pd.read_csv(function_set_path, index_col=0)
function_set_name = row["function_set"]
self.function_set = function_set_df.loc[function_set_name].tolist()[0].strip().split(',')

# Prepare status output
output_message = '\n-- Building dataset -----------------\n'
output_message += 'Benchmark path : {}\n'.format(benchmark_path)
output_message += 'Generated data for benchmark : {}\n'.format(name)
output_message += 'Function set path : {}\n'.format(function_set_path)
output_message += 'Function set : {} --> {}\n'.format(function_set_name, self.function_set)
if backup and logdir is not None:
output_message += self.save(logdir)
output_message += '-------------------------------------\n\n'
print(output_message)

def make_X(self, spec):
"""Creates X values based on specification"""

features = []
for i in range(1, self.n_input_var + 1):

# Hierarchy: "all" --> "x{}".format(i)
input_var = "x{}".format(i)
if "all" in spec:
input_var = "all"
elif input_var not in spec:
input_var = "x1"

if "U" in spec[input_var]:
low, high, n = spec[input_var]["U"]
n = int(n * self.dataset_size_multiplier)
feature = self.rng.uniform(low=low, high=high, size=n)
elif "E" in spec[input_var]:
start, stop, step = spec[input_var]["E"]
if step > stop - start:
n = step
else:
n = int((stop - start)/step) + 1
n = int(n * self.dataset_size_multiplier)
feature = np.linspace(start=start, stop=stop, num=n, endpoint=True)
else:
raise ValueError("Did not recognize specification for {}: {}.".format(input_var, spec[input_var]))
features.append(feature)

# Do multivariable combinations
if "E" in spec[input_var] and self.n_input_var > 1:
X = np.array(list(itertools.product(*features)))
else:
X = np.column_stack(features)

return X
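To make the spec format concrete: a "U" entry samples uniformly at random, while an "E" entry builds an evenly spaced grid. A minimal sketch of what the "U" branch above does for the train_spec string used by most Nguyen benchmarks (the seed and multiplier values here are illustrative only):

import ast
import numpy as np

spec = ast.literal_eval('{"all":{"U":[-1,1,20]}}')  # train_spec string from benchmarks.csv
low, high, n = spec["all"]["U"]
rng = np.random.RandomState(0)                      # illustrative seed
feature = rng.uniform(low=low, high=high, size=int(n * 1.0))  # dataset_size_multiplier = 1.0
X = np.column_stack([feature])                      # one column per input variable
print(X.shape)  # (20, 1)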

def make_numpy_expr(self, s):
# This isn't pretty, but unlike sympy's lambdify, this ensures we use
# our protected functions. Otherwise, some expressions may have large
# error even if the functional form is correct due to the training set
# not using protected functions.

# Replace function names
s = s.replace("ln(", "log(")
s = s.replace("pi", "np.pi")
s = s.replace("pow", "np.power")
for k in function_map.keys():
s = s.replace(k + '(', "function_map['{}'].function(".format(k))

# Replace variable names
for i in reversed(range(self.n_input_var)):
old = "x{}".format(i+1)
new = "x[:, {}]".format(i)
s = s.replace(old, new)

numpy_expr = lambda x : eval(s)

return numpy_expr

def save(self, logdir='./'):
save_path = os.path.join(logdir,'data_{}_n{:.2f}_d{:.0f}_s{}.csv'.format(
self.name, self.noise, self.dataset_size_multiplier, self.seed))
try:
os.makedirs(logdir, exist_ok=True)
np.savetxt(
save_path,
np.concatenate(
(
np.hstack((self.X_train, self.y_train[..., np.newaxis])),
np.hstack((self.X_test, self.y_test[..., np.newaxis]))
), axis=0),
delimiter=',', fmt='%1.5f'
)
return 'Saved dataset to : {}\n'.format(save_path)
except Exception as e:
print("WARNING: Could not save dataset: {}".format(e))
return 'WARNING: Could not save dataset to : {}\n'.format(save_path)

def plot(self, logdir='./'):
"""Plot Dataset with underlying ground truth."""
if self.X_train.shape[1] == 1:
from matplotlib import pyplot as plt
save_path = os.path.join(logdir,'plot_{}_n{:.2f}_d{:.0f}_s{}.png'.format(
self.name, self.noise, self.dataset_size_multiplier, self.seed))

# Draw ground truth expression
bounds = list(list(self.train_spec.values())[0].values())[0][:2]
x = np.linspace(bounds[0], bounds[1], endpoint=True, num=100)
y = self.numpy_expr(x[:, None])
plt.plot(x, y, color='red', linestyle='dashed')
# Draw the actual points
plt.scatter(self.X_train, self.y_train)
# Add a title
plt.title(
"{} N:{} M:{} S:{}".format(
self.name, self.noise, self.dataset_size_multiplier, self.seed),
fontsize=7)
try:
os.makedirs(logdir, exist_ok=True)
plt.savefig(save_path)
print('Saved plot to : {}'.format(save_path))
except Exception as e:
print("WARNING: Could not plot dataset: {}".format(e))
plt.close()
else:
print("WARNING: Plotting only supported for 2D datasets.")


@click.command()
@click.argument("benchmark_source", default="benchmarks.csv")
@click.option('--plot', is_flag=True)
@click.option('--save_csv', is_flag=True)
@click.option('--sweep', is_flag=True)
def main(benchmark_source, plot, save_csv, sweep):
"""Plots all benchmark expressions."""

regression_path = resource_filename("dsr.task", "regression/")
benchmark_path = os.path.join(regression_path, benchmark_source)
save_dir = os.path.join(regression_path, 'log')
df = pd.read_csv(benchmark_path, encoding="ISO-8859-1")
names = df["name"].to_list()
for name in names:

if not name.startswith("Nguyen") and not name.startswith("Constant") and not name.startswith("Custom"):
continue

datasets = []

# Noiseless
d = BenchmarkDataset(
name=name,
benchmark_source=benchmark_source)
datasets.append(d)

# Generate all combinations of noise levels and dataset size multipliers
if sweep and name.startswith("Nguyen"):
noises = [0.0, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
dataset_size_multipliers = [1.0, 10.0]
for noise in noises:
for dataset_size_multiplier in dataset_size_multipliers:
d = BenchmarkDataset(
name=name,
benchmark_source=benchmark_source,
noise=noise,
dataset_size_multiplier=dataset_size_multiplier,
backup=save_csv,
logdir=save_dir)
datasets.append(d)

# Plot and/or save datasets
for dataset in datasets:
if plot and dataset.X_train.shape[1] == 1:
dataset.plot(save_dir)

if __name__ == "__main__":
main()
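A minimal sketch of building a benchmark dataset directly (this assumes the dsr package is installed so that benchmarks.csv and function_sets.csv resolve via pkg_resources):

from dsr.task.regression.dataset import BenchmarkDataset

d = BenchmarkDataset(name="Nguyen-5", noise=0.0, seed=0)
print(d.X_train.shape, d.y_train.shape)  # (20, 1) (20,) for the default Nguyen-5 spec
print(d.function_set)                    # ['add', 'sub', 'mul', 'div', 'sin', 'cos', 'exp', 'log'] (Koza)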
13 changes: 13 additions & 0 deletions dsr/dsr/task/regression/function_sets.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
name,function_set
Koza,"add,sub,mul,div,sin,cos,exp,log"
CKoza,"add,sub,mul,div,sin,cos,exp,log,const"
KozaPlus1,"add,sub,mul,div,sin,cos,exp,log,1.0"
Korns,"add,sub,mul,div,sin,cos,exp,log,n2,n3,sqrt,tan,tanh,const"
Keijzer,"add,mul,inv,neg,sqrt,const"
KeijzerPlus1,"add,mul,inv,neg,sqrt,1.0,const"
Vladislavleva-A,"add,sub,mul,div,n2"
Vladislavleva-B,"add,sub,mul,div,n2,exp,expneg"
Vladislavleva-C,"add,sub,mul,div,n2,exp,expneg,sin,cos"
None,"add,sub,mul,div,sin,cos,exp,log"
Jin,"add,sub,mul,div,sin,cos,exp,n2,n3,const"
GrammarVAE,"add,mul,div,sin,exp,1.0,2.0,3.0"
352 changes: 352 additions & 0 deletions dsr/dsr/task/regression/regression.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,352 @@
import numpy as np
import pandas as pd
import scipy.stats

import dsr
from dsr.library import Library
from dsr.functions import create_tokens
from dsr.task.regression.dataset import BenchmarkDataset


def make_regression_task(name, function_set, dataset, metric="inv_nrmse",
metric_params=(1.0,), extra_metric_test=None, extra_metric_test_params=(),
reward_noise=0.0, reward_noise_type="r", threshold=1e-12,
normalize_variance=False, protected=False):
"""
Factory function for regression rewards. This includes closures for a
dataset and regression metric (e.g. inverse NRMSE). Also sets regression-
specific metrics to be used by Programs.
Parameters
----------
name : str or None
Name of regression benchmark, if using benchmark dataset.
function_set : list or None
List of allowable functions. If None, uses function_set according to
benchmark dataset.
dataset : dict, str, or tuple
If dict: .dataset.BenchmarkDataset kwargs.
If str: filename of dataset.
If tuple: (X, y) data
metric : str
Name of reward function metric to use.
metric_params : list
List of metric-specific parameters.
extra_metric_test : str
Name of extra function metric to use for testing.
extra_metric_test_params : list
List of metric-specific parameters for extra test metric.
reward_noise : float
Noise level to use when computing reward.
reward_noise_type : "y_hat" or "r"
"y_hat" : N(0, reward_noise * y_rms_train) is added to y_hat values.
"r" : N(0, reward_noise) is added to r.
normalize_variance : bool
If True and reward_noise_type=="r", reward is multiplied by
1 / sqrt(1 + 12*reward_noise**2) (We assume r is U[0,1]).
protected : bool
Whether to use protected functions.
threshold : float
Threshold of NMSE on noiseless data used to determine success.
Returns
-------
task : Task
Dynamically created Task object whose methods contains closures.
"""

X_test = y_test = y_test_noiseless = None

# Benchmark dataset config
if isinstance(dataset, dict):
dataset["name"] = name
benchmark = BenchmarkDataset(**dataset)
X_train = benchmark.X_train
y_train = benchmark.y_train
X_test = benchmark.X_test
y_test = benchmark.y_test
y_test_noiseless = benchmark.y_test_noiseless

# Unless specified, use the benchmark's default function_set
if function_set is None:
function_set = benchmark.function_set

# Dataset filename
elif isinstance(dataset, str):
df = pd.read_csv(dataset, header=None) # Assuming data file does not have header rows
X_train = df.values[:, :-1]
y_train = df.values[:, -1]

# sklearn-like (X, y) data
elif isinstance(dataset, tuple):
X_train = dataset[0]
y_train = dataset[1]

if X_test is None:
X_test = X_train
y_test = y_train
y_test_noiseless = y_test

if function_set is None:
print("WARNING: Function set not provided. Using default set.")
function_set = ["add", "sub", "mul", "div", "sin", "cos", "exp", "log"]

# Save time by only computing these once
var_y_test = np.var(y_test)
var_y_test_noiseless = np.var(y_test_noiseless)

# Define closures for metric
metric, invalid_reward, max_reward = make_regression_metric(metric, y_train, *metric_params)
if extra_metric_test is not None:
print("Setting extra test metric to {}.".format(extra_metric_test))
metric_test, _, _ = make_regression_metric(extra_metric_test, y_test, *extra_metric_test_params)
assert reward_noise >= 0.0, "Reward noise must be non-negative."
if reward_noise:
assert reward_noise_type in ["y_hat", "r"], "Reward noise type not recognized."
rng = np.random.RandomState(0)
y_rms_train = np.sqrt(np.mean(y_train ** 2))
if reward_noise_type == "y_hat":
scale = reward_noise * y_rms_train
elif reward_noise_type == "r":
scale = reward_noise

def reward(p):

# Compute estimated values
y_hat = p.execute(X_train)

# For invalid expressions, return invalid_reward
if p.invalid:
return invalid_reward

### Observation noise
# For reward_noise_type == "y_hat", success must always be checked to
# ensure success cases aren't overlooked due to noise. If successful,
# return max_reward.
if reward_noise and reward_noise_type == "y_hat":
if p.evaluate.get("success"):
return max_reward
y_hat += rng.normal(loc=0, scale=scale, size=y_hat.shape)

# Compute metric
r = metric(y_train, y_hat)

### Direct reward noise
# For reward_noise_type == "r", success can for ~max_reward metrics be
# confirmed before adding noise. If successful, must return np.inf to
# avoid overlooking success cases.
if reward_noise and reward_noise_type == "r":
if r >= max_reward - 1e-5 and p.evaluate.get("success"):
return np.inf
r += rng.normal(loc=0, scale=scale)
if normalize_variance:
r /= np.sqrt(1 + 12*scale**2)

return r


def evaluate(p):

# Compute predictions on test data
y_hat = p.execute(X_test)
if p.invalid:
nmse_test = None
nmse_test_noiseless = None
success = False

else:
# NMSE on test data (used to report final error)
nmse_test = np.mean((y_test - y_hat)**2) / var_y_test

# NMSE on noiseless test data (used to determine recovery)
nmse_test_noiseless = np.mean((y_test_noiseless - y_hat)**2) / var_y_test_noiseless

# Success is defined by NMSE on noiseless test data below a threshold
success = nmse_test_noiseless < threshold

info = {
"nmse_test" : nmse_test,
"nmse_test_noiseless" : nmse_test_noiseless,
"success" : success
}

if extra_metric_test is not None:
if p.invalid:
m_test = None
m_test_noiseless = None
else:
m_test = metric_test(y_test, y_hat)
m_test_noiseless = metric_test(y_test_noiseless, y_hat)

info.update(
{
extra_metric_test : m_test,
extra_metric_test + '_noiseless' : m_test_noiseless
}
)

return info

tokens = create_tokens(n_input_var=X_train.shape[1],
function_set=function_set,
protected=protected)
library = Library(tokens)

stochastic = reward_noise > 0.0

extra_info = {}

task = dsr.task.Task(reward_function=reward,
evaluate=evaluate,
library=library,
stochastic=stochastic,
extra_info=extra_info)

return task


def make_regression_metric(name, y_train, *args):
"""
Factory function for a regression metric. This includes closures for the
metric parameters and the variance of the training data.
Parameters
----------
name : str
Name of metric. See all_metrics for supported metrics.
args : args
Metric-specific parameters
Returns
-------
metric : function
Regression metric mapping true and estimated values to a scalar.
invalid_reward: float or None
Reward value to use for invalid expression. If None, the training
algorithm must handle it, e.g. by rejecting the sample.
max_reward: float
Maximum possible reward under this metric.
"""

var_y = np.var(y_train)

all_metrics = {

# Negative mean squared error
# Range: [-inf, 0]
# Value = -var(y) when y_hat == mean(y)
"neg_mse" : (lambda y, y_hat : -np.mean((y - y_hat)**2),
0),

# Negative root mean squared error
# Range: [-inf, 0]
# Value = -sqrt(var(y)) when y_hat == mean(y)
"neg_rmse" : (lambda y, y_hat : -np.sqrt(np.mean((y - y_hat)**2)),
0),

# Negative normalized mean squared error
# Range: [-inf, 0]
# Value = -1 when y_hat == mean(y)
"neg_nmse" : (lambda y, y_hat : -np.mean((y - y_hat)**2)/var_y,
0),

# Negative normalized root mean squared error
# Range: [-inf, 0]
# Value = -1 when y_hat == mean(y)
"neg_nrmse" : (lambda y, y_hat : -np.sqrt(np.mean((y - y_hat)**2)/var_y),
0),

# (Protected) negative log mean squared error
# Range: [-inf, 0]
# Value = -log(1 + var(y)) when y_hat == mean(y)
"neglog_mse" : (lambda y, y_hat : -np.log(1 + np.mean((y - y_hat)**2)),
0),

# (Protected) inverse mean squared error
# Range: [0, 1]
# Value = 1/(1 + args[0]*var(y)) when y_hat == mean(y)
"inv_mse" : (lambda y, y_hat : 1/(1 + args[0]*np.mean((y - y_hat)**2)),
1),

# (Protected) inverse normalized mean squared error
# Range: [0, 1]
# Value = 1/(1 + args[0]) when y_hat == mean(y)
"inv_nmse" : (lambda y, y_hat : 1/(1 + args[0]*np.mean((y - y_hat)**2)/var_y),
1),

# (Protected) inverse normalized root mean squared error
# Range: [0, 1]
# Value = 1/(1 + args[0]) when y_hat == mean(y)
"inv_nrmse" : (lambda y, y_hat : 1/(1 + args[0]*np.sqrt(np.mean((y - y_hat)**2)/var_y)),
1),

# Fraction of predicted points within p0*abs(y) + p1 band of the true value
# Range: [0, 1]
"fraction" : (lambda y, y_hat : np.mean(abs(y - y_hat) < args[0]*abs(y) + args[1]),
2),

# Pearson correlation coefficient
# Range: [-1, 1]
"pearson" : (lambda y, y_hat : scipy.stats.pearsonr(y, y_hat)[0],
0),

# Spearman correlation coefficient
# Range: [-1, 1]
"spearman" : (lambda y, y_hat : scipy.stats.spearmanr(y, y_hat)[0],
0)
}

assert name in all_metrics, "Unrecognized reward function name."
assert len(args) == all_metrics[name][1], "For {}, expected {} reward function parameters; received {}.".format(name,all_metrics[name][1], len(args))
metric = all_metrics[name][0]

# For negative MSE-based rewards, invalid reward is the value of the reward function when y_hat = mean(y)
# For inverse MSE-based rewards, invalid reward is 0.0
# For non-MSE-based rewards, invalid reward is the minimum value of the reward function's range
all_invalid_rewards = {
"neg_mse" : -var_y,
"neg_rmse" : -np.sqrt(var_y),
"neg_nmse" : -1.0,
"neg_nrmse" : -1.0,
"neglog_mse" : -np.log(1 + var_y),
"inv_mse" : 0.0, #1/(1 + args[0]*var_y),
"inv_nmse" : 0.0, #1/(1 + args[0]),
"inv_nrmse" : 0.0, #1/(1 + args[0]),
"fraction" : 0.0,
"pearson" : 0.0,
"spearman" : 0.0
}
invalid_reward = all_invalid_rewards[name]

all_max_rewards = {
"neg_mse" : 0.0,
"neg_rmse" : 0.0,
"neg_nmse" : 0.0,
"neg_nrmse" : 0.0,
"neglog_mse" : 0.0,
"inv_mse" : 1.0,
"inv_nmse" : 1.0,
"inv_nrmse" : 1.0,
"fraction" : 1.0,
"pearson" : 1.0,
"spearman" : 1.0
}
max_reward = all_max_rewards[name]

return metric, invalid_reward, max_reward
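A minimal sketch of the metric factory's contract, using inv_nrmse with weight 1.0 on a toy target (values are illustrative):

import numpy as np
from dsr.task.regression.regression import make_regression_metric

y = np.array([1.0, 2.0, 3.0])
metric, invalid_reward, max_reward = make_regression_metric("inv_nrmse", y, 1.0)
print(metric(y, y))                          # 1.0: a perfect fit attains max_reward
print(metric(y, np.full_like(y, y.mean())))  # 0.5: predicting the mean gives 1 / (1 + 1.0)
print(invalid_reward, max_reward)            # 0.0 1.0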
35 changes: 35 additions & 0 deletions dsr/dsr/task/regression/sklearn.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
from copy import deepcopy

from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_is_fitted

from dsr import DeepSymbolicOptimizer


class DeepSymbolicRegressor(DeepSymbolicOptimizer,
BaseEstimator, RegressorMixin):
"""
Sklearn interface for deep symbolic regression.
"""

def __init__(self, config=None):
DeepSymbolicOptimizer.__init__(self, config)

def fit(self, X, y):

# Update the Task
config = deepcopy(self.config)
config["task"]["task_type"] = "regression"
config["task"]["dataset"] = (X, y)
self.update_config(config)

train_result = self.train()
self.program_ = train_result["program"]

return self

def predict(self, X):

check_is_fitted(self, "program_")

return self.program_.execute(X)
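A minimal sketch of the resulting sklearn-style workflow (this assumes a config.json whose task section is set up for regression; fit runs the full DSR training loop, so expect it to take a while):

import numpy as np
from dsr import DeepSymbolicRegressor

X = np.random.random(size=(100, 2))
y = X[:, 0] ** 2 + np.sin(X[:, 1])

model = DeepSymbolicRegressor("config.json")  # config path is an assumption
model.fit(X, y)                               # replaces the task's dataset with (X, y), then trains
y_hat = model.predict(X)                      # executes the best discovered program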
24 changes: 24 additions & 0 deletions dsr/dsr/task/regression/test_sklearn.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
"""Tests for sklearn interface."""

import pytest
import numpy as np

from dsr import DeepSymbolicRegressor
from dsr.test.generate_test_data import CONFIG_TRAINING_OVERRIDE


@pytest.fixture
def model():
return DeepSymbolicRegressor("config.json")


def test_task(model):
"""Test regression for various configs."""

# Generate some data
np.random.seed(0)
X = np.random.random(size=(10, 3))
y = np.random.random(size=(10,))

model.config_training.update(CONFIG_TRAINING_OVERRIDE)
model.fit(X, y)
86 changes: 86 additions & 0 deletions dsr/dsr/task/task.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,86 @@
"""Factory functions for generating symbolic search tasks."""

from dataclasses import dataclass
from typing import Callable, List, Dict, Any

from dsr.task.regression.regression import make_regression_task
from dsr.program import Program
from dsr.library import Library


@dataclass(frozen=True)
class Task:
"""
Data object specifying a symbolic search task.
Attributes
----------
reward_function : function
Reward function mapping program.Program object to scalar. Includes
test argument for train vs test evaluation.
evaluate : function
Evaluation function mapping program.Program object to a dict of task-
specific evaluation metrics (primitives). Special optional key "success"
is used for determining early stopping during training.
library : Library
Library of Tokens.
stochastic : bool
Whether the reward function of the task is stochastic.
extra_info : dict
Extra task-specific info, e.g. reference to symbolic policies for
control task.
"""

reward_function: Callable[[Program], float]
evaluate: Callable[[Program], Dict[str, Any]]
library: Library
stochastic: bool
extra_info: Dict[str, Any]


def make_task(task_type, **config_task):
"""
Factory function for Task object.
Parameters
----------
task_type : str
Type of task:
"regression" : Symbolic regression task.
config_task : kwargs
Task-specific arguments. See specifications of task_dict. Special key
"name" is required, which defines the benchmark (i.e. dataset for
regression).
Returns
-------
task : Task
Task object.
"""

# Dictionary from task name to task factory function
task_dict = {
"regression" : make_regression_task,
}

task = task_dict[task_type](**config_task)
return task


def set_task(config_task):
"""Helper function to make set the Program class Task and execute function
from task config."""

# Use of protected functions is the same for all tasks, so it's handled separately
protected = config_task["protected"] if "protected" in config_task else False

Program.set_execute(protected)
task = make_task(**config_task)
Program.set_task(task)
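A minimal sketch of wiring a regression Task into Program using an in-memory (X, y) dataset; the config keys match make_regression_task's signature, while the values below are illustrative assumptions:

import numpy as np
from dsr.task import set_task

X = np.random.random(size=(50, 1))
y = np.sin(X[:, 0])
config_task = {"task_type": "regression", "name": None, "dataset": (X, y),
"function_set": ["add", "sub", "mul", "div", "sin", "cos"]}
set_task(config_task)  # sets Program.execute and attaches the Task to Program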
Empty file added dsr/dsr/test/__init__.py
Empty file.
Binary file added dsr/dsr/test/data/test_model.data-00000-of-00001
Binary file not shown.
Binary file added dsr/dsr/test/data/test_model.index
Binary file not shown.
Binary file added dsr/dsr/test/data/test_model.meta
Binary file not shown.
28 changes: 28 additions & 0 deletions dsr/dsr/test/generate_test_data.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
"""Generate model parity test case data for DeepSymbolicOptimizer."""

from pkg_resources import resource_filename

from dsr import DeepSymbolicOptimizer


# Shorter config run for parity test
CONFIG_TRAINING_OVERRIDE = {
"n_samples" : 1000,
"batch_size" : 100
}


def main():

# Train the model
model = DeepSymbolicOptimizer("config.json")
model.config_training.update(CONFIG_TRAINING_OVERRIDE)
model.train()

# Save the model
save_path = resource_filename("dsr.test", "data/test_model")
model.save(save_path)


if __name__ == "__main__":
main()
47 changes: 47 additions & 0 deletions dsr/dsr/test/test_core.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
"""Test cases for DeepSymbolicOptimizer on each Task."""

from pkg_resources import resource_filename

import pytest
import tensorflow as tf
import numpy as np

from dsr import DeepSymbolicOptimizer
from dsr.test.generate_test_data import CONFIG_TRAINING_OVERRIDE


@pytest.fixture
def model():
return DeepSymbolicOptimizer("config.json")


@pytest.fixture
def cached_results(model):
save_path = resource_filename("dsr.test", "data/test_model")
model.load(save_path)
results = model.sess.run(tf.trainable_variables())

return results


@pytest.mark.parametrize("config", ["config.json"])
def test_task(model, config):
"""Test that Tasks do not crash for various configs."""

model.update_config(config)
model.config_training.update({"n_samples" : 10,
"batch_size" : 5
})
model.train()


def test_model_parity(model, cached_results):
"""Compare results to last"""

model.config_training.update(CONFIG_TRAINING_OVERRIDE)
model.train()
results = model.sess.run(tf.trainable_variables())

cached_results = np.concatenate([a.flatten() for a in cached_results])
results = np.concatenate([a.flatten() for a in results])
np.testing.assert_array_almost_equal(results, cached_results)