From 4b05014f7a19a85cf78b1ea886ec0130bf174e15 Mon Sep 17 00:00:00 2001
From: "M. Eric Irrgang" <ericirrgang@gmail.com>
Date: Mon, 18 Jun 2018 19:26:36 +0300
Subject: [PATCH] gmxapi-30

Separate example/exhibition notebook from sample workflow walk-through.
Begin elaborating in example.ipynb for demonstration purposes.
---
 examples/example.ipynb                  | 269 +++------------
 examples/walkthrough.ipynb              | 420 ++++++++++++++++++++++++
 examples/{example.py => walkthrough.py} |   0
 3 files changed, 474 insertions(+), 215 deletions(-)
 create mode 100644 examples/walkthrough.ipynb
 rename examples/{example.py => walkthrough.py} (100%)

diff --git a/examples/example.ipynb b/examples/example.ipynb
index e7621bc..6c66550 100644
--- a/examples/example.ipynb
+++ b/examples/example.ipynb
@@ -4,17 +4,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# gmxapi sample workflow using restrained ensemble plugin\n",
-    "\n",
-    "In this notebook, we will walk through a workflow in which we examine a toy system (alanine-dipeptide) with several distinct regions of conformation space, then apply a restrained ensemble biased sampling method to explore the conformational ensemble near the configuration of interest.\n",
-    "\n",
-    "This system is chosen for its low computational cost and well established literature.\n",
-    "\n",
-    "The biased sampling method we will use is follows a restrained ensemble technique that applies a pair restraint between selected atoms to use an (experimentally) observable pair distribution to guide MD sampling. The restraint force is a function of the difference between the target distribution and the simulated ensemble distribution. Our intent is not to promote this biasing technique for this particular system, but rather to simultaneously demonstrate a gmxapi workflow, the gmxapi MD plug-in framework, and one of the example plugin implementations included in the sample_restraint repository. The plugin was developed for simulations requiring tens of thousands of CPU hours, but these examples run in at most a few minutes on a desktop computer.\n",
-    "\n",
-    "The `gmx` Python module is from the gmxapi package. The plugins built with this `sample_restraint` repository are bundled in a package named `myplugin`. While some users may find the restrained ensemble plugin useful, the repository is intended to serve as a template and starting point to develop custom pair restraint potentials. Hopefully, I have removed the least interesting name from the set of possible plugin names, and researchers are encouraged to change the name of the repository and the Python module.\n",
-    "\n",
-    "A note on nomenclature: In Python lingo, `myplugin` is a Python package, a Python module, and a Python C++ extension, but these classifications are not generally equivalent. In this case, the code to calculate forces is written in C++ and built into a shared object library that can be imported into Python. Python objects created with the functions in the package can be passed through gmxapi to allow GROMACS to create local (C++ compiled binary) objects supporting high-performance MD simulation to execute a specified workflow."
+    "# gmxapi Python module demonstration\n",
+    "This notebook illustrates the Python interface for gmxapi with current and planned functionality and syntax.\n",
+    "Additional design aspects are illustrated where possible."
    ]
   },
   {
@@ -23,9 +15,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "import sys\n",
-    "import os\n",
-    "import gmx"
+    "import gmx\n",
+    "import myplugin"
    ]
   },
   {
@@ -34,11 +25,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "import numpy\n",
-    "import subprocess\n",
-    "import matplotlib\n",
-    "%matplotlib inline\n",
-    "import matplotlib.pyplot as plt"
+    "# Inline Python documentation extracted from the source code.\n",
+    "help(gmx)"
    ]
   },
   {
@@ -47,20 +35,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# This only works if the gmx binary path was set in the parent process before launching the Jupyter server.\n",
-    "# \\todo Make the docker image use the jovyan user PATH\n",
-    "# def find_program(program): \n",
-    "#     \"\"\"Return the first occurrence of program in PATH or None if not found.\"\"\"\n",
-    "#     for path in os.environ[\"PATH\"].split(os.pathsep):\n",
-    "#         fpath = os.path.join(path, program)\n",
-    "#         if os.path.isfile(fpath) and os.access(fpath, os.X_OK):\n",
-    "#             return fpath\n",
-    "#     return None\n",
-    "# gmx_path = find_program(\"gmx\")\n",
-    "# if gmx_path is None:\n",
-    "#     gmx_path = find_program(\"gmx_mpi\")\n",
-    "# if gmx_path is None:\n",
-    "#     raise UserWarning(\"gmx executable not found in path.\")"
+    "# The C++ extension has automatically generated contents and signatures, plus whatever is explicitly added as docstrings.\n",
+    "help(myplugin)"
    ]
   },
   {
@@ -69,17 +45,12 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Get the path to the `gmx` executable associated with the library we linked against so that we can wrap CLI tools not yet in the API.\n",
-    "gmx_path = os.path.join(os.environ['HOME'], 'install/gromacs/bin/gmx')\n",
-    "assert os.access(gmx_path, os.X_OK)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In the following cell, we set the path to the directory where some input files have been stashed.\n",
-    "It is a subdirectory of the `examples` directory and should contain a topology, MD parameters file, and four (previously equilibrated) atomic configurations from the same alanine-dipeptide system for independent trajectories in an ensemble simulation."
+    "# Some test files are bundled with the package\n",
+    "from gmx.data import tpr_filename\n",
+    "\n",
+    "# Submodules provide helper functions to create API objects while the procedural interface evolves.\n",
+    "simulation = gmx.workflow.from_tpr(tpr_filename)\n",
+    "print(simulation)"
    ]
   },
   {
@@ -88,17 +59,13 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Make sure we've got access to the files we expect.\n",
-    "datadir = os.path.abspath('alanine-dipeptide')\n",
-    "workingdir = os.path.basename(datadir)\n",
-    "os.listdir(datadir)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "gmxapi 0.0.5 requires TPR files for input, but does not have an API tool to generate them from MDP files. Wrap the command-line tool to generate run input files for the four simulations."
+    "# The object returned is an element of a complete specification of runnable work.\n",
+    "# (version 0.1.0) WorkElement is a view into a WorkSpec object\n",
+    "# (version 0.0.5) WorkElement has a reference to the WorkSpec it is associated with.\n",
+    "# Though the helper function generates more than one WorkElement, the element associated with the MD simulation\n",
+    "# is the only meaningful thing to return a handle to. With a convention that all elements have an attribute for\n",
+    "# the associated workspec, functions requiring workspec inputs can be much more flexible and intuitive to users.\n",
+    "simulation.workspec"
    ]
   },
   {
@@ -107,22 +74,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Turn input files into runnable binary job input.\n",
-    "for structure in range(4):\n",
-    "    structure_file = os.path.join(datadir, 'equil{}.gro'.format(structure))\n",
-    "    tpr_file = os.path.join(datadir, 'input{}.tpr'.format(structure))\n",
-    "    grompp_args = ['-c', structure_file,\n",
-    "                   '-o', tpr_file,\n",
-    "                   '-f', os.path.join(datadir, 'grompp.mdp'),\n",
-    "                   '-p', os.path.join(datadir, 'topol.top')]\n",
-    "    subprocess.call([gmx_path, \"grompp\"] + grompp_args)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We forumulaically generated input files above. We will load the array of four files into a specification of work. The result is a dependency graph of gmxapi operations that is nominally human-readable, but more importantly serializeable and sufficient to direct the construction of a graph of data flow and lower-level API calls to execute the intended work."
+    "simulation.serialize()"
    ]
   },
   {
@@ -131,23 +83,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "tpr_files = [os.path.join(datadir, 'input{}.tpr'.format(i)) for i in range(4)]\n",
-    "md = gmx.workflow.from_tpr(input=tpr_files, grid=[1,1,1])\n",
-    "\n",
-    "print(\"MD simulation element:\\n\\n{}\".format(md.serialize()))\n",
-    "\n",
-    "print(\"\\nWork specification (pretty printed)\\n\")\n",
-    "print(str(md.workspec))\n",
-    "\n",
-    "print(\"\\nSerialized work specification\\n\")\n",
-    "print(md.workspec.serialize())"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "For the initial version of this walk-through, we have not chosen or implemented a way to execute the 4-rank simulation ensemble to perform this work. We can run a single ensemble member (below) or we can resort to a Python script in this same directory. From `sample_restraint/examples`, run `mpiexec -n 4 python -m mpi4py example.py` to run the 4-member ensemble and generate the data for the first Ramachandran plot."
+    "simulation.workspec.serialize()"
    ]
   },
   {
@@ -156,25 +92,18 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# We don't currently have a way of running an array of jobs from a jupyter notebook.\n",
-    "#\n",
-    "#with gmx.context.ParallelArrayContext(md) as session:\n",
-    "#    session.run()"
+    "print(simulation.workspec)"
    ]
   },
   {
-   "cell_type": "code",
-   "execution_count": null,
+   "cell_type": "markdown",
    "metadata": {},
-   "outputs": [],
    "source": [
-    "tpr_files = [os.path.join(datadir, 'input{}.tpr'.format(i)) for i in range(4)]\n",
-    "md = gmx.workflow.from_tpr(input=[tpr_files[0]], grid=[1,1,1], threads=2, pme_ranks=1, tmpi=2)\n",
+    "The first version of the schema to represent user-requested work has a data structure that is easily serialized with simple grammar. Dependencies of one element on another determine the order of processing and allow binding between API objects to be managed at launch. \"gmxapi\" and \"gromacs\" namespaces are reserved for operations provided by the libraries. Other namespaces are assumed to be accessible Python modules.\n",
     "\n",
-    "my_context = gmx.context.ParallelArrayContext(md)\n",
+    "Schema version 0.2 should probably specify a character encoding, but will not have major syntactical differences. Its primary purpose will be to establish tighter constraints on content and more elaborate semantics.\n",
     "\n",
-    "with my_context as session:\n",
-    "    session.run()\n"
+    "The workflow (and subsets thereof) must be uniquely identifiable to support artifact management and optimal restartability."
    ]
   },
   {
@@ -183,19 +112,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Wrap the gmx tool to extract phi and psi values for a Ramachandran diagram. E.g.\n",
-    "# ~/gromacs-mpi/bin/gmx_mpi rama -s topol.tpr -f traj_comp.xtc\n",
-    "def rama(run_input=None, trajectory=None, output=\"rama.xvg\", executable=gmx_path):\n",
-    "    \"\"\"Use the GROMACS tool to extract psi and phi angles for the provided structure and trajectory.\"\"\"\n",
-    "    if run_input is not None and trajectory is not None and output is not None and executable is not None:\n",
-    "        for file_arg in [run_input, trajectory]:\n",
-    "            if not os.path.exists(file_arg):\n",
-    "                raise FileNotFoundError(\"Invalid file: {}\".format(file_arg))\n",
-    "        if not os.access(gmx_path, os.X_OK):\n",
-    "            raise FileExistsError(\"Invalid executable: {}\".format(gmx_path))\n",
-    "    else:\n",
-    "        raise RuntimeError(\"Bad arguments.\")\n",
-    "    subprocess.call([gmx_path, \"rama\", \"-s\", run_input, \"-f\", trajectory, \"-o\", output])"
+    "simulation.workspec.uid()"
    ]
   },
   {
@@ -204,21 +121,13 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "run_input = os.path.join(datadir, 'input0.tpr')\n",
-    "trajectory = os.path.join(my_context.workdir, 'traj_comp.xtc')\n",
-    "trajectory = os.path.join(my_context.workdir, 'traj.trr')\n",
+    "potential = myplugin.HarmonicRestraint()\n",
+    "potential.set_params(1, 4, 2.0, 100.0)\n",
+    "# potential.set_params(1, 4, 0, 0)\n",
     "\n",
-    "rama_file = os.path.join(my_context.workdir, 'rama.xvg')\n",
-    "rama(run_input=run_input, trajectory=trajectory, output=rama_file)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "phi, psi = numpy.genfromtxt(rama_file, skip_header=13, comments='@', usecols=(0,1)).T"
+    "system.add_potential(potential)\n",
+    "with gmx.context.DefaultContext(system.workflow) as session:\n",
+    "    session.run()"
    ]
   },
   {
@@ -227,80 +136,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "fig, ax = plt.subplots(subplot_kw={'aspect': 'equal'})\n",
-    "ax.scatter(phi, psi)\n",
-    "ax.set_xlim(-180, 180)\n",
-    "ax.set_ylim(-180, 180)\n",
-    "ax.set_xlabel('phi')\n",
-    "ax.set_ylabel('psi')\n",
-    "ax.set_title('Alanine dipeptide Ramachandran plot')"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "plt.plot(phi, '.', label='phi')\n",
-    "plt.plot(psi, '.', label='psi')\n",
-    "plt.legend()\n",
-    "plt.title(\"Evolution of phi and psi\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import MDAnalysis\n",
-    "import MDAnalysis.analysis.distances"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "u = MDAnalysis.Universe(os.path.join(datadir, 'equil0.gro'), trajectory)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "print(u.residues)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "alanine = u.select_atoms(\"resname ACE ALA NME\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "alanine"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "trajs = numpy.array([MDAnalysis.analysis.distances.self_distance_array(alanine.positions) for _ in u.trajectory])"
+    "md = gmx.workflow.from_tpr([tpr_filename, tpr_filename])\n",
+    "print(md.workspec)"
    ]
   },
   {
@@ -309,7 +146,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "trajs.shape"
+    "print(md.serialize())"
    ]
   },
   {
@@ -318,9 +155,11 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "for traj in trajs:\n",
-    "    plt.plot(traj)\n",
-    "plt.ylim(0,10)"
+    "potential = gmx.workflow.WorkElement(namespace=\"myplugin\",\n",
+    "                                     operation=\"create_restraint\",\n",
+    "                                     params=[1, 4, 2.0, 10000.0])\n",
+    "potential.name = \"harmonic_restraint\"\n",
+    "md.add_dependency(potential)"
    ]
   },
   {
@@ -329,11 +168,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "potential = gmx.workflow.WorkElement(namespace=\"myplugin\",\n",
-    "                                     operation=\"ensemble_restraint\",\n",
-    "                                     params=[1, 4, 2.0, 10000.0])\n",
-    "potential.name = \"restrained_ensemble\"\n",
-    "md.add_dependency(potential)"
+    "print(md.serialize())"
    ]
   },
   {
@@ -342,7 +177,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "print(md.serialize())"
+    "print(md.workspec)"
    ]
   },
   {
@@ -351,7 +186,11 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "print(md.workspec)"
+    "context = gmx.context.ParallelArrayContext(md)\n",
+    "with context as session:\n",
+    "    if context.rank == 0:\n",
+    "        print(context.work)\n",
+    "    session.run()"
    ]
   },
   {
@@ -405,14 +244,14 @@
   "language_info": {
    "codemirror_mode": {
     "name": "ipython",
-    "version": 3
+    "version": 2
    },
    "file_extension": ".py",
    "mimetype": "text/x-python",
    "name": "python",
    "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.6.3"
+   "pygments_lexer": "ipython2",
+   "version": "2.7.13"
   }
  },
  "nbformat": 4,
diff --git a/examples/walkthrough.ipynb b/examples/walkthrough.ipynb
new file mode 100644
index 0000000..e7621bc
--- /dev/null
+++ b/examples/walkthrough.ipynb
@@ -0,0 +1,420 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# gmxapi sample workflow using restrained ensemble plugin\n",
+    "\n",
+    "In this notebook, we will walk through a workflow in which we examine a toy system (alanine-dipeptide) with several distinct regions of conformation space, then apply a restrained ensemble biased sampling method to explore the conformational ensemble near the configuration of interest.\n",
+    "\n",
+    "This system is chosen for its low computational cost and well-established literature.\n",
+    "\n",
+    "The biased sampling method we will use follows a restrained ensemble technique that applies a pair restraint between selected atoms, using an (experimentally) observable pair distribution to guide MD sampling. The restraint force is a function of the difference between the target distribution and the simulated ensemble distribution. Our intent is not to promote this biasing technique for this particular system, but rather to simultaneously demonstrate a gmxapi workflow, the gmxapi MD plugin framework, and one of the example plugin implementations included in the sample_restraint repository. The plugin was developed for simulations requiring tens of thousands of CPU hours, but these examples run in at most a few minutes on a desktop computer.\n",
+    "\n",
+    "The `gmx` Python module is from the gmxapi package. The plugins built with this `sample_restraint` repository are bundled in a package named `myplugin`. While some users may find the restrained ensemble plugin useful, the repository is intended to serve as a template and starting point to develop custom pair restraint potentials. Hopefully, I have removed the least interesting name from the set of possible plugin names, and researchers are encouraged to change the name of the repository and the Python module.\n",
+    "\n",
+    "A note on nomenclature: In Python lingo, `myplugin` is a Python package, a Python module, and a Python C++ extension, but these classifications are not generally equivalent. In this case, the code to calculate forces is written in C++ and built into a shared object library that can be imported into Python. Python objects created with the functions in the package can be passed through gmxapi to allow GROMACS to create local (C++ compiled binary) objects supporting high-performance MD simulation to execute a specified workflow."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import sys\n",
+    "import os\n",
+    "import gmx"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy\n",
+    "import subprocess\n",
+    "import matplotlib\n",
+    "%matplotlib inline\n",
+    "import matplotlib.pyplot as plt"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# This only works if the gmx binary path was set in the parent process before launching the Jupyter server.\n",
+    "# \\todo Make the docker image use the jovyan user PATH\n",
+    "# def find_program(program): \n",
+    "#     \"\"\"Return the first occurrence of program in PATH or None if not found.\"\"\"\n",
+    "#     for path in os.environ[\"PATH\"].split(os.pathsep):\n",
+    "#         fpath = os.path.join(path, program)\n",
+    "#         if os.path.isfile(fpath) and os.access(fpath, os.X_OK):\n",
+    "#             return fpath\n",
+    "#     return None\n",
+    "# gmx_path = find_program(\"gmx\")\n",
+    "# if gmx_path is None:\n",
+    "#     gmx_path = find_program(\"gmx_mpi\")\n",
+    "# if gmx_path is None:\n",
+    "#     raise UserWarning(\"gmx executable not found in path.\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Get the path to the `gmx` executable associated with the library we linked against so that we can wrap CLI tools not yet in the API.\n",
+    "gmx_path = os.path.join(os.environ['HOME'], 'install/gromacs/bin/gmx')\n",
+    "assert os.access(gmx_path, os.X_OK)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the following cell, we set the path to the directory where some input files have been stashed.\n",
+    "It is a subdirectory of the `examples` directory and should contain a topology, MD parameters file, and four (previously equilibrated) atomic configurations from the same alanine-dipeptide system for independent trajectories in an ensemble simulation."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Make sure we've got access to the files we expect.\n",
+    "datadir = os.path.abspath('alanine-dipeptide')\n",
+    "workingdir = os.path.basename(datadir)\n",
+    "os.listdir(datadir)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "gmxapi 0.0.5 requires TPR files for input, but does not have an API tool to generate them from MDP files. Wrap the command-line tool to generate run input files for the four simulations."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Turn input files into runnable binary job input.\n",
+    "for structure in range(4):\n",
+    "    structure_file = os.path.join(datadir, 'equil{}.gro'.format(structure))\n",
+    "    tpr_file = os.path.join(datadir, 'input{}.tpr'.format(structure))\n",
+    "    grompp_args = ['-c', structure_file,\n",
+    "                   '-o', tpr_file,\n",
+    "                   '-f', os.path.join(datadir, 'grompp.mdp'),\n",
+    "                   '-p', os.path.join(datadir, 'topol.top')]\n",
+    "    subprocess.call([gmx_path, \"grompp\"] + grompp_args)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We formulaically generated input files above. We will load the array of four files into a specification of work. The result is a dependency graph of gmxapi operations that is nominally human-readable, but more importantly serializable and sufficient to direct the construction of a graph of data flow and lower-level API calls to execute the intended work."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tpr_files = [os.path.join(datadir, 'input{}.tpr'.format(i)) for i in range(4)]\n",
+    "md = gmx.workflow.from_tpr(input=tpr_files, grid=[1,1,1])\n",
+    "\n",
+    "print(\"MD simulation element:\\n\\n{}\".format(md.serialize()))\n",
+    "\n",
+    "print(\"\\nWork specification (pretty printed)\\n\")\n",
+    "print(str(md.workspec))\n",
+    "\n",
+    "print(\"\\nSerialized work specification\\n\")\n",
+    "print(md.workspec.serialize())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "For the initial version of this walk-through, we have not chosen or implemented a way to execute the 4-rank simulation ensemble to perform this work. We can run a single ensemble member (below) or we can resort to a Python script in this same directory. From `sample_restraint/examples`, run `mpiexec -n 4 python -m mpi4py walkthrough.py` to run the 4-member ensemble and generate the data for the first Ramachandran plot."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# We don't currently have a way of running an array of jobs from a jupyter notebook.\n",
+    "#\n",
+    "#with gmx.context.ParallelArrayContext(md) as session:\n",
+    "#    session.run()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tpr_files = [os.path.join(datadir, 'input{}.tpr'.format(i)) for i in range(4)]\n",
+    "md = gmx.workflow.from_tpr(input=[tpr_files[0]], grid=[1,1,1], threads=2, pme_ranks=1, tmpi=2)\n",
+    "\n",
+    "my_context = gmx.context.ParallelArrayContext(md)\n",
+    "\n",
+    "with my_context as session:\n",
+    "    session.run()\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Wrap the gmx tool to extract phi and psi values for a Ramachandran diagram. E.g.\n",
+    "# ~/gromacs-mpi/bin/gmx_mpi rama -s topol.tpr -f traj_comp.xtc\n",
+    "def rama(run_input=None, trajectory=None, output=\"rama.xvg\", executable=gmx_path):\n",
+    "    \"\"\"Use the GROMACS tool to extract psi and phi angles for the provided structure and trajectory.\"\"\"\n",
+    "    if run_input is not None and trajectory is not None and output is not None and executable is not None:\n",
+    "        for file_arg in [run_input, trajectory]:\n",
+    "            if not os.path.exists(file_arg):\n",
+    "                raise FileNotFoundError(\"Invalid file: {}\".format(file_arg))\n",
+    "        if not os.access(executable, os.X_OK):\n",
+    "            raise OSError(\"Invalid executable: {}\".format(executable))\n",
+    "    else:\n",
+    "        raise RuntimeError(\"Bad arguments.\")\n",
+    "    subprocess.call([executable, \"rama\", \"-s\", run_input, \"-f\", trajectory, \"-o\", output])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "run_input = os.path.join(datadir, 'input0.tpr')\n",
+    "# trajectory = os.path.join(my_context.workdir, 'traj_comp.xtc')\n",
+    "trajectory = os.path.join(my_context.workdir, 'traj.trr')\n",
+    "\n",
+    "rama_file = os.path.join(my_context.workdir, 'rama.xvg')\n",
+    "rama(run_input=run_input, trajectory=trajectory, output=rama_file)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "phi, psi = numpy.genfromtxt(rama_file, skip_header=13, comments='@', usecols=(0,1)).T"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "fig, ax = plt.subplots(subplot_kw={'aspect': 'equal'})\n",
+    "ax.scatter(phi, psi)\n",
+    "ax.set_xlim(-180, 180)\n",
+    "ax.set_ylim(-180, 180)\n",
+    "ax.set_xlabel('phi')\n",
+    "ax.set_ylabel('psi')\n",
+    "ax.set_title('Alanine dipeptide Ramachandran plot')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "plt.plot(phi, '.', label='phi')\n",
+    "plt.plot(psi, '.', label='psi')\n",
+    "plt.legend()\n",
+    "plt.title(\"Evolution of phi and psi\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import MDAnalysis\n",
+    "import MDAnalysis.analysis.distances"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "u = MDAnalysis.Universe(os.path.join(datadir, 'equil0.gro'), trajectory)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(u.residues)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "alanine = u.select_atoms(\"resname ACE ALA NME\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "alanine"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "trajs = numpy.array([MDAnalysis.analysis.distances.self_distance_array(alanine.positions) for _ in u.trajectory])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "trajs.shape"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for traj in trajs:\n",
+    "    plt.plot(traj)\n",
+    "plt.ylim(0,10)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "potential = gmx.workflow.WorkElement(namespace=\"myplugin\",\n",
+    "                                     operation=\"ensemble_restraint\",\n",
+    "                                     params=[1, 4, 2.0, 10000.0])\n",
+    "potential.name = \"restrained_ensemble\"\n",
+    "md.add_dependency(potential)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(md.serialize())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(md.workspec)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "md = gmx.workflow.from_tpr(tpr_filename)\n",
+    "md.add_dependency(potential)\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "potential.workspec = None\n",
+    "md.add_dependency(potential)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "context = gmx.context.ParallelArrayContext(md)\n",
+    "with context as session:\n",
+    "    if context.rank == 0:\n",
+    "        print(context.work)\n",
+    "    status = session.run()\n",
+    "print(status)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
\ No newline at end of file
diff --git a/examples/example.py b/examples/walkthrough.py
similarity index 100%
rename from examples/example.py
rename to examples/walkthrough.py