Howto setup an IPython cluster
This is a wiki page for the PyBroMo software
This is a quick howto on the setup of an IPython cluster. For more info see the official IPython docs: Using IPython for parallel computing.
Before starting you need to install IPython. The easiest way is to get it through a scientific Python distribution, such as Anaconda.
On a single machine there are two ways to start the cluster. Either launch the notebook server and, from the Clusters tab, start 4 engines, or open a terminal (cmd.exe) and type:
ipcluster start -n 4
Here we configure 2 machines: one controller-host that launches the simulation and one slave-host that performs the computation. This procedure can be extended to multiple "slave" machines by repeating the same configuration on each of them.
NOTE for Windows: All the commands must be pasted in a cmd.exe terminal.
The first time only, we need to create an IPython profile:
ipython profile create --parallel --profile=parallel
This command creates a new set of configuration files in IPYTHONDIR/profile_parallel, where IPYTHONDIR is usually a folder named .ipython in the user home folder (C:\Users\username\). These files can be customized to change the default behavior, if needed.
Now, each time we want to start a parallel computation, we begin by starting the controller:
ipcontroller --profile=parallel --ip=169.232.130.141
where the address is the IP address of the controller machine.
This command creates a file ipcontroller-engine.json that contains the connection info that the other machines need in order to connect to the controller. The file is located in IPYTHONDIR/profile_parallel/security.
We need to copy ipcontroller-engine.json to the computation machine. To automate this step I like to link the IPython folder into a Dropbox folder so that all the configuration files are automatically copied/updated on the different machines.
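If a shared folder such as Dropbox is not available, the copy can also be scripted. The following is only a minimal sketch: the function name copy_connection_file and any paths passed to it are hypothetical, and it assumes the destination (for example a network share or the security folder of the engine's profile) is reachable as a normal path. It also checks that the copied file parses as JSON, without assuming anything about the fields inside it.

```python
import json
import os
import shutil


def copy_connection_file(src_path, dest_dir):
    """Copy a controller connection file into dest_dir and return the new path.

    Hypothetical helper: raises ValueError if the copy is not valid JSON.
    """
    # Create the destination folder if it does not exist yet.
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)

    dest_path = os.path.join(dest_dir, os.path.basename(src_path))
    shutil.copy2(src_path, dest_path)

    # Sanity check: a truncated or half-synced file fails to parse here.
    with open(dest_path) as f:
        json.load(f)

    return dest_path
```

The returned path is the one to pass to ipengine with the --file option.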
On the machine that runs the computation it is also useful to create a profile (only the first time), with the same command as before:
ipython profile create --parallel --profile=parallel
A new set of configuration files is created in IPYTHONDIR/profile_parallel.
We can start a computation engine with the ipengine command, specifying the path of the ipcontroller-engine.json file:
ipengine --profile=parallel --file=C:\Data\user\software\Dropbox\ipython\profile_parallel\security\ipcontroller-engine.json
Alternatively, we can write the file path in the configuration file so we don't need to type it every time. To do so, edit the file ipengine_config.py found in the previously created profile folder (IPYTHONDIR/profile_parallel).
Find the line:
#c.IPEngineApp.url_file = u''
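remove the leading # and write the ipcontroller-engine.json path. Use forward slashes (or doubled backslashes) in the path: the configuration file is Python code, and a backslash sequence such as \u (as in \user) is read as an escape and prevents the file from loading. In our example:
c.IPEngineApp.url_file = u'C:/Data/user/software/Dropbox/ipython/profile_parallel/security/ipcontroller-engine.json'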
Now to launch an engine simply type:
ipengine --profile=parallel
It is suggested to launch as many engines as there are cores on the machine. To launch a second engine, open a new terminal and type the same command again, and so on.
To add another machine for computation just repeat the previous steps.
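Instead of opening one terminal per engine, the one-engine-per-core suggestion can be scripted. This is a rough sketch, not a tested recipe: the helper names engine_commands and launch_engines are hypothetical, and it assumes the ipengine executable is on the PATH and that the profile is already configured with the connection file as described above.

```python
import multiprocessing
import subprocess


def engine_commands(n_engines=None, profile="parallel"):
    """Build one ipengine command line per engine (default: one per core)."""
    if n_engines is None:
        n_engines = multiprocessing.cpu_count()
    return [["ipengine", "--profile=" + profile] for _ in range(n_engines)]


def launch_engines(commands):
    """Start each command as a background process and return the handles."""
    return [subprocess.Popen(cmd) for cmd in commands]


# Usage (assumes ipengine is installed and on the PATH):
#   procs = launch_engines(engine_commands())
#   ...
#   for p in procs:
#       p.terminate()  # stop the engines when the computation is done
```

Keeping the command construction separate from the launch makes it easy to print the commands first and check them before starting any process.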
Once the cluster is started (either on a single machine or on multiple machines) we are ready to launch a simulation.
On the controller machine start an IPython QtConsole or an IPython notebook using the profile parallel:
ipython qtconsole --profile=parallel
or
ipython notebook --profile=parallel
Then do:
from IPython.parallel import Client
rc = Client()
rc.ids
The last command should print the list of engine IDs; the length of this list is the number of engines that were started.
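To check that the engines not only register but also execute code, a small test computation can be sent to them. The sketch below is an illustration only: square and parallel_squares are hypothetical names, and the function relies only on the map_sync method of the view obtained from the client (rc[:] selects all engines). Note that in IPython 4.0 and later the parallel machinery lives in the separate ipyparallel package, so the import in the usage comment would change accordingly.

```python
def square(x):
    """Toy function executed on the engines."""
    return x * x


def parallel_squares(view, values):
    """Apply square() to every value using the engines behind `view`.

    `view` is expected to offer map_sync(func, sequence), as the views
    returned by an IPython parallel Client do.
    """
    return list(view.map_sync(square, values))


# Usage on a running cluster (IPython.parallel, as in this howto):
#   from IPython.parallel import Client
#   rc = Client(profile='parallel')
#   print(parallel_squares(rc[:], range(8)))
```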
Alternatively, if you have a QtConsole or Notebook already started without the profile parallel, you can simply pass the client the path of the file that contains the connection information for clients. This file is ipcontroller-client.json (not ipcontroller-engine.json as before!) and is located in the security subfolder of the profile folder.
NOTE: This trick is used by the PyBroMo notebooks so you don't need to restart the notebook server after you launch the cluster.
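Connecting through the connection file could look like the sketch below. The helper names client_connection_file and connect_with_file are hypothetical. The sketch assumes the default IPYTHONDIR location (.ipython in the user home folder, unless the IPYTHONDIR environment variable is set) and that ipcontroller-client.json sits in the security subfolder of the profile, next to ipcontroller-engine.json.

```python
import os


def client_connection_file(ipython_dir=None, profile="parallel"):
    """Return the expected path of ipcontroller-client.json for a profile."""
    if ipython_dir is None:
        # Assumption: default location unless IPYTHONDIR is set.
        default_dir = os.path.join(os.path.expanduser("~"), ".ipython")
        ipython_dir = os.environ.get("IPYTHONDIR", default_dir)
    return os.path.join(ipython_dir, "profile_" + profile,
                        "security", "ipcontroller-client.json")


def connect_with_file(path):
    """Create a Client from a connection file (needs a running controller)."""
    # Imported lazily so the path helper works without IPython installed.
    from IPython.parallel import Client
    return Client(path)


# Usage from a notebook started without --profile=parallel:
#   rc = connect_with_file(client_connection_file())
#   rc.ids
```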