Distob will take your existing python objects, or a sequence of objects, and scatter them onto many IPython parallel engines, which may be running on a single computer or on a cluster.
In place of the original objects, proxy objects are kept on the client computer that provide the same interface as the original objects. You can continue to use these as if the objects were still local. All methods are passed through to the remote objects, where computation is done.
In particular, sending numpy arrays to the cluster is supported.
A numpy array can also be scattered across the cluster, along a particular axis. Operations on the array can then be automatically done in parallel (either using ufuncs, or by using vectorize()
below)
Note: numpy with __numpy_ufunc__
feature enabled (not yet released) is required to support distributed array arithmetic and distributed ufuncs. You can get numpy with this experimental feature enabled here: https://github.com/mattja/numpy/archive/master.zip
Distob is an object layer built on top of ipyparallel
, so it will
make use of your default IPython parallel profile. This allows different
cluster architectures, local CPUs, SSH nodes, PBS, Amazon EC2, etc.
scatter(obj)
Distribute any object (or list of objects) to remote iPython engines, return a proxy.gather(obj)
Fetch back a distributed object (or list), making it local again.vectorize(f)
Turn an ordinary function (that takes a single object or array) into one that acts in parallel on a scattered list or array. apply(f, obj)
is the same as vectorize(f)(obj)
scatter(a, axis=2)
Distribute a single numpy array along axis 2, returning a DistArray.__numpy_ufunc__
feature enabled)concatenate
, vstack
, hstack
, dstack
, expand_dims
, transpose
, rollaxis
, split
, vsplit
, hsplit
, dsplit
, broadcast_arrays
:RemoteArray
proxy object representing a remote numpy ndarrayDistArray
a single ndarray distributed across multiple enginesRemote
base class, used when auto-creating Remote*
proxy classes@proxy_methods(base)
class decorator for auto-creating Remote*
proxy classesObjectHub
dict interface giving refs to all distributed objects cluster-wideObjectEngine
dict holding the distributed objects of a single IPython engineRef
reference to a (possibly remote) objectengine
: the ObjectEngine
instance on each host (ObjectHub
on
the client)
- Allow assignment to slices of remote arrays.
- Properly implement caching of remote method results.
- Auto-creation of proxy classes at runtime (depends uqfoundation/dill#58)
- For ufunc execution, still need to implement
reduce
,accumulate
,reduceat
,outer
,at
methods. - Make proxy classes more robust, adapting
wrapt
(pypi.python.org/pypi/wrapt)
Incorporates pylru.py
by Jay Hutchinson,
http://github.com/jlhutch/pylru
ipyparallel
interactive parallel computing:
https://ipyparallel.readthedocs.org/
dill
by Mike McKerns for object serialization, see:
http://trac.mystic.cacr.caltech.edu/project/pathos