Parla

[figure: "Parla" ASCII-art logo alongside a task-dependency diagram (Tasks A-D)]
Parla is a task-parallel programming library for Python. Parla targets the orchestration of heterogeneous (CPU+GPU) workloads on a single shared-memory machine. We provide features for resource management, task variants, and automated scheduling of data movement between devices.
We design for gradual adoption, allowing users to easily port sequential code for parallel execution.
The Parla runtime is multi-threaded but single-process to utilize a shared address space. In practice this means that the main compute workload within each task must release the CPython Global Interpreter Lock (GIL) to achieve parallel speedup.
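To make the GIL point concrete, here is a plain-Python sketch (no Parla involved; kernel, a, b, and out are illustrative names): NumPy's compiled kernels release the GIL while they execute, which is what lets threaded task bodies overlap.

```python
import threading
import numpy as np

# Illustrative only, not Parla API: NumPy's compiled routines release the
# GIL while they run, so plain Python threads executing them can overlap.
def kernel(a, b, out, idx):
    # For large arrays, the dot product runs in C with the GIL released.
    out[idx] = float(a @ b)

n = 200_000
a = np.ones(n)
b = np.ones(n)
out = [None] * 4

threads = [threading.Thread(target=kernel, args=(a, b, out, i)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(out)  # [200000.0, 200000.0, 200000.0, 200000.0]
```

A task body dominated by pure-Python bytecode, by contrast, holds the GIL and serializes, no matter how many worker threads the runtime provides.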
Note: Parla is not designed with workflow management in mind and does not currently support features for fault-tolerance or checkpointing.
Parla is currently distributed from this repository as a Python module.
Parla 0.2 requires Python>=3.7, numpy, cupy, and psutil, and can be installed as follows:
conda install -c conda-forge numpy cupy psutil  # or: pip install numpy cupy psutil
git clone https://github.com/ut-parla/Parla.py.git
cd Parla.py
pip install .
To test your installation, try running
python tutorial/0_hello_world/hello.py
This should print
Hello, World!
We recommend working through the tutorial as a starting point for learning Parla!
Parla tasks are launched in an indexed namespace (the 'TaskSpace') and capture variables from the local scope through the task body's closure.
Basic usage can be seen below:
from parla import Parla
from parla.tasks import spawn, TaskSpace
from parla.cpu import cpu
from parla.cuda import gpu

with Parla():
    T = TaskSpace("Example Space")

    for i in range(4):
        @spawn(T[i], placement=cpu)
        def tasks_A():
            print(f"We run first on the CPU. I am task {i}", flush=True)

    @spawn(T[4], dependencies=[T[0:4]], placement=gpu)
    def task_B():
        print("I run second on any GPU", flush=True)
The examples have a wider set of dependencies.
Running all requires: scipy
, numba
, pexpect
, mkl
, mkl-service
, and Cython
.
To get the full set of examples (BLR, N-Body, and synthetic graphs), initialize the submodules:
git submodule update --init --recursive --remote
Specific installation and run instructions for each of these submodules can be found in their directories.
The test suite covering them (reproducing the results in the SC'22 paper) can be launched as:
python examples/launcher.py --figures <list of figures to reproduce>
This software is based upon work supported by the Department of Energy, National Nuclear Security Administration under Award Number DE-NA0003969.
Please cite the following reference:
@inproceedings{lee2022parla,
author = {Lee, H. and Ruys, W. and Yan, Y. and Stephens, S. and You, B. and Fingler, H. and Henriksen, I. and Peters, A. and Burtscher, M. and Gligoric, M. and Schulz, K. and Pingali, K. and Rossbach, C. J. and Erez, M. and Biros, G.},
title = {Parla: A Python Orchestration System for Heterogeneous Architectures},
year = {2022},
booktitle = {Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis},
series = {SC'22}
}