-
Notifications
You must be signed in to change notification settings - Fork 0
SPP Code Examples
The online resource has small example programs covering scans, sorts (merge, quick, sample), pipeline, K-Means Clustering, and Cholesky Factorization in TBB, Cilk Plus, and C++. All can be downloaded as a zip file and instructions are available for all platforms, though I wasn't able to test them myself.
http://parallelbook.com/instructor OR http://parallelbook.com/student have the same links.
*This page is built upon version 3 of the code time stamped on November 13, 2013.
This section lists prerequisite software required to build and run the examples.
To compile all of the examples requires a compiler supporting:
- Cilk Plus
- OpenMP 3.0 or later
- C++11 lambda expressions
Currently supported compilers are:
Compiler | Comment |
---|---|
Intel C++ compiler, version 13.0 or later | Version 13.0 or later can compile all of the examples. |
g++ 4.5.1 or later | Cannot compile Cilk Plus examples. |
gcc 4.7 branch with Cilk Plus | Should be able to compile all of the examples. |
Visual Studio 2010 or later | Can compile only the TBB examples. |
Conditional compilation directives will automatically omit code that these compilers cannot handle.
You need Intel Threading Building Blocks (TBB), because the examples use TBB's portable tbb::tick_count
to do timing. TBB is distributed with the Intel compiler, and or open-source version can be downloaded. The makefiles assume your environment is set up so that your compiler can find the TBB include files and run-time library. On Linux and MacOS, you can do this setup by running the script tbb/bin/tbbvars.sh or sourcing the script tbb/bin/tbbvars.csh, depending on which shell you are using. Those scripts are part of TBB and not part of this package.
The Cholesky Factorization example requires a Basic Linear Algebra Subroutines (BLAS) library.
Linux and Mac OS systems typically have the other prerequisites. Windows users may have to acquire a few extra tools:
GNU make | GNU make is required to build the examples. |
Python | Required to run the top-level Makefile that builds all the examples. Python is not required for building individual examples. |
Makefiles are provided. They should work on Linux, Mac OS, or Windows. They include config/config.inc, which describes the compiler options to be used. Near the top of this file is a choice of lines that set CPLUS. Uncomment the one that matches your compiler.
The top level Makefile in this directory does the same action for all examples. The supported actions are:
Command | Effect |
---|---|
make run | Run all examples. Builds any examples that are not yet built. |
make build | Build all examples |
make clean | Remove all binaries and object files |
These commands can be applied to a single example, say foo, by:
- cd to src/foo/*/build/
- Run make there.
src/ | Source files |
tools/ | Python script for running all of the Makefiles |
config/config.inc | Configuration file inherited by Makefiles |
The Listing numbers match those in the book.
Listings | Summary |
---|---|
5.16 | Scan implemented in OpenMP |
8.9-8.13 | Quicksort |
8.17-8.20 | Scan and pack patterns implemented using fork-join |
9.1-9.4 | Pipeline examples |
10.1,10.3,10.42 | Forward Seismic Simulation |
11.1-11.8 | K-Means Clustering |
13.1-13.4 | Merge Sort |
14.1-14.4 | Sample Sort |
15.1-15.3 | Cholesky Factorization |
The subdirectories herein contain implementations of the scan pattern from Chapters 5 and 8 of the book.
Listing | File | Summary |
---|---|---|
5.14 | serial/inclusive_scan.h | Serial implementation of inclusive scan. |
5.15 | serial/exclusive_scan.h | Serial implementation of exclusive scan. |
5.16 | openmp/openmp_scan.h | Serial implementation of exclusive scan. |
8.17 | cilkplus/cilk_scan.h | Top-level code for tiled parallel scan. |
8.18 | cilkplus/upsweep.h | Upsweep phase for tiled parallel scan in Cilk Plus. |
8.19 | cilkplus/downsweep.h | Downsweep phase for tiled parallel scan in Cilk Plus. |
8.20 | cilkplus/pack.h | Implementing pack pattern with cilk_scan from Listing 8.17. |
The subdirectories herein contain the sorts from the book and a harness (common/test_sort.cpp) that both tests and times them.
Listing | File | Summary |
---|---|---|
8.9 | common/quicksort_util.h | Code shared by Quicksort implementations. |
8.10 | cilkplus/quicksort_cilk_recursive.h | Fully recursive parallel Quicksort using Cilk Plus. |
8.11 | cilkplus/quicksort_cilk_semirecursive.h | Semi-recursive parallel Quicksort using Cilk Plus. |
8.12 | tbb/quicksort_tbb_taskgroup.h | Semi-recursive parallel Quicksort using TBB. |
8.13 | tbb/quicksort_tbb_task.h | Quicksort in TBB that achieves Cilk Plus space guarantees for Listing 8.11. |
Listing | File | Summary |
---|---|---|
13.1 | serial/serial_merge.h | Serial merge. |
13.2 | cilkplus/merge_cilk.h | Parallel merge in Cilk Plus. |
13.3 | tbb/merge_tbb.h | Parallel merge in TBB. This is the entire algorithm in TBB. The book shows only the lines that differ from the Cilk Plus version. |
13.4 | cilkplus/merge_sort_cilk.h | Parallel merge sort in Cilk Plus. |
Listing | File | Summary |
---|---|---|
14.1 | common/sample_sort.h | Top-level code for parallel sample sort. |
14.2 | common/map_keys_to_bins_cilk.h | Code for mapping keys to bins in Cilk Plus. |
14.3 | cilkplus/bin_cilk.h | Parallel binning of keys using Cilk Plus. |
14.4 | cilkplus/repack_and_subsort_cilk.h | Repacking and subsorting using Cilk Plus. |
14.5 | common/destructive_move.h | Destructive move in Cilk Plus. |
The subdirectories herein contain Pipeline examples from Chapter 9 of the book.
Listing | File | Summary |
---|---|---|
9.1 | serial/serial_pipeline.h | Serial implementation of a pipeline, with stages f, g, and h. |
9.2 | tbb/tbb_sps_pipeline.h | Pipeline in TBB. Equivalent to Listing 9.1, except that the stages run in parallel and the middle stage processes multiple items in parallel. |
9.3 | cilkplus/cilk_sps_pipeline.h | Pipeline in Cilk Plus equivalent to the serial pipeline in Listing 9.1. |
9.4 | cilkpus/reducer_consume.h | Defining a reducer for serializing consumption of items in Cilk Plus. |
The subdirectories herein contain seismic forward simulation code from Chapter 10 of the book.
Listing | File | Summary |
---|---|---|
10.1 | serial/serial_stencil.h | Serial code for simulating wavefield. |
10.3 | cilkplus/base_trapezoid.h | Base case for applying stencil to space-time trapezoid. |
10.4 | cilkplus/recursive_trapezoid.h | Parallel cache-oblivious trapezoid decomposition in Cilk Plus. |
The subdirectories herein contain the K-Means Clustering examples from Chapter 11 of the book.
Listing | File | Summary |
---|---|---|
11.1 | cilkplus/kmeans_cilk.h | K-means clustering in Cilk Plus. |
11.2 | common/sum_and_count.h | Type sum_and_count for computing mean of points in a cluster. |
11.3 | cilkplus/elementwise_reducer.h | Defining a hyperobject for summing an array elementwise in Cilk Plus. |
11.4 | tbb/view.h | Declaring type tls_type for thread-local views in TBB. |
11.5 | tbb/reduce_local_sums_to_global_sum.h | Walking local views to detect changes. |
11.6 | tbb/reduce_local_counts_to_global_count.h | Walking local views to accumulate a global sum. |
11.7 | tbb/reduce_min_ind.h | Routine for finding index of centroid closest to a given point. |
11.8 | tbb/kmeans_tbb.h | K-means clustering in TBB. |
The subdirectories herein contain seismic forward simulation code from Chapter 15 of the book.
The code assumes you have the Intel Math Kernel Library (MKL) implementation of the BLAS. You will need to set your paths so the compiler can find the MKL headers and libraries.
If you have a different implementation the BLAS, you will need to edit cilkplus/blas_leaf.h as follows:
Replace #include "mkl.h" with the header for your BLAS.
Perhaps edit the wrappers if your BLAS does not have the usual C interface.
Listing | File | Summary |
---|---|---|
15.1 | cilkplus/potrf_cilk.h | Recursive Cholesky decomposition. |
15.2 | cilkplus/trsm_cilk.h | Parallel triangular solve in Cilk Plus. |
15.3 | cilkplus/syrk_cilk.h | Parallel symmetric rank update in Cilk Plus. |