Skip to content
Stephanie edited this page Jan 21, 2014 · 7 revisions

Summary

The online resource has small example programs covering scans, sorts (merge, quick, sample), pipeline, K-Means Clustering, and Cholesky Factorization in TBB, Cilk Plus, and C++. All can be downloaded as a zip file and instructions are available for all platforms, though I wasn't able to test them myself.


Download

http://parallelbook.com/instructor OR http://parallelbook.com/student have the same links.
*This page is built upon version 3 of the code time stamped on November 13, 2013.


Prerequisites

This section lists prerequisite software required to build and run the examples.

Compiler

To compile all of the examples requires a compiler supporting:

  • Cilk Plus
  • OpenMP 3.0 or later
  • C++11 lambda expressions

Currently supported compilers are:

Compiler Comment
Intel C++ compiler, version 13.0 or later Version 13.0 or later can compile all of the examples.
g++ 4.5.1 or later Cannot compile Cilk Plus examples.
gcc 4.7 branch with Cilk Plus Should be able to compile all of the examples.
Visual Studio 2010 or later Can compile only the TBB examples.

Conditional compilation directives will automatically omit code that these compilers cannot handle.

Libraries

You need Intel Threading Building Blocks (TBB), because the examples use TBB's portable tbb::tick_count to do timing. TBB is distributed with the Intel compiler, and or open-source version can be downloaded. The makefiles assume your environment is set up so that your compiler can find the TBB include files and run-time library. On Linux and MacOS, you can do this setup by running the script tbb/bin/tbbvars.sh or sourcing the script tbb/bin/tbbvars.csh, depending on which shell you are using. Those scripts are part of TBB and not part of this package.
The Cholesky Factorization example requires a Basic Linear Algebra Subroutines (BLAS) library.

Other Tools

Linux and Mac OS systems typically have the other prerequisites. Windows users may have to acquire a few extra tools:

GNU make GNU make is required to build the examples.
Python Required to run the top-level Makefile that builds all the examples. Python is not required for building individual examples.

Building and Running

Using Makefiles

Makefiles are provided. They should work on Linux, Mac OS, or Windows. They include config/config.inc, which describes the compiler options to be used. Near the top of this file is a choice of lines that set CPLUS. Uncomment the one that matches your compiler.
The top level Makefile in this directory does the same action for all examples. The supported actions are:

Command Effect
make run Run all examples. Builds any examples that are not yet built.
make build Build all examples
make clean Remove all binaries and object files

These commands can be applied to a single example, say foo, by:

  1. cd to src/foo/*/build/
  2. Run make there.

Files and Subdirectories

src/ Source files
tools/ Python script for running all of the Makefiles
config/config.inc Configuration file inherited by Makefiles

Inventory of Programs

The Listing numbers match those in the book.

Listings Summary
5.16 Scan implemented in OpenMP
8.9-8.13 Quicksort
8.17-8.20 Scan and pack patterns implemented using fork-join
9.1-9.4 Pipeline examples
10.1,10.3,10.42 Forward Seismic Simulation
11.1-11.8 K-Means Clustering
13.1-13.4 Merge Sort
14.1-14.4 Sample Sort
15.1-15.3 Cholesky Factorization

Detailed Inventory

Scan

The subdirectories herein contain implementations of the scan pattern from Chapters 5 and 8 of the book.

Listing File Summary
5.14 serial/inclusive_scan.h Serial implementation of inclusive scan.
5.15 serial/exclusive_scan.h Serial implementation of exclusive scan.
5.16 openmp/openmp_scan.h Serial implementation of exclusive scan.
8.17 cilkplus/cilk_scan.h Top-level code for tiled parallel scan.
8.18 cilkplus/upsweep.h Upsweep phase for tiled parallel scan in Cilk Plus.
8.19 cilkplus/downsweep.h Downsweep phase for tiled parallel scan in Cilk Plus.
8.20 cilkplus/pack.h Implementing pack pattern with cilk_scan from Listing 8.17.

Sorts

The subdirectories herein contain the sorts from the book and a harness (common/test_sort.cpp) that both tests and times them.

Quicksort
Listing File Summary
8.9 common/quicksort_util.h Code shared by Quicksort implementations.
8.10 cilkplus/quicksort_cilk_recursive.h Fully recursive parallel Quicksort using Cilk Plus.
8.11 cilkplus/quicksort_cilk_semirecursive.h Semi-recursive parallel Quicksort using Cilk Plus.
8.12 tbb/quicksort_tbb_taskgroup.h Semi-recursive parallel Quicksort using TBB.
8.13 tbb/quicksort_tbb_task.h Quicksort in TBB that achieves Cilk Plus space guarantees for Listing 8.11.
Merge Sort
Listing File Summary
13.1 serial/serial_merge.h Serial merge.
13.2 cilkplus/merge_cilk.h Parallel merge in Cilk Plus.
13.3 tbb/merge_tbb.h Parallel merge in TBB. This is the entire algorithm in TBB. The book shows only the lines that differ from the Cilk Plus version.
13.4 cilkplus/merge_sort_cilk.h Parallel merge sort in Cilk Plus.
Sample Sort
Listing File Summary
14.1 common/sample_sort.h Top-level code for parallel sample sort.
14.2 common/map_keys_to_bins_cilk.h Code for mapping keys to bins in Cilk Plus.
14.3 cilkplus/bin_cilk.h Parallel binning of keys using Cilk Plus.
14.4 cilkplus/repack_and_subsort_cilk.h Repacking and subsorting using Cilk Plus.
14.5 common/destructive_move.h Destructive move in Cilk Plus.

Pipeline

The subdirectories herein contain Pipeline examples from Chapter 9 of the book.

Listing File Summary
9.1 serial/serial_pipeline.h Serial implementation of a pipeline, with stages f, g, and h.
9.2 tbb/tbb_sps_pipeline.h Pipeline in TBB. Equivalent to Listing 9.1, except that the stages run in parallel and the middle stage processes multiple items in parallel.
9.3 cilkplus/cilk_sps_pipeline.h Pipeline in Cilk Plus equivalent to the serial pipeline in Listing 9.1.
9.4 cilkpus/reducer_consume.h Defining a reducer for serializing consumption of items in Cilk Plus.

Seismic Forward Simulation

The subdirectories herein contain seismic forward simulation code from Chapter 10 of the book.

Listing File Summary
10.1 serial/serial_stencil.h Serial code for simulating wavefield.
10.3 cilkplus/base_trapezoid.h Base case for applying stencil to space-time trapezoid.
10.4 cilkplus/recursive_trapezoid.h Parallel cache-oblivious trapezoid decomposition in Cilk Plus.

K-Means Clustering

The subdirectories herein contain the K-Means Clustering examples from Chapter 11 of the book.

Listing File Summary
11.1 cilkplus/kmeans_cilk.h K-means clustering in Cilk Plus.
11.2 common/sum_and_count.h Type sum_and_count for computing mean of points in a cluster.
11.3 cilkplus/elementwise_reducer.h Defining a hyperobject for summing an array elementwise in Cilk Plus.
11.4 tbb/view.h Declaring type tls_type for thread-local views in TBB.
11.5 tbb/reduce_local_sums_to_global_sum.h Walking local views to detect changes.
11.6 tbb/reduce_local_counts_to_global_count.h Walking local views to accumulate a global sum.
11.7 tbb/reduce_min_ind.h Routine for finding index of centroid closest to a given point.
11.8 tbb/kmeans_tbb.h K-means clustering in TBB.

Cholesky Factorization

The subdirectories herein contain seismic forward simulation code from Chapter 15 of the book.
The code assumes you have the Intel Math Kernel Library (MKL) implementation of the BLAS. You will need to set your paths so the compiler can find the MKL headers and libraries.
If you have a different implementation the BLAS, you will need to edit cilkplus/blas_leaf.h as follows:
Replace #include "mkl.h" with the header for your BLAS.
Perhaps edit the wrappers if your BLAS does not have the usual C interface.

Listing File Summary
15.1 cilkplus/potrf_cilk.h Recursive Cholesky decomposition.
15.2 cilkplus/trsm_cilk.h Parallel triangular solve in Cilk Plus.
15.3 cilkplus/syrk_cilk.h Parallel symmetric rank update in Cilk Plus.