This project applies the Spark-MPI integrated platform to accelerate data-intensive and compute-intensive NSLS-II beamline tasks and to build near-real-time processing pipelines. The rationale and a general description of this approach are given in the NYSDS'17 presentation "Building Near-Real-Time Processing Pipelines with the Spark-MPI platform" (NYSDS, New York, August 7-9, 2017). The Spark-MPI approach is funded by a DOE ASCR SBIR grant.
The Spark-MPI approach is illustrated by the following examples, which address three major aspects of beamline processing applications:
- db2sharp: a data-intensive application that uses the Spark parallel platform to accelerate the databroker interface for accessing and preprocessing large ptychographic datasets
- sharp-mpi: an MPI/GPU ptychographic reconstruction application that addresses both the GPU memory and the performance requirements of processing large scans in ptychographic experiments
- kafka: a composite end-to-end application that demonstrates the integration of the databroker interface, Kafka, and the Spark-MPI platform for building near-real-time beamline processing pipelines

- databroker: unified interface for various data sources at NSLS-II
- spark-mpi: data-intensive and compute-intensive platform
- sharp-nsls2: GPU/MPI ptychographic application
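The overall shape of the end-to-end pipeline described for the kafka example (frames arrive as a stream, are grouped into small batches, and each batch is processed in parallel) can be pictured with the minimal sketch below. It uses only the Python standard library and is not the project's code: `stream_frames`, `micro_batches`, `process_frame`, and `run_pipeline` are hypothetical names, the generator stands in for a databroker/Kafka source, and the thread pool stands in for Spark-MPI workers.

```python
from concurrent.futures import ThreadPoolExecutor


def stream_frames(n_frames):
    """Hypothetical stand-in for a streaming source: yields (frame_id, payload)."""
    for i in range(n_frames):
        yield i, [float(i)] * 4


def micro_batches(source, batch_size):
    """Group a stream into lists of at most batch_size items."""
    batch = []
    for item in source:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def process_frame(item):
    """Placeholder for per-frame work (here: sum the payload values)."""
    frame_id, payload = item
    return frame_id, sum(payload)


def run_pipeline(n_frames=10, batch_size=4, workers=2):
    """Process the stream batch by batch, with frames in a batch handled in parallel."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in micro_batches(stream_frames(n_frames), batch_size):
            # pool.map returns results in input order.
            results.extend(pool.map(process_frame, batch))
    return results


if __name__ == "__main__":
    for frame_id, value in run_pipeline():
        print(frame_id, value)
```

In a real deployment each stand-in would be replaced by the corresponding component; the sketch only illustrates the stream, batch, and parallel-process structure.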