Running InfoSphere Streams benchmark
Zubair Nabi edited this page Apr 2, 2015 · 10 revisions
Before you begin, create the dataset required by the InfoSphere Streams benchmark: [Create dataset for InfoSphere Streams benchmark](Create dataset for InfoSphere Streams benchmark)
The StreamsEmailBenchmark project contains the InfoSphere Streams application that processes the emails.
- Copy your serialized/compressed dataset (obtained using StreamsPrepareDataset) to `StreamsEmailBenchmark/data`

Note: Files should be named `filename0.av` through `filename<parallelism-1>.av`, numbered from zero. For instance, to process two files in parallel, name them `filename0.av` and `filename1.av`.
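
The naming convention can be checked before building with a small helper. This is a minimal sketch, not part of the project: the function names `expected_inputs` and `check_inputs` are hypothetical, and it assumes files are numbered from 0 to parallelism - 1, as the two-file example suggests.

```shell
# Print the input file names expected for a given base name and parallelism.
# Assumption: numbering runs from 0 to parallelism - 1.
expected_inputs() {
  base="$1"
  parallelism="$2"
  i=0
  while [ "$i" -lt "$parallelism" ]; do
    echo "${base}${i}.av"
    i=$((i + 1))
  done
}

# Report every expected file that is missing from the data directory.
# Returns 0 if all files are present, 1 otherwise.
check_inputs() {
  data_dir="$1"
  base="$2"
  parallelism="$3"
  missing=0
  for f in $(expected_inputs "$base" "$parallelism"); do
    if [ ! -f "$data_dir/$f" ]; then
      echo "missing: $data_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_inputs StreamsEmailBenchmark/data filename 2` looks for `filename0.av` and `filename1.av` and lists any that are absent.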
To build the application, go to the root directory of StreamsEmailBenchmark and run `make all PARALLELISM=<parallelism>` at the command line.
To run the application:
- Make sure a Streams instance has been created and started.
- Submit the job to the Streams instance:
  `streamtool submitjob output/Main/Distributed/Main.adl -P filename=<input_file_name> -P windowTime=<flush_interval_for_metrics_in_secs> -P printWindowMetrics=<yes_or_no>`
- Metrics are written to `stdout` in standalone execution and to the logs in distributed execution.
- CPU time can be obtained by visually inspecting the SPL graph in Streams Studio.
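
To avoid retyping the three submission-time parameters, the command can be composed by a small function. This is a sketch only: the `streamtool submitjob` invocation and the parameter names come from the step above, while the function name `submit_cmd`, its argument order, and the sample values are hypothetical.

```shell
# Print the submitjob command for the given parameter values.
# $1: value for filename, $2: value for windowTime (seconds),
# $3: value for printWindowMetrics.
# The command is printed rather than executed so it can be reviewed first.
submit_cmd() {
  input_file="$1"
  window_secs="$2"
  print_metrics="$3"
  echo "streamtool submitjob output/Main/Distributed/Main.adl" \
    "-P filename=${input_file}" \
    "-P windowTime=${window_secs}" \
    "-P printWindowMetrics=${print_metrics}"
}
```

Calling `submit_cmd filename 10 yes` prints the full command line, which can then be run from the root directory of StreamsEmailBenchmark once the Streams instance is up.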