
high-performance-spark-examples

Examples for High Performance Spark

We are in the process of updating this for Spark 3.5+ and the 2nd edition of our book!

Building

Most of the examples can be built with sbt; the C and Fortran components depend on gcc, g77, and cmake.
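As a sketch (assuming sbt and the native toolchain are already on your PATH), a typical build might look like:

```shell
# Build the Scala/JVM examples (assumes sbt is installed)
sbt compile

# The C and Fortran components are built separately with cmake;
# see the repo's build scripts and GitHub workflows for the exact
# invocation, which may vary between releases.
```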

Tests

The full test suite depends on having the C and Fortran components built, as well as a local R installation.

The most accurate way to see how we run the tests is to look at the .github workflows.

History Server

The history server can be a great way to figure out what's going on.

By default, event logs are written to /tmp/spark-events, so you'll need to create that directory if it doesn't already exist:

mkdir -p /tmp/spark-events

The scripts for running the examples generally run with the event log enabled.

You can set SPARK_EVENTLOG=true before running the Scala tests and you'll get event logs for the history server too!

e.g.

SPARK_EVENTLOG=true sbt test

If you want to run just a specific test, you can use sbt's testOnly task.
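For example (the suite name below is a placeholder; substitute the fully qualified class name of the test you want to run):

```shell
# Run a single suite via sbt's testOnly task.
# "com.highperformancespark.examples.MySuite" is a hypothetical name.
sbt "testOnly com.highperformancespark.examples.MySuite"
```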

Then, to view the history server, launch it with ${SPARK_HOME}/sbin/start-history-server.sh and open your local history server in a browser.
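As a sketch, assuming SPARK_HOME points at a local Spark installation:

```shell
# Make sure the default event log directory exists
mkdir -p /tmp/spark-events

# Launch the history server (requires a local Spark install):
#   "${SPARK_HOME}/sbin/start-history-server.sh"
# then browse to http://localhost:18080, the default UI port.
```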