
Example spark-submit using spark-s3-shuffle whilst running with Dynamic Allocation #8

awdavidson opened this issue Nov 7, 2022 · 1 comment


awdavidson commented Nov 7, 2022

It would be good to include an example in the README. While the required setup may be obvious to some developers, others may be unsure.

Using spark-s3-shuffle while running an application with dynamic allocation may trip some people up. Typically, when dynamic allocation is enabled you are also required to enable the external shuffle service. This service may not be available when running Spark on Kubernetes, in which case executors will fail to register with it. The workaround is to enable shuffle tracking and configure the shuffle tracking timeout so that executors can be removed gracefully.

For example, some of the additional configuration required:

--conf spark.executor.extraClassPath=some.jar      # required so executors can load the S3ShuffleManager classes
--conf spark.dynamicAllocation.enabled=true
--conf spark.dynamicAllocation.shuffleTracking.enabled=true

pspoerri commented Nov 27, 2022

Good point! Thank you! These parameters are very well documented on the Spark Configuration page.

You might also need to set

--conf spark.dynamicAllocation.shuffleTracking.timeout=0 

since the stale executors might be kept around otherwise.
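Putting the flags from this thread together, a full spark-submit invocation on Kubernetes might look like the sketch below. This is only an illustration: the master URL, jar paths, bucket name, application jar, and the shuffle-manager class name and root-dir config key are placeholders/assumptions, not confirmed values from the spark-s3-shuffle documentation. The `spark.dynamicAllocation.*` settings are standard Spark configuration.

```shell
# Hypothetical spark-submit sketch for spark-s3-shuffle with dynamic allocation
# on Kubernetes. All paths, URLs, and the shuffle-manager class/config names
# below are placeholders; check the project README for the real values.
spark-submit \
  --master k8s://https://kubernetes.example.com:6443 \
  --deploy-mode cluster \
  --conf spark.shuffle.manager=org.apache.spark.shuffle.S3ShuffleManager \
  --conf spark.executor.extraClassPath=/opt/spark/jars/spark-s3-shuffle.jar \
  --conf spark.driver.extraClassPath=/opt/spark/jars/spark-s3-shuffle.jar \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --conf spark.dynamicAllocation.shuffleTracking.timeout=0 \
  local:///opt/spark/examples/jars/my-app.jar
```

With `shuffleTracking.timeout=0`, executors holding shuffle data become eligible for removal immediately, which matches the intent here since shuffle files live in S3 rather than on the executors themselves.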

I need to spend some time on this.
