Memory Problems with Spark-perf #99

Open
yodha12 opened this issue Jan 29, 2016 · 0 comments
yodha12 commented Jan 29, 2016

I am running the pyspark tests on a cluster with 12 nodes, 20 cores per node, and 60 GB of memory per node. I get output for the first few tests (sort, agg, count, etc.), but when it reaches the broadcast test the job terminates. From the .err file in the results folder I assume it is running out of memory: `ensureFreeSpace(4194304) called with curMem=610484012, maxMem=611642769`. How can I increase the maxMem value? This is the content of my config/config.py file:

```python
COMMON_JAVA_OPTS = [
    # Fraction of JVM memory used for caching RDDs.
    JavaOptionSet("spark.storage.memoryFraction", [0.66]),
    JavaOptionSet("spark.serializer", ["org.apache.spark.serializer.JavaSerializer"]),
    JavaOptionSet("spark.executor.memory", ["9g"]),
```

and

```python
# Set driver memory here
SPARK_DRIVER_MEMORY = "20g"
```

It shows the running command as follows.

```
Setting env var SPARK_SUBMIT_OPTS: -Dspark.storage.memoryFraction=0.66 -Dspark.serializer=org.apache.spark.serializer.JavaSerializer -Dspark.executor.memory=9g -Dspark.locality.wait=60000000 -Dsparkperf.commitSHA=unknown
Running command: /nfs/15/soottikkal/local/spark-1.5.2-bin-hadoop2.6//bin/spark-submit --master spark://r0111.ten.osc.edu:7077 pyspark-tests/core_tests.py BroadcastWithBytes --num-trials=10 --inter-trial-wait=3 --num-partitions=400 --reduce-tasks=400 --random-seed=5 --persistent-type=memory --num-records=200000000 --unique-keys=20000 --key-length=10 --unique-values=1000000 --value-length=10 --broadcast-size=209715200 1>> results/python_perf_output__2016-01-28_23-35-54_logs/python-broadcast-w-bytes.out 2>> results/python_perf_output__2016-01-28_23-35-54_logs/python-broadcast-w-bytes.err
```

Is the spark-submit command actually picking up the memory settings from config.py here? maxMem is only about 611 MB, which looks like 0.66 × Spark's default 1 GB of memory. Changing spark.executor.memory or SPARK_DRIVER_MEMORY in config/config.py has no effect on maxMem, but changing spark.storage.memoryFraction from 0.66 to 0.88 does increase it. How can I raise maxMem so the tests can use the large amount of memory that is already available in the cluster?
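
As a rough sanity check, the maxMem value from the .err file can be worked backwards to the heap size the JVM is actually running with. This sketch assumes Spark 1.5's legacy (static) storage-memory formula, maxMem ≈ JVM max heap × spark.storage.memoryFraction × spark.storage.safetyFraction, with safetyFraction defaulting to 0.9; that formula is an assumption about Spark's internals, not something stated in the log.

```python
# Work backwards from the maxMem reported in the .err log to the JVM heap size,
# assuming Spark 1.5.x legacy memory management:
#   maxMem ~= JVM max heap * spark.storage.memoryFraction * spark.storage.safetyFraction

max_mem_bytes = 611642769      # "maxMem=611642769" from the .err file
memory_fraction = 0.66         # spark.storage.memoryFraction set in config.py
safety_fraction = 0.9          # Spark 1.5 default for spark.storage.safetyFraction

implied_heap_bytes = max_mem_bytes / (memory_fraction * safety_fraction)
print("implied JVM max heap: %.2f GiB" % (implied_heap_bytes / 1024**3))
# Prints roughly 0.96 GiB -- consistent with Spark's default 1g heap rather than
# the 9g executor / 20g driver memory requested in config.py.
```

If that arithmetic holds, the JVM that wrote the log line is running with the default ~1 GB heap, which would mean the memory settings in config.py are not reaching it.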
