
s3a error #8

Open
webroboteu opened this issue May 25, 2019 · 12 comments

Comments

@webroboteu

In the example I have attached, the following problem appears; it seems to be related to how Spark manages shuffle data in the S3 context. Can you confirm that the problem is real, or is it a configuration issue on my side?

ShuffleExample.scala.zip

@webroboteu
Author

This is the main error in CloudWatch:

Caused by: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid directory for output-
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext$DirSelector.getPathForWrite(LocalDirAllocator.java:541)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:627)
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.createTmpFileForWrite(LocalDirAllocator.java:640)
    at org.apache.hadoop.fs.LocalDirAllocator.createTmpFileForWrite(LocalDirAllocator.java:221)
    at org.apache.hadoop.fs.s3a.S3AOutputStream.<init>(S3AOutputStream.java:91)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.create(S3AFileSystem.java:736)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:914)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:895)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:792)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:781)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.writePartitionedFileToS3(BypassMergeSortShuffleWriter.java:269)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.writePartitionedFile(BypassMergeSortShuffleWriter.java:223)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:200)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:6

@webroboteu
Author

If it helps, I can provide the configuration values I entered, but I followed the documentation.

@venkata91
Contributor

venkata91 commented May 28, 2019

Hey @webroboteu,
I remember facing this issue during development. I wanted to get back to it and fix it properly, but if you enable s3a fast upload it should work fine. Can you try setting the flag spark.hadoop.fs.s3a.fast.upload to true, if it's not already set? Also, there have been a lot of changes in the Lambda environment since then, such as the private VPC setup, which you may have seen in the other issues. Let me know if this works.
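For reference, a minimal way to set the flag, assuming a standard spark-submit workflow (the application jar name below is a placeholder); the spark.hadoop. prefix simply forwards the property into the Hadoop configuration:

# set at submit time (your-app.jar is a placeholder)
spark-submit --conf spark.hadoop.fs.s3a.fast.upload=true your-app.jar

# or persist it in conf/spark-defaults.conf
spark.hadoop.fs.s3a.fast.upload true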

@webroboteu
Author

I had already tried these parameters, without success. Now, out of desperation, I was thinking of bypassing the Hadoop interface and managing the stream directly. Is the email you posted on your LinkedIn profile the right one? I would like to add you to my network to discuss the project.

@webroboteu
Author

I'll try again and let you know.

@webroboteu
Author

webroboteu commented May 29, 2019

If I want to recompile it, you suggest using your Hadoop version 2.6.0-qds-0.4.13, but there is no reference to your repository. Can you suggest something for version 2.8, for example?

@venkata91
Contributor

Right. But you can just compile against the open-source Hadoop 2.6.0 and copy the hadoop-aws jar into your binary afterwards; that should work as well. This is a comment I added in another issue: Compiling #2.

Another, easier workaround is to remove the pom.xml additions, basically reverting the commit "Fix pom.xml to have the other Qubole repository location having 2.6.0... (2ca6c68)".

Build your package using this command:

./dev/make-distribution.sh --name spark-lambda-2.1.0 --tgz -Phive -Phadoop-2.7 -DskipTests

And finally, download the jars below and add them to the classpath before starting spark-shell:

1. wget http://central.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar
2. wget http://central.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.3/hadoop-aws-2.7.3.jar

Refer here: https://markobigdata.com/2017/04/23/manipulating-files-from-s3-with-apache-spark/
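For the last step, one common way to put the downloaded jars on the Spark classpath (the paths assume the wget commands above were run in the current directory):

# pass the jars explicitly when starting the shell
spark-shell --jars aws-java-sdk-1.7.4.jar,hadoop-aws-2.7.3.jar

# or copy them into the distribution so they are picked up automatically
cp aws-java-sdk-1.7.4.jar hadoop-aws-2.7.3.jar $SPARK_HOME/jars/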

@webroboteu
Author

Recompiling as you suggest, I get the following error:
Exception in thread "dag-scheduler-event-loop" java.lang.NoSuchMethodError: com.amazonaws.http.AmazonHttpClient.disableStrictHostnameVerification()
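(This NoSuchMethodError usually indicates that the hadoop-aws jar and the aws-java-sdk jar on the classpath come from incompatible releases. A quick way to check which AWS-related jars Spark is actually loading, assuming $SPARK_HOME points at the built distribution:

# list the AWS SDK and hadoop-aws jars bundled with the distribution
ls $SPARK_HOME/jars | grep -i aws)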

@webroboteu
Author

I have a repository with a Docker image: https://github.com/webroboteu/sparklambdadriver
I'm using Hadoop version 2.7 and its dependencies.

@webroboteu
Author

With Hadoop 2.9 and the AWS SDK bundle 1.11.199, using the Docker lines below, there is progress, but I still have to confirm that it works in the Lambda context:

RUN wget http://central.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.199/aws-java-sdk-bundle-1.11.199.jar
RUN wget http://central.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.9.0/hadoop-aws-2.9.0.jar
RUN rm /$SPARK_HOME/jars/aws*.jar
RUN mv aws-java-sdk-bundle-1.11.199.jar /$SPARK_HOME/jars
RUN mv hadoop-aws-2.9.0.jar /$SPARK_HOME/jars
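(As a quick smoke test after swapping the jars, a one-off s3a read can confirm that the SDK bundle and hadoop-aws versions load together; the bucket and key here are placeholders:

# pipe a one-line read through spark-shell; prints the object's line count
echo 'println(spark.read.textFile("s3a://your-bucket/some-key").count())' | $SPARK_HOME/bin/spark-shell)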

@webroboteu
Author

With local execution I now have this problem:

java.lang.NullPointerException
at org.apache.spark.util.Utils$.localFileToS3(Utils.scala:2517)
at org.apache.spark.shuffle.S3ShuffleBlockResolver.writeIndexFileAndCommit(S3ShuffleBlockResolver.scala:177)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:158)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:99)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)

I'll keep you updated.

@webroboteu
Author

I'm heading in the right direction, since I can now recompile it correctly. For some strange reason it tries to load the data from the same executorId, 4775351731:

java.io.FileNotFoundException: No such file or directory: s3://webroboteuquboleshuffle/tmp/executor-driver-4775351731/30/shuffle_0_0_0.index
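(One quick check, assuming the AWS CLI is configured for the same account, is to list the shuffle prefix and confirm whether the index file was ever written:

# list the objects under the executor's shuffle prefix
aws s3 ls s3://webroboteuquboleshuffle/tmp/executor-driver-4775351731/30/)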
