Blocking Issue when running on K8s on GCP #1733
Comments
richiesgr changed the title from Strange Issue when running on K8s on GCP to Blocking Issue when running on K8s on GCP on Dec 2, 2020
This is a bit weird; maybe file creation has some delay on GCP (or there is a
directory-permission or filesystem-quota problem)? If it's an intermittent
problem, take a look at the last few classes in the calling stack to see
whether there is a race condition and whether there are debug flags you can
turn on for debugging.
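For what it's worth, here is a minimal sketch (not Secor's actual code; the class and method names are made up for illustration) of where that kind of failure surfaces in the Avro write path. DataFileWriter.create() ultimately opens the target path with a plain FileOutputStream (see the stack trace quoted below) and does not create missing parent directories, so a deleted or not-yet-created local log directory shows up as a FileNotFoundException:

import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

// Illustrative sketch only, not Secor code.
public class AvroWriterProbe {
    public static DataFileWriter<GenericRecord> openWriter(Schema schema, String path)
            throws IOException {
        File target = new File(path);
        File parent = target.getParentFile();
        // FileOutputStream does not create parent directories, so recreate the
        // tree defensively before handing the file to Avro.
        if (parent != null && !parent.exists() && !parent.mkdirs()) {
            throw new IOException("Could not create directory " + parent);
        }
        DataFileWriter<GenericRecord> writer =
                new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema));
        return writer.create(schema, target);
    }
}

If the directory is being removed underneath a running pod (for example around the preemptible-node restarts), the mkdirs() call above would only mask the symptom, so it is worth finding what deletes the directory first.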
…On Wed, Dec 2, 2020 at 1:05 AM Richard Grossman ***@***.***> wrote:
Hi
I run Secor on Google Cloud Kubernetes.
The node pool uses preemptible machines, meaning each machine restarts every
24 hours.
I've deployed 7 pods.
The data in Kafka is Avro and I write Avro files; the files are uploaded to a
bucket in GCS.
*I read data from 5 topics together; the topics are populated using Mirror
Maker.*
On some pods, after a few hours, I get an exception that causes the pod to
crash-loop:
java.lang.RuntimeException: Failed to write message Message <message binary>
at com.pinterest.secor.consumer.Consumer.handleWriteError(Consumer.java:272)
at com.pinterest.secor.consumer.Consumer.consumeNextMessage(Consumer.java:232)
at com.pinterest.secor.consumer.Consumer.run(Consumer.java:164)
Caused by: java.io.FileNotFoundException: /mnt/secor_csv/message_logs/partition/1_25/prod-og-monitoring_agg_impressions_sg/dt=2020-12-02/hr=02/1_229_00000000000189724222.gz (No such file or directory)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
at org.apache.avro.file.SyncableFileOutputStream.<init>(SyncableFileOutputStream.java:58)
at org.apache.avro.file.DataFileWriter.create(DataFileWriter.java:134)
at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory$AvroFileWriter.<init>(AvroFileReaderWriterFactory.java:131)
at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory.BuildFileWriter(AvroFileReaderWriterFactory.java:73)
at com.pinterest.secor.util.ReflectionUtil.createFileWriter(ReflectionUtil.java:156)
at com.pinterest.secor.common.FileRegistry.getOrCreateWriter(FileRegistry.java:138)
at com.pinterest.secor.writer.MessageWriter.write(MessageWriter.java:104)
at com.pinterest.secor.consumer.Consumer.writeMessage(Consumer.java:256)
at com.pinterest.secor.consumer.Consumer.consumeNextMessage(Consumer.java:229)
This only happens when using the AvroFileReaderWriterFactory.
Hi
I run Secor on Google Cloud Kubernetes.
The node pool uses preemptible machines, meaning each machine restarts every 24 hours.
I've deployed 7 pods.
The data in Kafka is Avro and I write Avro files; the files are uploaded to a bucket in GCS.
I read data from 5 topics together; the topics are populated using Mirror Maker.
On some pods, after a few hours, I get an exception that causes the pod to crash-loop.
I CANNOT RECOVER FROM HERE: all the failing pods crash the same way when restarted.
This only happens when using the AvroFileReaderWriterFactory.
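To narrow down whether the local log directory is missing, unwritable, or out of quota inside a failing pod, a hypothetical stand-alone probe (not part of Secor; the default path below is copied from the exception message above) could look like this:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic, not part of Secor. Run inside a failing pod.
public class LocalLogDirProbe {
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : "/mnt/secor_csv/message_logs");
        System.out.println("exists:    " + Files.exists(dir));
        System.out.println("directory: " + Files.isDirectory(dir));
        System.out.println("writable:  " + Files.isWritable(dir));
        if (Files.isDirectory(dir)) {
            // An actual write attempt; read-only mounts or exhausted quotas can
            // fail here even when isWritable() reports true.
            Path probe = Files.createTempFile(dir, "secor-probe", ".tmp");
            System.out.println("created:   " + probe);
            Files.delete(probe);
        }
    }
}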