occasional SIGSEGV #156

Open
zp-stripe opened this issue Nov 10, 2023 · 7 comments

Comments

@zp-stripe

Thank you for taking the time to help improve OpenJDK and Corretto.

If your request concerns a security vulnerability then please report it by email to [email protected] instead of here. (You can find more information regarding security issues at https://aws.amazon.com/security/vulnerability-reporting/.)

Otherwise, if your issue concerns OpenJDK and is not specific to Corretto we ask that you raise it to the OpenJDK community. Depending on your contributor status for OpenJDK, please use the JDK bug system or the appropriate mailing list for the given problem area or update project.

If your issue is specific to Corretto, then you are in the right place. Please proceed with the following.

Describe the bug

A clear and concise description of what the bug is.

#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=2405, tid=4116

---------------  T H R E A D  ---------------

Current thread (0x0000ffcb10179800):  JavaThread "20231017_192415_17827_dcx6n.2.41.0-24-81" [_thread_in_Java, id=4116, stack(0x0000ffc9ee400000,0x0000ffc9ee600000)]

Stack: [0x0000ffc9ee400000,0x0000ffc9ee600000],  sp=0x0000ffc9ee5fd890,  free space=2038k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  0xfffffffffffffffc
j  java.lang.invoke.LambdaForm$MH+0x000000c002aee000.invoke(Ljava/lang/Object;Ljava/lang/Object;I)Ljava/lang/Object;+42 [email protected]
j  java.lang.invoke.Invokers$Holder.linkToTargetMethod(Ljava/lang/Object;ILjava/lang/Object;)Ljava/lang/Object;+6 [email protected]
j  io.trino.$gen.PageFilter_20231017_192428_32448.filter(Lio/trino/spi/connector/ConnectorSession;Lio/trino/spi/Page;I)Z+119
j  io.trino.$gen.PageFilter_20231017_192428_32448.filter(Lio/trino/spi/connector/ConnectorSession;Lio/trino/spi/Page;)Lio/trino/operator/project/SelectedPositions;+61
J 97113 c2 io.trino.operator.project.PageProcessor.createWorkProcessor(Lio/trino/spi/connector/ConnectorSession;Lio/trino/operator/DriverYieldSignal;Lio/trino/memory/context/LocalMemoryContext;Lio/trino/operator/project/PageProcessorMetrics;Lio/trino/spi/Page;Z)Lio/trino/operator/WorkProcessor; (242 bytes) @ 0x0000ffff87116a2c [0x0000ffff87116500+0x000000000000052c]
J 97114 c2 io.trino.operator.ScanFilterAndProjectOperator$SplitToPages$$Lambda$4237+0x000000c002614000.apply(Ljava/lang/Object;)Ljava/lang/Object; (16 bytes) @ 0x0000ffff87115bb4 [0x0000ffff87115b40+0x0000000000000074]
J 53192 c2 io.trino.operator.WorkProcessorUtils$$Lambda$3603+0x000000c0022f4690.process(Ljava/lang/Object;)Lio/trino/operator/WorkProcessor$TransformationState; (9 bytes) @ 0x0000ffff817e298c [0x0000ffff817e2940+0x000000000000004c]
J 116578 c2 io.trino.operator.WorkProcessorUtils$3.process()Lio/trino/operator/WorkProcessor$ProcessState; (226 bytes) @ 0x0000ffff899b08f0 [0x0000ffff899b06c0+0x0000000000000230]
J 116578 c2 io.trino.operator.WorkProcessorUtils$3.process()Lio/trino/operator/WorkProcessor$ProcessState; (226 bytes) @ 0x0000ffff899b077c [0x0000ffff899b06c0+0x00000000000000bc]
J 116578 c2 io.trino.operator.WorkProcessorUtils$3.process()Lio/trino/operator/WorkProcessor$ProcessState; (226 bytes) @ 0x0000ffff899b077c [0x0000ffff899b06c0+0x00000000000000bc]
J 37980 c2 io.trino.operator.WorkProcessorUtils$BlockingProcess.process()Lio/trino/operator/WorkProcessor$ProcessState; (75 bytes) @ 0x0000ffff7fb39e6c [0x0000ffff7fb39dc0+0x00000000000000ac]
J 31748 c2 io.trino.operator.WorkProcessorUtils$$Lambda$3605+0x000000c0022f5130.process(Ljava/lang/Object;)Lio/trino/operator/WorkProcessor$TransformationState; (8 bytes) @ 0x0000ffff7f35eaa0 [0x0000ffff7f35ea00+0x00000000000000a0]
J 116578 c2 io.trino.operator.WorkProcessorUtils$3.process()Lio/trino/operator/WorkProcessor$ProcessState; (226 bytes) @ 0x0000ffff899b08f0 [0x0000ffff899b06c0+0x0000000000000230]
J 116578 c2 io.trino.operator.WorkProcessorUtils$3.process()Lio/trino/operator/WorkProcessor$ProcessState; (226 bytes) @ 0x0000ffff899b077c [0x0000ffff899b06c0+0x00000000000000bc]
J 40576 c2 io.trino.operator.WorkProcessorUtils$$Lambda$4067+0x000000c0023c8000.process()Lio/trino/operator/WorkProcessor$ProcessState; (12 bytes) @ 0x0000ffff7ffc3064 [0x0000ffff7ffc2fc0+0x00000000000000a4]
J 40603 c2 io.trino.operator.WorkProcessorUtils$$Lambda$4069+0x000000c0023c8450.process()Lio/trino/operator/WorkProcessor$ProcessState; (12 bytes) @ 0x0000ffff7ffe03c0 [0x0000ffff7ffe0300+0x00000000000000c0]
J 40575 c2 io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput()Lio/trino/spi/Page; (41 bytes) @ 0x0000ffff7ffc2264 [0x0000ffff7ffc21c0+0x00000000000000a4]
J 35196 c2 io.trino.operator.Driver.processInternal(Lio/trino/operator/OperationTimer;)Lcom/google/common/util/concurrent/ListenableFuture; (667 bytes) @ 0x0000ffff7f78ca78 [0x0000ffff7f78c640+0x0000000000000438]
J 65796 c2 io.trino.operator.Driver.lambda$process$8(JI)Lcom/google/common/util/concurrent/ListenableFuture; (266 bytes) @ 0x0000ffff82e52cb4 [0x0000ffff82e52880+0x0000000000000434]
J 62786 c2 io.trino.operator.Driver$$Lambda$3696+0x000000c002303a48.get()Ljava/lang/Object; (16 bytes) @ 0x0000ffff81ae3c88 [0x0000ffff81ae3c40+0x0000000000000048]
J 37105 c2 io.trino.operator.Driver.process(Lio/airlift/units/Duration;I)Lcom/google/common/util/concurrent/ListenableFuture; (93 bytes) @ 0x0000ffff7f8f9750 [0x0000ffff7f8f9180+0x00000000000005d0]
J 37263 c2 io.trino.operator.Driver.processForDuration(Lio/airlift/units/Duration;)Lcom/google/common/util/concurrent/ListenableFuture; (9 bytes) @ 0x0000ffff7f9304fc [0x0000ffff7f9304c0+0x000000000000003c]
J 105866 c2 io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(Lio/airlift/units/Duration;)Lcom/google/common/util/concurrent/ListenableFuture; (84 bytes) @ 0x0000ffff87ff0d74 [0x0000ffff87ff0c40+0x0000000000000134]
J 50352 c2 io.trino.execution.executor.PrioritizedSplitRunner.process()Lcom/google/common/util/concurrent/ListenableFuture; (355 bytes) @ 0x0000ffff8058762c [0x0000ffff80586d00+0x000000000000092c]
J 64685% c2 io.trino.execution.executor.TaskExecutor$TaskRunner.run()V (621 bytes) @ 0x0000ffff82bf5fc4 [0x0000ffff82bf5880+0x0000000000000744]
j  io.trino.$gen.Trino_414_stripe_6____20231017_085628_2.run()V+4
j  java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+92 [email protected]
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 [email protected]
j  java.lang.Thread.run()V+11 [email protected]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x7c1c74]  JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, JavaThread*)+0x244
V  [libjvm.so+0x7c3280]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, JavaThread*)+0x180
V  [libjvm.so+0x874ce0]  thread_entry(JavaThread*, JavaThread*)+0x70
V  [libjvm.so+0xdb0aa8]  JavaThread::thread_main_inner()+0xa8
V  [libjvm.so+0xdb5778]  Thread::call_run()+0xb8
V  [libjvm.so+0xb45734]  thread_native_entry(Thread*)+0xdc
C  [libpthread.so.0+0x7624]  start_thread+0x184

To Reproduce

Steps and (source) code to reproduce the behavior.
Unable to reproduce consistently

Expected behavior

A clear and concise description of what you expected to happen.

Screenshots

If applicable, add screenshots to help explain your problem.

Platform information

OS: Amazon Linux 2
Version: Corretto-17.0.9.8.1

Additional context

Add any other context about the problem here.

For VM crashes, please attach the error report file. By default the file name is hs_err_pid<pid>.log, where <pid> is the process ID of the process.
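A minimal sketch for collecting those report files from a host, in case it helps. It assumes the JVM was started without -XX:ErrorFile, in which case HotSpot tries the process working directory first and falls back to the temp directory. The helper name find_hs_err_logs is made up for illustration.

```python
import glob
import os
import tempfile


def find_hs_err_logs(directories):
    """Return hs_err_pid*.log paths found in the given directories, newest first."""
    found = []
    for directory in directories:
        # Default report name is hs_err_pid<pid>.log, so match on that prefix.
        found.extend(glob.glob(os.path.join(directory, "hs_err_pid*.log")))
    return sorted(found, key=os.path.getmtime, reverse=True)


if __name__ == "__main__":
    # Assumption: default locations only (working directory, then temp dir).
    for path in find_hs_err_logs([os.getcwd(), tempfile.gettempdir()]):
        print(path)
```

On hosts that get replaced after a crash, starting the JVM with something like -XX:ErrorFile=/some/persistent/dir/hs_err_pid%p.log (the directory here is a placeholder) keeps the report somewhere that survives.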

@eastig
Member

eastig commented Nov 13, 2023

Hi @zp-stripe,
Thank you for reporting the issue.
Could you please attach the hs_err_*.log files if you have them?
In the provided stack trace I see [email protected], which means Corretto 17.0.8. However, the version given under Platform information is 17.0.9.
Could you please check whether the crash also happens on Corretto 17.0.9.8.1?

Thanks

@zp-stripe
Author

I got that version from running java --version as mentioned in the prompt:

zp@host:~$ java --version
openjdk 17.0.9 2023-10-17 LTS
OpenJDK Runtime Environment Corretto-17.0.9.8.1 (build 17.0.9+8-LTS)
OpenJDK 64-Bit Server VM Corretto-17.0.9.8.1 (build 17.0.9+8-LTS, mixed mode, sharing)

Here's another log snippet that confirms the version:

[2023-11-14 22:07:48.227082] # A fatal error has been detected by the Java Runtime Environment:
[2023-11-14 22:07:48.227090] #
[2023-11-14 22:07:48.227107] # SIGSEGV (0xb) at pc=0x0000000000000000, pid=2398, tid=4130
[2023-11-14 22:07:48.227114] #
[2023-11-14 22:07:48.227127] # JRE version: OpenJDK Runtime Environment Corretto-17.0.9.8.1 (17.0.9+8) (build 17.0.9+8-LTS)
[2023-11-14 22:07:48.227163] # Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.9.8.1 (17.0.9+8-LTS, mixed mode, sharing, tiered, compressed class ptrs, g1 gc, linux-aarch64)
[2023-11-14 22:07:48.227172] # Problematic frame:
[2023-11-14 22:07:48.227180] # C 0xfffffffffffffffc

Maybe the minor version changed recently; I'm not sure.

I don't have an hs_err_ file on hand right now because the hosts were replaced, but I can try to get one soon when the problem reoccurs and I will attach it here. Thanks.
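For comparing crashes across hosts, the header lines in snippets like the one above can be pulled apart with a few regular expressions. This is only a sketch based on the lines shown in this thread, not a full hs_err parser. The function name parse_crash_header is hypothetical, and it assumes the lines appear in the order the JVM writes them.

```python
import re

# Patterns modelled on the header lines quoted in this thread.
SIGNAL_RE = re.compile(
    r"#\s+(SIG[A-Z]+) \(0x[0-9a-f]+\) at pc=(0x[0-9a-f]+), pid=(\d+), tid=(\d+)"
)
JRE_RE = re.compile(r"# JRE version: .*? \(([^)]+)\)")
VM_RE = re.compile(r"# Java VM: .*\(([^)]*)\)")


def parse_crash_header(text):
    """Extract signal, pc, pid, tid, version, platform and problematic frame."""
    info = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        m = SIGNAL_RE.search(line)
        if m:
            info["signal"] = m.group(1)
            info["pc"] = m.group(2)
            info["pid"] = int(m.group(3))
            info["tid"] = int(m.group(4))
        m = JRE_RE.search(line)
        if m:
            info["version"] = m.group(1)
        m = VM_RE.search(line)
        if m:
            # The platform (e.g. linux-aarch64) is the last comma-separated item.
            info["platform"] = m.group(1).split(",")[-1].strip()
        if "# Problematic frame:" in line and i + 1 < len(lines):
            # The frame is on the following line; drop any log prefix before '#'.
            info["frame"] = lines[i + 1].split("#", 1)[-1].strip()
    return info
```

Any timestamp prefix before the `#` is ignored, so the function can be fed raw service logs as well as the hs_err file itself.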

@zp-stripe
Author

hs_err_pid2392.log

Here is the hs_err file

@eastig
Member

eastig commented Nov 15, 2023

@zp-stripe According to the hs_err file, you are using Trino 414. On https://github.com/trinodb/trino I see that the latest version is 433. Could you please check whether the crash also happens on version 433?

@simonis
Contributor

simonis commented Nov 16, 2023

The crashes you've reported are all on aarch64. Have you also observed them on x86_64 or are you running exclusively on aarch64?

@feser

feser commented Dec 1, 2023

We got a SIGSEGV after upgrading to 17.0.9.8.1.
Do you think it is related to this issue, or to the async-profiler issue that was supposed to be fixed in 17.0.9?

Unfortunately, I cannot get the hs_err_pid1.log.


#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f414da20889, pid=1, tid=578
#
# JRE version: OpenJDK Runtime Environment Corretto-17.0.9.8.1 (17.0.9+8) (build 17.0.9+8-LTS)
# Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.9.8.1 (17.0.9+8-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x253889] forte_fill_call_trace_given_top(JavaThread*, ASGCT_CallTrace*, int, frame) [clone .isra.20]+0x15d
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to //core.1)
#
# The JFR repository may contain useful JFR files. Location: /tmp/2023_11_30_12_57_36_1
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
# https://github.com/corretto/corretto-17/issues/
#
[error occurred during error reporting (), id 0xb, SIGSEGV (0xb) at pc=0x00007f414ee0623b]

@benty-amzn
Contributor

Unfortunately, if that's the full output available and we don't have the hs_err file, it's nearly impossible to say. That log doesn't specify where the crash occurred.
