[Native Image] SIGILL Crash in GraalVM 23.0.1 on Amazon Linux 2023 with ARM64 Architecture Without -XX:UseSVE=0 #10458

Open
francis-a opened this issue Jan 11, 2025 · 1 comment

francis-a commented Jan 11, 2025

Describe the Issue

When running GraalVM version 23.0.1 on an ARM64-based system using Amazon Linux 2023, invoking the GraalVM java binary results in a fatal crash (SIGILL) unless the JVM argument -XX:UseSVE=0 is explicitly set. This appears to be related to the handling of SVE instructions on the platform.

GraalVM Version

openjdk version "23.0.1" 2024-10-15
OpenJDK Runtime Environment GraalVM CE 23.0.1+11.1 (build 23.0.1+11-jvmci-b01)
OpenJDK 64-Bit Server VM GraalVM CE 23.0.1+11.1 (build 23.0.1+11-jvmci-b01, mixed mode, sharing)

Operating System and Version

amazonlinux:2023

Run Command

/usr/lib/graalvm/bin/java -version

Expected Behavior

The java command should print the version information and exit without any errors.

Actual Behavior

The java command results in a fatal error with the following message:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGILL (0x4) at pc=0x0000ffff7a7d8ce8, pid=15, tid=16
#
# JRE version:  (23.0.1+11) (build )
# Java VM: OpenJDK 64-Bit Server VM (23.0.1+11-jvmci-b01, mixed mode, sharing, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
# Problematic frame:
# j  java.lang.System.registerNatives()V+0 java.base
#
# Core dump will be written. Default location: /project/core
#
# An error report file with more information is saved as:
# /project/hs_err_pid15.log
[0.020s][warning][os] Loading hsdis library failed
#
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
Aborted (core dumped)

Steps to Reproduce

Use the following Dockerfile to build an image on an ARM64 platform:

FROM --platform=linux/arm64 public.ecr.aws/amazonlinux/amazonlinux:2023

RUN yum -y update \
    && yum install -y unzip tar gzip gcc gcc-c++ gcc-gfortran \
    libcurl-devel openssl openssl-devel \
    zlib-devel glibc-static zlib-static \
    python3-pip \
    && rm -rf /var/cache/yum

# Graal VM
ENV GRAAL_VERSION 23.0.1
RUN curl -4 -L https://github.com/graalvm/graalvm-ce-builds/releases/download/jdk-${GRAAL_VERSION}/graalvm-community-jdk-${GRAAL_VERSION}_linux-aarch64_bin.tar.gz | tar -xvz
RUN mv graalvm-community-openjdk* /usr/lib/graalvm
ENV JAVA_HOME /usr/lib/graalvm

VOLUME /project
WORKDIR /project

ENTRYPOINT ["sh"]

Build the image:

docker build -t graalvm-bug-report .

Run the container:

docker run --rm -it graalvm-bug-report

Inside the container, run:

/usr/lib/graalvm/bin/java -version

Additional Context

Running /usr/lib/graalvm/bin/java -XX:UseSVE=0 -version in the Docker image prints the version information as expected.
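
To avoid passing the flag on every invocation, the workaround could be applied through the JAVA_TOOL_OPTIONS environment variable, which the HotSpot launcher reads at startup. The following is only a sketch: the helper name disable_sve is made up for illustration, and it assumes every JVM started from this shell is an AArch64 build that recognizes -XX:UseSVE.

```shell
# Hypothetical helper: append -XX:UseSVE=0 to JAVA_TOOL_OPTIONS
# unless the flag is already present.
disable_sve() {
  case " ${JAVA_TOOL_OPTIONS:-} " in
    *" -XX:UseSVE=0 "*) ;;  # already set, nothing to do
    *) JAVA_TOOL_OPTIONS="${JAVA_TOOL_OPTIONS:+$JAVA_TOOL_OPTIONS }-XX:UseSVE=0" ;;
  esac
  export JAVA_TOOL_OPTIONS
}

disable_sve
# Then, inside the container:
#   /usr/lib/graalvm/bin/java -version
```

The same variable could also be set with an ENV line in the Dockerfile. This only hides the crash and does not address the underlying cause.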

francis-a commented Jan 12, 2025

After digging into this issue a bit more, it looks like this may be related to the host machine I am running my container on. I am using an M4 Mac and running Docker with Rancher Desktop.

After switching between the QEMU and VZ virtualization options without success, I came across this tangentially related Stack Overflow post: https://stackoverflow.com/questions/79312200/gdb-error-unable-to-fetch-sve-ssve-vector-length-invalid-argument-on-docker.

By starting Docker using Colima and setting nestedVirtualization to true, I am now able to run nativeCompile in my Docker image.

I have to admit, I don't know enough about the root cause to understand whether this is a general issue with Apple M4s or something specific to my setup. I also don't know if there is anything the GraalVM team could do to help. I would be really interested to hear from anyone on the team who understands more: is this something that could be fixed in GraalVM itself?
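
One way to narrow this down might be to compare which vector extensions the guest kernel advertises under each virtualization setup. The sketch below reads the Features line of /proc/cpuinfo inside the container. The helper name has_cpu_feature is made up for illustration, and the idea that the crash is tied to how SVE or SME is advertised to the guest is an unconfirmed assumption.

```shell
# Hypothetical helper: print "yes" if the Features line of a
# cpuinfo-style file lists the given flag as a whole word, else "no".
has_cpu_feature() {
  feature="$1"
  cpuinfo="${2:-/proc/cpuinfo}"
  if grep -i '^Features' "$cpuinfo" 2>/dev/null | grep -qw "$feature"; then
    echo yes
  else
    echo no
  fi
}

echo "sve:  $(has_cpu_feature sve)"
echo "sve2: $(has_cpu_feature sve2)"
echo "sme:  $(has_cpu_feature sme)"
```

Running this once under Rancher Desktop and once under Colima with nestedVirtualization enabled would show whether the two environments expose different feature sets.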

I was also able to reproduce this issue using the standard GraalVM docker image on my local M4 Mac.

FROM --platform=linux/aarch64 ghcr.io/graalvm/native-image-community:23
ENTRYPOINT ["sh"]

For me, running java --version from this image also fails with a SIGILL.
