Seg Fault Error using Level Zero Compute Runtime 24.09.28717.12 and Profiler On #383

Open
jjfumero opened this issue Apr 22, 2024 · 0 comments

Describe the bug

The levelzero-jni library crashes with a segmentation fault when running with the profiler enabled. The fault occurs in the JNI call to the function zeEventPoolCreate. It can also be reproduced using levelzero-jni as a standalone library.

The error stack is as follows:

Stack: [0x00007f451bc74000,0x00007f451bd74000],  sp=0x00007f451bd71ac8,  free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libze_intel_gpu.so.1+0x1300e5]
C  [libze_intel_gpu.so.1+0x114bc6]
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  uk.ac.manchester.tornado.drivers.spirv.levelzero.LevelZeroContext.zeEventPoolCreate_native(JLuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventPoolDescriptor;IJLuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventPoolHandle;)I+0 [email protected]
j  uk.ac.manchester.tornado.drivers.spirv.levelzero.LevelZeroContext.zeEventPoolCreate(JLuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventPoolDescriptor;IJLuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventPoolHandle;)I+9 [email protected]
j  uk.ac.manchester.tornado.drivers.spirv.timestamps.LevelZeroKernelTimeStamp.createEventPoolAndEvents(Luk/ac/manchester/tornado/drivers/spirv/levelzero/LevelZeroContext;Luk/ac/manchester/tornado/drivers/spirv/levelzero/LevelZeroDevice;Luk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventPoolHandle;IILuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeEventHandle;)V+35 [email protected]
j  uk.ac.manchester.tornado.drivers.spirv.timestamps.LevelZeroKernelTimeStamp.createEventTimer()V+59 [email protected]
j  uk.ac.manchester.tornado.drivers.spirv.graal.SPIRVLevelZeroInstalledCode.launchKernelWithLevelZero(JLuk/ac/manchester/tornado/drivers/spirv/levelzero/ZeKernelHandle;Luk/ac/manchester/tornado/drivers/spirv/graal/SPIRVLevelZeroInstalledCode$DeviceThreadScheduling;Luk/ac/manchester/tornado/drivers/spirv/graal/SPIRVLevelZeroInstalledCode$ThreadBlockDispatcher;)V+131 [email protected]
j  uk.ac.manchester.tornado.drivers.spirv.graal.SPIRVLevelZeroInstalledCode.launchWithoutDependencies(JLuk/ac/manchester/tornado/runtime/common/KernelStackFrame;Luk/ac/manchester/tornado/api/memory/XPUBuffer;Luk/ac/manchester/tornado/runtime/tasks/meta/TaskMetaData;J)I+194 [email protected]
j  uk.ac.manchester.tornado.runtime.interpreter.TornadoVMInterpreter.executeLaunch(Ljava/lang/StringBuilder;IIIJJLuk/ac/manchester/tornado/runtime/interpreter/TornadoVMInterpreter$XPUExecutionFrame;)I+904 [email protected]

This issue is only reproducible with a recent version of the Intel Compute Runtime (https://github.com/intel/compute-runtime), such as 24.09.28717.12 for Fedora 39.

If I use a previous version (23.05.25593.18), there are no errors.
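
When triaging which driver releases are affected, it can help to order Compute Runtime version strings numerically rather than lexically. The following is a minimal sketch and is not part of TornadoVM or levelzero-jni: the class `RuntimeVersion` and its methods are hypothetical names, and it assumes versions are dot-separated non-negative integers like the two quoted above. It only orders versions; the release in which the crash first appears is not known from this report.

```java
import java.util.Arrays;

// Hypothetical helper for ordering Compute Runtime version strings.
final class RuntimeVersion {

    // Splits "24.09.28717.12" into {24, 9, 28717, 12}.
    static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Compares field by field; missing trailing fields are treated as 0.
    // Returns a negative value if a < b, zero if equal, positive if a > b.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String knownGood = "23.05.25593.18"; // no errors, per this report
        String failing = "24.09.28717.12";   // crashes, per this report
        // prints "true": the known-good version is older than the failing one
        System.out.println(compare(knownGood, failing) < 0);
    }
}
```

A numeric comparison is used because a plain string comparison would misorder fields of different widths (for example, "9" versus "10").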

How To Reproduce

# Using the levelzero-jni repo: https://github.com/beehive-lab/levelzero-jni
./scripts/events.sh

Expected behavior

Run without errors.

Computing system setup (please complete the following information):

  • OS: Fedora 39
  • OpenCL and Driver versions
  • If applicable, PTX and CUDA Driver versions
  • If applicable, Level Zero & SPIR-V Versions:
  • TornadoVM commit id: 80d9a61
  • Linux Kernel: Linux 6.8.6-200.fc39.x86_64

Additional context

N/A.

@jjfumero jjfumero self-assigned this Apr 22, 2024
@jjfumero jjfumero added bug Something isn't working level-zero labels Apr 22, 2024