PTX compilation fails when the bitwise NOT operator "~" is applied to an int:
```
.version 7.6
.target sm_61
.address_size 64

.visible .entry s0_t0_invert_arrays_intarray_arrays_intarray_4096(.param .u64 .ptr .global .align 8 kernel_context, .param .u64 .ptr .global .align 8 a, .param .u64 .ptr .global .align 8 b, .param .align 8 .u64 size) {
    .reg .s64 rsd<3>;
    .reg .u32 rui<5>;
    .reg .u64 rud<7>;
    .reg .pred rpb<2>;
    .reg .s32 rsi<8>;

BLOCK_0:
    ld.param.u64 rud0, [kernel_context];
    ld.param.u64 rud1, [a];
    ld.param.u64 rud2, [b];
    mov.u32 rui0, %nctaid.x;
    mov.u32 rui1, %ntid.x;
    mul.wide.u32 rud3, rui0, rui1;
    cvt.s32.u64 rsi0, rud3;
    mov.u32 rui2, %tid.x;
    mov.u32 rui3, %ctaid.x;
    mad.lo.s32 rsi1, rui3, rui1, rui2;

BLOCK_1:
    mov.s32 rsi2, rsi1;

LOOP_COND_1:
    setp.lt.s32 rpb0, rsi2, 4096;
    @!rpb0 bra BLOCK_3;

BLOCK_2:
    add.s32 rsi3, rsi2, 6;
    cvt.s64.s32 rsd0, rsi3;
    shl.b64 rsd1, rsd0, 2;
    add.u64 rud4, rud1, rsd1;
    ld.global.s32 rsi4, [rud4];
    add.u64 rud5, rud2, rsd1;
    not.rn.b32 rsi5, rsi4;
    st.global.s32 [rud5], rsi5;
    add.s32 rsi6, rsi0, rsi2;
    mov.s32 rsi2, rsi6;
    bra.uni LOOP_COND_1;

BLOCK_3:
    ret;
}

[TornadoVM-PTX-JNI] ERROR : cuModuleLoadData -> Returned: 218
PTX to cubin JIT compilation failed! (218)
PTX JIT compilation failed!
[Bailout] Running the sequential implementation. Enable --debug to see the reason.
```
Run the following code:
```java
import uk.ac.manchester.tornado.api.ImmutableTaskGraph;
import uk.ac.manchester.tornado.api.TaskGraph;
import uk.ac.manchester.tornado.api.TornadoExecutionPlan;
import uk.ac.manchester.tornado.api.annotations.Parallel;
import uk.ac.manchester.tornado.api.enums.DataTransferMode;
import uk.ac.manchester.tornado.api.types.arrays.IntArray;

public class Main {

    public static void invert(IntArray a, IntArray b, int size) {
        for (@Parallel int i = 0; i < size; i++) {
            b.set(i, ~a.get(i));
        }
    }

    public static void main(String[] args) {
        int size = 4096;
        IntArray a = new IntArray(size);
        IntArray b = new IntArray(size);
        a.init(1);
        b.init(0);

        TaskGraph graph = new TaskGraph("s0")
                .transferToDevice(DataTransferMode.FIRST_EXECUTION, a, b)
                .task("t0", Main::invert, a, b, size)
                .transferToHost(DataTransferMode.EVERY_EXECUTION, b);

        ImmutableTaskGraph immutableTaskGraph = graph.snapshot();
        TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(immutableTaskGraph);
        executionPlan.execute();
    }
}
```
It should compile and run normally.
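To make "run normally" concrete, here is a minimal sequential reference for the expected result, written with plain `int[]` arrays rather than TornadoVM's `IntArray`. The class name `InvertReference` and both helper methods are hypothetical and only for illustration. Because the failing run bails out to the sequential implementation, a check like this confirms the values in `b` but does not show that the kernel ran on the GPU.

```java
import java.util.Arrays;

public class InvertReference {

    // Sequential reference for the kernel in the reproducer: b[i] = ~a[i].
    public static int[] invert(int[] a) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            b[i] = ~a[i];
        }
        return b;
    }

    // True when 'actual' holds the bitwise complement of every element of 'input'.
    public static boolean matchesReference(int[] input, int[] actual) {
        return Arrays.equals(invert(input), actual);
    }

    public static void main(String[] args) {
        int[] a = new int[4096];
        Arrays.fill(a, 1);          // mirrors a.init(1) in the reproducer
        int[] b = invert(a);
        System.out.println(b[0]);   // ~1 is -2 in two's complement
        System.out.println(matchesReference(a, b));
    }
}
```

With the input from the reproducer (every element set to 1), each element of the output should be -2.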
The same code works fine with the OpenCL backend.
Hi @Benco11-developement, thank you for the report. We will take a look.