Hello, I compiled the latest FlashInfer on Orin and the build failed with the following linker error.
[ 0%] Built target fpA_intB_cutlass_objs
[ 1%] Built target project_libbacktrace
[ 1%] Built target tvm_libinfo_objs
[ 10%] Built target prefill_kernels
[ 14%] Built target decode_kernels
[ 16%] Built target fpA_intB_gemm
[ 16%] Built target fpA_intB_gemm_tvm
[ 17%] Built target flash_attn
[ 26%] Built target tvm_runtime_objs
[ 26%] Built target flashinfer_tvm
[ 26%] Linking CXX shared library libtvm_runtime.so
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_1_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_4_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_4_head_128_layout_1_posenc_1_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_6_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_6_head_128_layout_1_posenc_1_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_8_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_8_head_128_layout_1_posenc_1_dtypein_f16_dtypeout_f16.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /workspace/tvm/3rdparty/flashinfer/include/flashinfer/attention/cascade.cuh:149: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here
/usr/bin/ld: 3rdparty/flashinfer/libdecode_kernels.a(batch_paged_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16_idtype_i32.cu.o): in function cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity_impl(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)': /usr/local/cuda/include/cuda/pipeline:242: multiple definition of cuda::__3::pipeline<(cuda::__3::thread_scope)2>::__barrier_try_wait_parity(cuda::__3::barrier<(cuda::__3::thread_scope)2, cuda::std::__3::__empty_completion>&, bool)'; 3rdparty/flashinfer/libdecode_kernels.a(single_decode_group_1_head_128_layout_1_posenc_0_dtypein_f16_dtypeout_f16.cu.o):/usr/local/cuda/include/cuda/pipeline:242: first defined here