ERROR: /github/home/.cache/bazel/_bazel_root/197a057057a49e5811107144e2d78508/external/xla/xla/stream_executor/cuda/BUILD:450:19: no such target '@local_config_cuda//cuda:implicit_cuda_headers_dependency': target 'implicit_cuda_headers_dependency' not declared in package 'cuda' defined by /github/home/.cache/bazel/_bazel_root/197a057057a49e5811107144e2d78508/external/local_config_cuda/cuda/BUILD (Tip: use query "@local_config_cuda//cuda:*" to see all the targets in that package) and referenced by '@xla//xla/stream_executor/cuda:delay_kernel_cuda_cuda'
ERROR: /github/home/.cache/bazel/_bazel_root/197a057057a49e5811107144e2d78508/external/llvm-project/llvm/BUILD.bazel:251:11: Compiling llvm/lib/Support/Valgrind.cpp [for tool] failed: undeclared inclusion(s) in rule '@llvm-project//llvm:Support':
this rule is missing dependency declarations for the following files included by 'llvm/lib/Support/Valgrind.cpp':
'/usr/lib/clang/11.0.1/include/stddef.h'
'/usr/lib/clang/11.0.1/include/__stddef_max_align_t.h'
🐛 Bug
After updating XLA pin from 32ebd694c4d0442e241d76324ff1a721831366b4 to 590cd6fcd1ed24ab9cf494789a0fc524b94a4a6a in PR https://github.com/pytorch/xla/pull/8079/files
Our CI has the following failure:
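https://github.com/pytorch/xla/actions/runs/11060810258/job/30732124138?pr=8079
The target that fails to build is `@xla//xla/pjrt/c:pjrt_c_api_gpu_plugin.so` (built with `bazel build`), which is not our target.
The exact error is the `no such target '@local_config_cuda//cuda:implicit_cuda_headers_dependency'` error in the first log excerpt above.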
This `@local_config_cuda` repository is defined using the `cuda_configure` Starlark function from upstream (https://github.com/google/tsl).
The definition was copied by following the deprecated section of this doc: https://github.com/openxla/xla/blob/main/docs/hermetic_cuda.md#deprecated-non-hermetic-cudacudnn-usage
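To see which targets the generated `@local_config_cuda//cuda` package actually declares, the tip in the Bazel error can be followed. The sketch below is only an illustration: it pulls the missing label and the referencing target out of the error text and prints the matching `bazel query` command. The helper names (`missing_target`, `referenced_by`, `query_command`) are hypothetical, and the cache path in the sample string is shortened.

```python
import re

# Error text from the CI log in this issue; the Bazel cache path is shortened
# with "..." for readability. Nothing else is altered.
ERROR = (
    "ERROR: .../external/xla/xla/stream_executor/cuda/BUILD:450:19: "
    "no such target '@local_config_cuda//cuda:implicit_cuda_headers_dependency': "
    "target 'implicit_cuda_headers_dependency' not declared in package 'cuda' ... "
    "and referenced by '@xla//xla/stream_executor/cuda:delay_kernel_cuda_cuda'"
)


def missing_target(error):
    """Return (package, target name) of the missing label, or None if absent."""
    match = re.search(r"no such target '(?P<pkg>[^':]+):(?P<name>[^']+)'", error)
    if match is None:
        return None
    return match.group("pkg"), match.group("name")


def referenced_by(error):
    """Return the label of the rule that depends on the missing target, or None."""
    match = re.search(r"referenced by '([^']+)'", error)
    return match.group(1) if match else None


def query_command(pkg):
    """Build the query suggested by Bazel's own tip: list every target in pkg."""
    return f'bazel query "{pkg}:*"'


if __name__ == "__main__":
    pkg, name = missing_target(ERROR)
    print(f"missing target: {name} in {pkg}")
    print(f"needed by:      {referenced_by(ERROR)}")
    print(f"run:            {query_command(pkg)}")
```

If the query output does not list `implicit_cuda_headers_dependency`, the generated BUILD file for `local_config_cuda` does not match what the pinned XLA revision expects.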
Current theory:
The `cuda_configure` function is supposed to set up `local_config_cuda` with the build targets that tsl needs, but this deprecated non-hermetic version did not do that.
Current tried actions:
We tried to follow the hermetic CUDA setup described in this doc: https://github.com/openxla/xla/blob/main/docs/hermetic_cuda.md#deprecated-non-hermetic-cudacudnn-usage
However, it requires the clang compiler instead of gcc.
I am attempting to use clang, but the line that forces gcc (`xla/.bazelrc`, line 27 in 940bee4) claims that clang has issues.
With clang, the build fails with the `undeclared inclusion(s)` error for `llvm/lib/Support/Valgrind.cpp` shown in the second log excerpt above.
This is weird because `stddef.h` is a system header, and Bazel should not ask for an extra BUILD dependency declaration for it.
This post on Stack Overflow says that we should clean the Bazel cache. We did that by adding `bazel clean --expunge` right before the build, and it still doesn't work.
The latest CI run with the above change is: https://github.com/pytorch/xla/actions/runs/11115985671/job/30885415097?pr=8079
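One possible cause of the `undeclared inclusion(s)` error, offered as an assumption and not a confirmed diagnosis, is that the C++ toolchain Bazel configured does not list clang's own header directory among its built-in include directories (`cxx_builtin_include_directories` in the toolchain config). The sketch below only extracts the directories of the reported headers from the log so they can be compared against the toolchain configuration; `undeclared_header_dirs` is a hypothetical helper name.

```python
import posixpath

# Lines copied from the clang build failure quoted in this issue.
LOG = """\
this rule is missing dependency declarations for the following files included by 'llvm/lib/Support/Valgrind.cpp':
'/usr/lib/clang/11.0.1/include/stddef.h'
'/usr/lib/clang/11.0.1/include/__stddef_max_align_t.h'
"""


def undeclared_header_dirs(log):
    """Collect the unique directories of absolute header paths listed in the log."""
    dirs = set()
    for line in log.splitlines():
        # Header paths are reported one per line, wrapped in single quotes.
        path = line.strip().strip("'")
        if path.startswith("/") and path.endswith(".h"):
            dirs.add(posixpath.dirname(path))
    return sorted(dirs)


if __name__ == "__main__":
    for directory in undeclared_header_dirs(LOG):
        print(directory)
```

If the printed directory is missing from the toolchain's built-in include directories, that would be consistent with Bazel treating these system headers as undeclared inputs.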