Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM and SPIRV-LLVM-Translator pulldown (WW50 2024) #16372

Merged
merged 2,298 commits into from
Dec 18, 2024
Merged

Conversation

iclsrc
Copy link
Contributor

@iclsrc iclsrc commented Dec 15, 2024

callumfare and others added 30 commits November 25, 2024 11:34
… (#108413)

Introduce `offload-tblgen` and an initial implementation of a subset of
the new API. The tablegen files are intended to be the single source of
truth for the new API, with the header files, documentation, and others
bits of source all automatically generated.

**TODO** (based on review feedback so far):
- [x] Check in the generated headers
- [x] Add an `offload-generate` target to trigger the generation rather
than building them every time
- [x] Decide how error handling should work
  - [x] Finish up new error handling implementation 
- [x] Decide naming convention
- [x] Add testing for the new API
- [x] Add tablegen specific testing
- [x] clang-tidy and use llvm:: types when possible
- [x] Add optional code location arguments
- [x] Avoid multiple returns from one function

### offload-tblgen

See the included
[README](https://github.com/callumfare/llvm-project/blob/d80db06491d85444bb6f7e59d8068a22cef3a6b4/offload/new-api/API/README.md)
for more information on how the API definition and generation works. I'm
happy to answer any questions about it and plan to walk through it in a
future LLVM Offload call.

It should be noted that struct definitions have not been fully
implemented/tested as they aren't used by the initial API definitions,
but finishing that off in the future shouldn't be too much work.

The tablegen tooling has been designed to be easily extended with new
backends, using the classes in `RecordTypes.hpp` to abstract over the
tablegen records.

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
$ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git
$ cd unified-runtime
$ mkdir build
$ cd build
$ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \
    -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \
    -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory>
$ ninja urinfo
export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries>
$ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
[cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030  [12030]
# Demo with tracing
$ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS
---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS
...
```


### Open questions and future work
* The new API is implemented in a separate library
(`liboffload_new.so`). It could just as easily be part of the existing
`libomptarget` library - I have no strong feelings on which is better.
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
When linking programs with `eld`, we get a link error like below:

Error:
/inst/clang+llvm-19.1.0-cross-hexagon-unknown-linux-musl/x86_64-linux-gnu/bin/../target/hexagon-unknown-linux-musl//usr/lib/libc.a(scalbn.lo)(.text.scalbn+0x3c):
undefined reference to `__hexagon_muldf3'

libc has references to the clang_rt builtins library, so the order of
the libraries should be reversed.
Separates the stack allocations from prologue in preparation for the
stack clash protection support.
This patch support for intrinsics in clang, as well as assembly
instructions in the backend.

Co-authored-by: Sirish Pande <[email protected]>
The cfi_offset is incorrect for the RVV registers when cm.push
is used.
OPSEL ASM Syntax: opsel:[x,y,z]
where,
    opsel[x] = Inst{11} = src0_modifier{2}
    opsel[y] = Inst{12} = src1_modifier{2}
    opsel[z] = Inst{14} = src0_modifier{3}
Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax.

Co-authored-by: Pravin Jagtap <[email protected]>
OPSEL ASM Syntax: opsel:[x,y,z]
where,
    opsel[x] = Inst{11} = src0_modifier{2}
    opsel[y] = Inst{12} = src1_modifier{2}
    opsel[z] = Inst{14} = src0_modifier{3}
Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax.

Co-authored-by: Pravin Jagtap <[email protected]>
…17382)

OPSEL[3] selects low/high 16 bits of dest write.

Co-authored-by: Pravin Jagtap <[email protected]>
…7383)

OPSEL[0] selects srcword to read.

Co-authored-by: Pravin Jagtap <[email protected]>
… (#117384)

OPSEL ASM Syntax: opsel:[x,y,z]
where,
    opsel[z] = Inst{14} = src0_modifier{3}

Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax.

Co-authored-by: Pravin Jagtap <[email protected]>
The standard mandates that this returns the width of the type, which is
the number of bits in the value. For bool, that's required to be `1`
explicitly.

Fixes #117348
This adds the f32/f64/f16/bf16 test cases for below pattern :

`fmul x, select(y, A, B)`
with just one use of select Inst above.

It acts as pre-commit tests for dagCombining above pattern into cheaper
ldexp in case of non-inlline 32 bit-constants. (#111109)
…ffset for RVV CSRs. (#117408)

getCalleeSavedStackSize() already contains RVPushStackSize. Don't
subtract it again.
OPSEL ASM Syntax for v_cvt_scalef32_pk_f32_fp4 : opsel:[x,y,z]
where, x & y i.e. OPSEL[1 : 0] selects which src_byte to read.

OPSEL ASM Syntax for v_cvt_scalef32_pk_fp4_f32 : opsel:[a,b,c,d]
where, c & d i.e. OPSEL[3 : 2] selects which dst_byte  to write.

Co-authored-by: Pravin Jagtap <[email protected]>
This relands #86209 which was reverted because ./bin/llvm no longer
accepted test paths in the source tree instead of the build tree. This was
happening because `add_subdirectory(${LLVM_MAIN_SRC_DIR}/utils/llvm-lit`
was called before all tsst suites were registered, and therefore it was
missing the source->build dir mappings.

Original commit message:

I am currently trying to test the LLVM runtimes (including compiler-rt)
against an installed LLVM tree rather than a build tree (since that is
no longer available). Currently, the runtimes build of compiler-rt assumes
that LLVM_BINARY_DIR is writable since it uses configure_file() to write
there during the CMake configure stage. Instead, generate this file inside
CMAKE_CURRENT_BINARY_DIR, which will match LLVM_BINARY_DIR when invoked
from llvm/runtimes/CMakeLists.txt.

I also needed to make a minor change to the hwasan tests: hwasan_symbolize
was previously found in the LLVM_BINARY_DIR, but since it is generated as
part of the compiler-rt build it is now inside the CMake build directory
instead. I fixed this by passing the output directory to lit as
config.compiler_rt_bindir and using llvm_config.add_tool_substitutions().

For testing that we no longer write to the LLVM install directory as
part of testing or configuration, I created a read-only bind mount and
configured the runtimes builds as follows:
```
$ sudo mount --bind --read-only ~/llvm-install /tmp/upstream-llvm-readonly
$ cmake -DCMAKE_BUILD_TYPE=Debug \
  -DCMAKE_C_COMPILER=/tmp/upstream-llvm-readonly/bin/clang \
  -DCMAKE_CXX_COMPILER=/tmp/upstream-llvm-readonly/bin/clang++ \
  -DLLVM_INCLUDE_TESTS=TRUE -DLLVM_ENABLE_ASSERTIONS=TRUE \
  -DCOMPILER_RT_INCLUDE_TESTS=TRUE -DCOMPILER_RT_DEBUG=OFF \
  -DLLVM_ENABLE_RUNTIMES=compiler-rt \
  -DCMAKE_DISABLE_FIND_PACKAGE_LLVM=TRUE \
  -DCMAKE_DISABLE_FIND_PACKAGE_Clang=TRUE \
  -G Ninja -S ~/upstream-llvm-project/runtimes \
  -B ~/upstream-llvm-project/runtimes/cmake-build-debug-llvm-git
```

Pull Request: llvm/llvm-project#114307
Retry of pr #109090 and #111099.

Co-authored-by: Alexander Richardson <[email protected]>
The default expansion tries to create an illegal integer type after
legalization.
The Z size was optional, meaning it matched with the 128-bit SSE instructions as well.

Noticed while triaging the strange perf numbers on #110308
…117303)

In the DXIL CreateHandle and CreateHandleFromBinding ops, resource
bindings are
indexed from the beginning of the binding space, not from the binding
itself.
Translate from an index into the binding to one from the beginning of
the space
when lowering to these operations.
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
  - Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required.
  - Parameter count is now calculated based on the unique hash count.
  - Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size.
  - Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time.
  - Moved a sorting operation outside of the loop.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
The only constructor in current use is the one that takes
IndexedMemProfData.  Likewise, the only accessor in current use is
takeMemProfData.
This come up in the context of pr 108889.  We always end up sinking
the value in the phi if we dispatched via a switch, but not if we'd
dispatched via a branch.  This is purely an artifact of current
lowering.
…#117605)

We can move the logic from adjustStackForRVV into adjustReg, which
results in the remaining logic being trivially inlined to the two
callers and allows a duplicate copy of the same logic in
eliminateFrameIndex to be pruned.
…7418)

OPSEL ASM Syntax for v_cvt_scalef32_pk_{f|bf}16_fp4 : opsel:[x,y,z]
where, x & y i.e. OPSEL[1 : 0] selects which src_byte to read.

Note: Conventional Inst{13} i.e. OPSEL[2] is ignored in asm syntax.

Co-authored-by: Pravin Jagtap <[email protected]>

Co-authored-by: Pravin Jagtap <[email protected]>
This patch fixes:

  llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp:476:25: error: unused
  variable 'ST' [-Werror,-Wunused-variable]
Make it homogenous with other runtime entry points.
Copy link
Contributor

@vmaksimo vmaksimo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approve for llvm-spirv changes

@jsji
Copy link
Contributor

jsji commented Dec 17, 2024

@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a /merge. Thanks.

@sarnex
Copy link
Contributor

sarnex commented Dec 17, 2024

/merge

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:27:32 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:27:34 PM UTC --- Merge failed with error: Please check whether the PR is mergeable

@sarnex
Copy link
Contributor

sarnex commented Dec 17, 2024

@jsjj Git version issue again?

@sarnex
Copy link
Contributor

sarnex commented Dec 17, 2024

/merge

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:41:05 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:41:06 PM UTC --- Merge failed with error: Please check whether the PR is mergeable

@sarnex
Copy link
Contributor

sarnex commented Dec 17, 2024

/merge

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:47:17 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 17, 2024

Tue 17 Dec 2024 06:47:19 PM UTC --- Merge failed with error: Please check whether the PR is mergeable

@jsji
Copy link
Contributor

jsji commented Dec 17, 2024

@DoyleLi The bot is still not working, please have a look. Thanks.

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 18, 2024

Wed 18 Dec 2024 02:03:33 AM UTC --- Merge failed with error: Please check whether the PR is mergeable

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 18, 2024

Wed 18 Dec 2024 02:19:50 AM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

@bb-sycl
Copy link
Contributor

bb-sycl commented Dec 18, 2024

Wed 18 Dec 2024 02:25:13 AM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

@bb-sycl bb-sycl merged commit d21e679 into sycl Dec 18, 2024
31 of 32 checks passed
@DoyleLi
Copy link
Contributor

DoyleLi commented Dec 18, 2024

@DoyleLi The bot is still not working, please have a look. Thanks.

Hi @jsji the issue is resolved and will not happen in the future. I retriggered the job and verified the whole process.

@jsji
Copy link
Contributor

jsji commented Dec 18, 2024

@DoyleLi The bot is still not working, please have a look. Thanks.

Hi @jsji the issue is resolved and will not happen in the future. I retriggered the job and verified the whole process.

Thanks @DoyleLi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disable-lint Skip linter check step and proceed with build jobs
Projects
None yet
Development

Successfully merging this pull request may close these issues.