Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from llvm:main #5546

Open
wants to merge 665 commits into
base: main
Choose a base branch
from
Open

[pull] main from llvm:main #5546

wants to merge 665 commits into from

Conversation

pull[bot]
Copy link

@pull pull bot commented Jan 16, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Jan 16, 2025
necto and others added 29 commits January 20, 2025 10:02
#109430)

After 1595988
diag::warn_undefined_reinterpret_cast started raising on
non-instantiated template functions without sufficient knowledge whether
the reinterpret_cast is indeed UB.
This prevents creating range class instances from temporaries.
This is the next step to move the CMake cache file builder closer to the
build configuration we care about downstream.
…123571)

Don't report dead pointers if we've checking for a potential constant
expression.
Use the methods accepting LLVMContext instead.
… ARM64X (#123346)

Includes handling for ARM64X relocations relative to a symbol.
Adding SPIRV to LLVM_ALL_TARGETS
(#119653) revealed a series of
minor compilation problems and sanitizer complaints. This PR is to
address the problem.
This registers the pass with PassRegistry so we can use -start-before
and other options for machine-function-splitter.
This is to avoid race conditions with other tests.
If Polly is built with LLVM_POLLY_LINK_INTO_TOOLS=ON (the default for
monorepo builds), then Polly will become a dependency of the
LLVMExtensions component, which is part of LLVMExports. As such, all the
Polly libraries also have to be part of LLVMExports.

However, if Polly is built with LLVM_POLLY_LINK_INTO_TOOLS=OFF, we also
end up adding Polly libraries to LLVMExports. This is undesirable, as it
adds a hard dependency from llvm on polly.

Fix this by only exporting polly libraries from LLVMExports if
LLVM_POLLY_LINK_INTO_TOOLS is enabled.
…tests (NFC)

Allow arbitrary attributes, including those with arguments.
The only thing these tests care about from an ABI perspective is sret,
don't also test all the optimization attributes.
…ectorInstrCost. NFCI

This should not effect the result, unless the getArithmeticInstrCost and
getVectorInstrCost routines learn to produce different costs (with CostKind =
CodeSize for example). The -1 lanes prevent 0 lanes from (incorrectly) being
marked as free.
To reduce diffs in an upcoming change.
When `RecordType` is converted to corresponding `DIType`, we cache the
information to avoid doing the conversion again.

Our conversion of `RecordType` looks like this:

`ConvertRecordType(RecordType Ty)`
1. If type `Ty` is already in the cache, then return the corresponding
item.
2. Create a place holder `DICompositeTypeAttr` (called `ty_self` below)
for `Ty`
3. Put `Ty->ty_self` in the cache
4. Convert members of `Ty`. This may cause `ConvertRecordType` to be
called again with other types.
5. Create final `DICompositeTypeAttr`
6. Replace the `ty_self` in the cache with one created in step 5 end


The purpose of creating `ty_self` is to handle cases where a member may
have reference to parent type.

Now consider the code below:

```
type t1
  type(t2), pointer :: p1
end type
type t2
   type(t1), pointer :: p2
end type
```

While processing t1, we could have a structure like below. `t1 -> t2 ->
t1_self`

The `t2` created during handling of `t1` cant be cached on its own as it
contains a place holder reference. It will fail an assert in MLIR if it
is processed standalone. To avoid this problem, we have a check in the
step 6 above to not cache such types. But this check was not tight
enough. It just checked if a type should not have a place holder
reference to another type. It missed the following case where the place
holder reference can be in a type further down the line.

```
type t1
  type(t2), pointer :: p1
end type
type t2
  type(t3), pointer :: p2
end type
type t3
  type(t1), pointer :: p3
end type
```

So while processing `t1`, we have to stop caching of not only `t3` but
also of `t2`. This PR improves the check and moves the logic inside
`convertRecordType`.

Please note that this limitation of why a type cant have a placeholder
reference is because of how such references are resolved in the mlir.
Please see the discussion at the end of this
[PR](#106571).

I have to change `getDerivedType` so that it will also get the derived
type for things like `type(t2), pointer :: p1` which are wrapped in
`BoxType`. Happy to move it to a new function or a local helper in case
this change is problematic.

Fixes #122024.
…123588)

We need to emit the 'initializer of X is not a constant expression' note
for local constexpr variables as well.
This PR allows to lower **unsigned** `tosa.max_pool2d` to linalg.
```
// CHECK-LABEL: @max_pool_ui8
func.func @max_pool_ui8(%arg0: tensor<1x6x34x62xui8>) -> tensor<1x4x32x62xui8> {
  // CHECK: builtin.unrealized_conversion_cast {{.*}} : tensor<1x6x34x62xui8> to tensor<1x6x34x62xi8>
  // CHECK: arith.constant 0
  // CHECK: linalg.pooling_nhwc_max_unsigned {{.*}} : (tensor<1x4x32x62xi8>) -> tensor<1x4x32x62xi8>
  // CHECK: builtin.unrealized_conversion_cast {{.*}} : tensor<1x4x32x62xi8> to tensor<1x4x32x62xui8>
  %0 = tosa.max_pool2d %arg0 {pad = array<i64: 0, 0, 0, 0>, kernel = array<i64: 3, 3>, stride = array<i64: 1, 1>} : (tensor<1x6x34x62xui8>) -> tensor<1x4x32x62xui8>
  return %0 : tensor<1x4x32x62xui8>
}
```
It does this by
- converting the MaxPool2dConverter from OpRewriterPattern to
OpConversion Pattern
- adjusting the padding value to the the minimum unsigned value when the
max_pool is unsigned
- lowering to `linalg.pooling_nhwc_max_unsigned` (which uses
`arith.maxui`) when the max_pool is unsigned
)

Based on feedback from the clastb codegen PR, I'm refactoring basic codegen for the vector.extract.last.active intrinsic to lower to an ISD node in SelectionDAGBuilder then expand in LegalizeVectorOps, instead of doing everything in the builder.

The new ISD node (vector_find_last_active) only covers finding the index of the last active element of the mask, and extracting the element + handling passthru is left to existing ISD nodes.
We are not handling 'S' scalar dependencies correctly and have at least
the following miscompiles related to that:

[LoopInterchange] incorrect handling of scalar dependencies and dependence vectors starting with ">" #54176
[LoopInterchange] Interchange breaks program correctness #46867
[LoopInterchange] Loops should not interchanged due to dependencies #47259
[LoopInterchange] Loops should not interchanged due to control flow #47401

This patch does no longer insert the "S" dependency/direction into the
dependency matrix, so a dependency is never "S". We seem to have
forgotten what the exact meaning is of this dependency type, and don't
see why it should be treated differently.

We prefer correctness over incorrect and more aggressive results. I.e.,
this prevents the miscompiles at the expense of handling less cases,
i.e. making interchange more pessimistic. However, some of the cases
that are now rejected for dependence analysis reasons, were rejected
before too but for other reasons (e.g. profitability). So at least for
the llvm regression tests, the number of regression are very reasonable.
This should be a stopgap. We would like to get interchange enabled by
default and thus prefer correctness over unsafe transforms, and later
see if we can get solve the regressions.
…#120474)

This patch adds several instructions seen when trying to run a
executable built with ASan with llvm-mingw.
(x86 and x86_64, using the git tip in llvm-project).

Also includes instructions collected by
Roman Pišl and Eric Pouech in the Wine bug reports below.

```
Related: #96270

Co-authored-by: Roman Pišl <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=50993
                https://bugs.winehq.org/attachment.cgi?id=70233
Co-authored-by: Eric Pouech <[email protected]>
                https://bugs.winehq.org/show_bug.cgi?id=52386
                https://bugs.winehq.org/attachment.cgi?id=71626
```
This is a follow up to 68a3908 (func: Set default dialect to
'emitc'), but for other instructions with blocks to make it look
consistent.
In https://reviews.llvm.org/D136765 / https://reviews.llvm.org/D144155,
the asan annotations for `std::vector` were modified to unpoison freed
backing memory on destruction, instead of leaving it poisoned. However,
calling `__clear()` instead of `clear()` skips informing the asan runtime
of this decrease in the accessible container size, which breaks the
invariant that the value of `old_mid` should match the value of `new_mid`
from the previous call to `__sanitizer_annotate_contiguous_container`, which
can trip the sanity checks for the partial poison between [d1, d2) and the
container redzone between [d2, c), if enabled. To fix this, ensure that
`clear()` is called instead, as is already done by `__vdeallocate()`.
Also remove `__clear()`, since it is no longer called.
The function getPartialReductionCost is already quite large and
is likely to grow in size as we add support for more cases in
future. Therefore, I think it's best to move this into the cpp
file.
inbelic and others added 30 commits January 21, 2025 15:12
Reverts #122992

Due to an included failing test-case the commit causes build failures.
SBSaveCoreOptions has been around for awhile now, so I decided to draft
up some Docstrings describing the functionality better. Some of my
wording sounded a bit clunky due the optionality of each method call so
I would greatly appreciate feedback.

Includes the new method in #122541 so I will merge this as a follow up.
#123679)

`Generalization.cpp:53`
```cpp
FailureOr<GenericOp> mlir::linalg::generalizeNamedOp(RewriterBase &rewriter,
                                                     LinalgOp linalgOp) {
  if (failed(generalizeNamedOpPrecondition(linalgOp)))
    return rewriter.notifyMatchFailure(linalgOp, "preconditions not met");

  SmallVector<Value> inputs = linalgOp.getDpsInputs();
  ValueRange outputs = linalgOp.getDpsInits();
  SmallVector<AffineMap> indexingMaps = linalgOp.getIndexingMapsArray();
  SmallVector<utils::IteratorType> iterators = linalgOp.getIteratorTypesArray();
  SmallVector<Type> resultTypes = linalgOp.hasPureTensorSemantics()
                                      ? TypeRange(ValueRange(outputs))
                                      : TypeRange{};
...
```

`generalizeNamedOp` in `Generalization.cpp` has a different arg name
than `generalizeNamedOp` in `Transforms.h`

Sync to use `linalgOp`
This patch implements the directive #pragma clang section on COFF targets
with the exact same features available on ELF and Mach-O.
Today, emitLinkerDirectives is private to TLOFCOFF-- it isolates parsing
and processing of the linker options. Similar processing is also done by
other TLOFs inline within emitModuleMetadata. This patch promotes
emitLinkerDirectives to a virtual (public) method so that this handling
is similarly isolated in the other TLOFs.

This also enables downstream targets to override just this handling
instead of the whole of emitModuleMetadata.
#123235)

.gfids$y contains a list of indirect calls for Control Flow Guard. This
wasn't working properly for ARM64EC: direct calls were being treated as
indirect calls. Make sure we correctly filter out direct calls.

This improves the protection from Control Flow Guard, and also fixes a
link error when using certain functions from oldnames.lib.
…torToLLVMPass (#123491)

This flag enables the configuration of some transformation such as the
lowering of contractions and transposes. The default configuration
preserves the existing behavior.
…123826)

Recently I added SBProgress (#119052), and during that original commit I
tested if the progress event was sent over LLDB-DAP, and it was. However
upon the suggestion of @JDevlieghere and @labath we added an external
category (#120171), which I did not test.

This small patch wires up DAP to listen for external events by default,
and adds the external category to the SBDebugger enumeration.
…123008)

This cleanup action only needs to be performed once when the entire
optimization is converged. Doing it in every iteration has a very high
time-complexity, as it queries every dbg value in a dense map

Compare before and after for one internal source file with many basic
blocks


![image](https://github.com/user-attachments/assets/1dac76a9-a974-4068-9aa1-4041f963fa8e)

![image](https://github.com/user-attachments/assets/73ea2ef1-d1f4-4064-8826-8c13fb539b8d)

>90% reduction in this extreme case.
This patch adds support for replacing calls to
`__hipstdpar_hidden_malloc` with calls to `__libc_malloc`, similarly to
how we handle hidden `free`. A future paired change in the forwarding
header will leverage this capability in order to handle certain special
cases where it is not possible / desirable to allocate via the HIP
runtime.
- Redefines `DXILAttribute` to denote a function attribute, compatible
to how it was define in DXC/LLVM 3.7
- Fix how `DXILAttribute` is emitted to be a struct of set attributes
instead of an "or" of the enums
- Implement the lowering of `DXILAttribute` to LLVM function attributes
in `DXILOpBuilder.cpp`. A custom mapping is defined.
- Audit all current ops to specify the correct attributes consistent
with DXC. This is done here to allow for testing.
- Update testcases in `llvm/test/CodeGen/DirectX` of all ops with
attributes to match that attributes are set
- Update testcases of ops that had previously incorrectly set attributes
to check there is no attributes set
- Defines `DXILProperty` to denote the other type of attributes from DXC
used to query properties.
- Emit `DXILProperty` as a struct of set attributes.
- Updates `DXIL.td` to specify applicable `DXILProperty`s on ops

Note: `DXILProperty` was referred to as 'queryable attributes' in design
discussion. Changed to property to allow for better expression in
`DXIL.td`

Resolves #114461
Resolves #115912
This patch makes the build container job save the agent container image to a
separate tar file rather than bundling it in with the existing tar file. For
some reason, running podman save with two container images and then loading
that single tar file gets rid of the agent image and we end up with two
copies of the original image. This means that premerge jobs will fail with
the agent image because they cannot find the run.sh script.
… others. NFC

Tablegen is missing a check that should have caught that these
were duplicated. I'm working to restore that check.
It used to early return when destination is same as origin. But it's redundant
because in that case the callback won't get called in the first place.
Add register class bit width for SReg_256_XNULL and SReg_128_XNULL
Debugging the dropped variable statistics for large LLVM IR files can be
made easier if the functions called before an optimization pass is run
includes the PassID of the pass that will be run after statistics
metrics are collected. This patch adds that support.
This reverts commit 106f105.

Everything should be in working order now that the container job has been
updated to work properly. This has been tested on an individual job.
This commit adds support for linker script unary plus ('+') operator. It
is helpful for improving compatibility between LLD and GNU LD.

Closes #118047
add commute for some VOP3 inst, allow commute for both inline constant
operand, adjust tests

Fixes #111205
This patch fixes:

  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2792:14: error: comparison of
  integers of different signs: 'unsigned int' and 'int'
  [-Werror,-Wsign-compare]

  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp:2797:14: error: comparison of
  integers of different signs: 'unsigned int' and 'int'
  [-Werror,-Wsign-compare]
…#123858)

See https://cmake.org/cmake/help/latest/policy/CMP0175.html

> The `OUTPUT` form does not accept `PRE_BUILD`, `PRE_LINK`, or
`POST_BUILD` keywords.

When using CMake version 3.31+, this results in ~2000 lines of warning
spam in my downstream project:

```
CMake Warning (dev) at build-gcc/lib/cmake/mlir/AddMLIRPython.cmake:606 (add_custom_command):
  The following keywords are not supported when using
  add_custom_command(OUTPUT): PRE_BUILD.

  Policy CMP0175 is not set: add_custom_command() rejects invalid arguments.
  Run "cmake --help-policy CMP0175" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
Call Stack (most recent call first):
  build-gcc/lib/cmake/mlir/AddMLIRPython.cmake:222 (add_mlir_python_sources_target)
  build-gcc/lib/cmake/mlir/AddMLIRPython.cmake:256 (_process_target)
  compiler/bindings/python/CMakeLists.txt:239 (add_mlir_python_modules)
This warning is for project developers.  Use -Wno-dev to suppress it.
```

General docs:
https://cmake.org/cmake/help/latest/command/add_custom_command.html.
Note that `PRE_BUILD` only appears in the _second_ signature for the
function (which takes `TARGET`) not the first (which takes `OUTPUT`).
…e instantiation definition (#123871)

Close #123719

The reason is, we thought the external explicit template instantiation
declaration as the external definition incorrectly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment