[pull] main from llvm:main #116

Merged
merged 5,630 commits into from
Apr 14, 2025

Conversation

pull[bot]

@pull pull bot commented Feb 25, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull bot added the ⤵️ pull label Feb 25, 2025
jansvoboda11 and others added 29 commits April 11, 2025 09:39

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…()` (#134887)

This PR extracts the creation of `CompilerInstance` for compiling an
implicitly-discovered module out of `compileModuleImpl()` into its own
separate function and passes it into `compileModuleImpl()` from the
outside. This makes the instance creation logic reusable (useful for my
experiments) and also simplifies the API, removing the `PreBuildStep`
and `PostBuildStep` hooks from `compileModuleImpl()`.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…NFC) (#135094)

With #135093, we may just use `symtab` instead.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
… LoopDeletion (#134906)

Fixes the case where subsequent passes were unable to find and delete
the invariant loop left over by the strlen idiom conversion. Since
`loop-deletion` only operates on computable loops, we can update the loop
condition to something more easily picked up by `loop-deletion`.

As pointed out in #134736

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
`Sema::getCurFunctionDecl(AllowLambda = false)` returns a nullptr when
the lambda declaration is outside a function (for example, when
assigning a lambda to a static constexpr variable).

This triggered an assertion in
`SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall`.

Using `Sema::getCurFunctionDecl(AllowLambda = true)` returns the
declaration of the enclosing lambda.

I stumbled on this issue while refactoring some code in CK.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…own (#135314)

Relax the restriction on the device_type clause for the init and
shutdown directives. The clause may now appear multiple times.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This patch adds `VisitBinAssign` and `VisitBinComma` to the ClangIR
`ScalarExprEmitter` to enable assignments and the comma operator.

---------

Co-authored-by: Morris Hafner <[email protected]>

Verified

This commit was signed with the committer’s verified signature.
JDevlieghere Jonas Devlieghere

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
The calleeDecl var will be used in the near future, so I left it. At
least for clang, the [[maybe_unused]] attribute takes care of the
warnings related to that variable. The other warning was a simple lack
of return after errorNYI.
…lysis

Use the original scalar type when building the reduction, and the
scalar type when performing the cast, to avoid a compiler crash.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…checks.py (#134327)

A few additions:

- Lines with `{{`: These can show up if serializing non-MLIR info into
string attrs `my.attr = {{proto}, {...}}`. String-escape the opening
`{{`; since check lines are generated, this has no effect on
`{{.*}}` etc. in generated lines.
- File split line: Normally these are skipped because of their indent
level, but when using `--starts_from_scope=0` to generate checks for the
`module {...} {` line, and since MLIR opt tools emit file split lines by
default, some `CHECK: // -----` lines were emitted.
- (edit removed this, fixed by
#134364) AttrAliases: I'm not
sure if I'm missing something for the attribute parser to work
correctly, but I was getting many `#[[?]]` for all dialect attrs. Only
use the attr aliasing if there's a match.
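
To illustrate the `{{` item above: FileCheck treats `{{...}}` as a regex block, so a literal `{{` in generated output has to be escaped before it lands in a CHECK line. The following is a minimal sketch of one common way to do that, by matching each brace through a tiny regex. The helper name `escape_double_braces` is hypothetical, and the actual patch may escape the text differently.

```python
def escape_double_braces(line: str) -> str:
    """Escape literal '{{' so FileCheck does not read it as a regex opener.

    Assumption: the input is raw tool output, so it contains no intentional
    FileCheck regex blocks yet. Each '{{' becomes the regex block
    '{{\\{\\{}}', which matches two literal opening braces.
    """
    return line.replace("{{", "{{\\{\\{}}")


if __name__ == "__main__":
    # The string attribute from the commit message above.
    print("// CHECK: " + escape_double_braces("my.attr = {{proto}, {...}}"))
```

Because the escape runs on the raw output before any `{{.*}}` patterns are added, it does not interfere with regex blocks that the generator inserts later.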
Like #135202 this fixes another issue after the XCode update.
… analysis, NFC

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This was on request from other maintainers, given that I've been
de-facto acting as maintainer of the baremetal allocator stuff.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
At the moment the ftm macro for __cpp_lib_to_chars will have the
following values:

standard_ftms: {
    "c++17": "201611L",
    "c++20": "201611L",
    "c++23": "201611L",
    "c++26": "201611L",
}

implemented_ftms: {
    "c++17": None,
}

This is a problem for the check of whether the FTM is implemented, which does:
  self.implemented_ftms[ftm][std] == self.standard_ftms[ftm][std]
This will fail in C++20 since implemented_ftms[ftm] does not have the
key c++20. This adds a new helper function and removes the None entries
when an FTM is not implemented.
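
The failure mode and a possible shape for the helper can be sketched as follows. This is a simplified model, not the generator script itself: the module-level dictionaries and the names `old_check` and `is_implemented` are assumptions made for illustration.

```python
# Values copied from the commit message above.
standard_ftms = {
    "__cpp_lib_to_chars": {
        "c++17": "201611L",
        "c++20": "201611L",
        "c++23": "201611L",
        "c++26": "201611L",
    }
}

# Old representation: an unimplemented FTM stores None for its first standard.
old_implemented_ftms = {"__cpp_lib_to_chars": {"c++17": None}}

# New representation (assumed): an unimplemented FTM has no entries at all.
implemented_ftms = {"__cpp_lib_to_chars": {}}


def old_check(ftm: str, std: str) -> bool:
    # Raises KeyError for "c++20" because only "c++17" is present.
    return old_implemented_ftms[ftm][std] == standard_ftms[ftm][std]


def is_implemented(ftm: str, std: str) -> bool:
    """Hypothetical helper: a missing key means 'not implemented'."""
    value = implemented_ftms[ftm].get(std)
    return value is not None and value == standard_ftms[ftm][std]


if __name__ == "__main__":
    try:
        old_check("__cpp_lib_to_chars", "c++20")
    except KeyError as err:
        print("old check raised KeyError:", err)
    print(is_implemented("__cpp_lib_to_chars", "c++20"))
```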

---------

Co-authored-by: Louis Dionne <[email protected]>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
fixes #135285

This change implements the `usub.sat` intrinsic to perform an unsigned
saturating subtraction on its two arguments.
The minimum value this operation clamps to is 0.
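
As a reference for the semantics only (not the code added by this change), unsigned saturating subtraction can be modelled in a few lines. The function name and the `bits` parameter are illustrative assumptions.

```python
def usub_sat(a: int, b: int, bits: int = 32) -> int:
    """Model of unsigned saturating subtraction on `bits`-wide integers.

    Both operands are first reduced to the unsigned range [0, 2**bits).
    The result is a - b when that is non-negative, and 0 otherwise, so the
    operation never wraps around.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    return a - b if a > b else 0


if __name__ == "__main__":
    print(usub_sat(5, 3))  # ordinary subtraction
    print(usub_sat(3, 5))  # would underflow, so it clamps to 0
```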
Apparently we used the 'end location' instead of 'start' in a few
places.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
If the buildvector node has a cast-to-float user, it cannot be considered
safe for truncation; the original bitwidth must be used here.

Fixes #135410

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…:Clear (#134397)" (#135296)

This reapplies commit
232525f.

The original commit triggered a sanitizer failure when `Target` was
destroyed. In `Target::Destroy`, `DeleteCurrentProcess` was called, but
it did not destroy the thread creation breakpoints for the underlying
`ProcessGDBRemote` because `ProcessGDBRemote::Clear` was not called in
that path.

`Target` then proceeded to destroy its breakpoints, which resulted in a
call to the destructor of a `std::vector` containing the breakpoints.
Through a sequence of complicated events, destroying breakpoints caused
the reference count of the underlying `ProcessGDBRemote` to finally
reach zero. This, in turn, called `ProcessGDBRemote::Clear`, which
attempted to destroy the breakpoints. To do that, it would go back into
the Target's vector of breakpoints, which we are in the middle of
destroying.

We solve this by moving the breakpoint deletion into
`Process::DoDestroy`, which is a virtual Process method that will be
called much earlier.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…mbol context (#134757)" (#135408)

This reverts commit e84a804 because on
Linux there seems to be a race around GetRunLock. See #134757 for more
context.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This change refines the verifier for `vector.load` and `vector.store` to
disallow the use of vectors with higher rank than the source or
destination memref. For example, the following is now rejected:

```mlir
  %0 = vector.load %src[%c0] : memref<?xi8>, vector<16x16xi8>
  vector.store %vec, %dest[%c0] : memref<?xi8>, vector<16x16xi8>
```

This pattern was previously used in SME end-to-end tests and "happened"
to work by implicitly assuming row-major memory layout. However, there
is no guarantee that such an assumption will always hold, and we should
avoid relying on it unless it can be enforced deterministically.

Notably, production ArmSME lowering pipelines do not rely on this
behavior. Instead, the expected usage (illustrated here with scalable
vector syntax) would be:

```mlir
  %0 = vector.load %src[%c0, %c0] : memref<?x?xi8>, vector<[16]x[16]xi8>
```

This PR updates the verifier accordingly and adjusts all affected tests.
These tests are either removed (if no longer relevant) or updated to use
memrefs with appropriately matching rank.
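
The rule being enforced can be summarized with a small model. This is only a sketch of the rank comparison described above, with a hypothetical function name; the real verifier is implemented in the MLIR vector dialect and checks more than this.

```python
def vector_rank_is_valid(memref_rank: int, vector_rank: int) -> bool:
    """Return True if a vector of `vector_rank` may be loaded from, or
    stored to, a memref of `memref_rank` under the refined rule: the vector
    rank must not exceed the memref rank."""
    return vector_rank <= memref_rank


if __name__ == "__main__":
    # memref<?xi8> with vector<16x16xi8>: rank 2 vector, rank 1 memref.
    print(vector_rank_is_valid(memref_rank=1, vector_rank=2))
    # memref<?x?xi8> with vector<[16]x[16]xi8>: ranks match.
    print(vector_rank_is_valid(memref_rank=2, vector_rank=2))
```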
This is a follow on to #130946 to use the same codesize cost override in
getScalarizationOverhead for vector instructions.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fixes #118879

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
… verifier (#134910)

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…ut (#135417)

My recent change to speed up formatted integer input has a bug on
big-endian targets that has shown up on ppc64 AIX build bots. Fix.

Verified

This commit was signed with the committer’s verified signature.
fhahn Florian Hahn
llvmgnsyncbot and others added 29 commits April 14, 2025 15:31

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…34476)

In HIP, the Clang driver already sets `force-import-all` when ThinLTO is
enabled. As a result, all imported functions get the
`available_externally` linkage. However, these functions are later
removed by the `EliminateAvailableExternallyPass`, effectively undoing
the forced import and eventually leading to link errors.

The `EliminateAvailableExternallyPass` provides an option to convert
`available_externally` functions into local functions, renaming them to
avoid conflicts. This behavior is exactly what we need for HIP. This PR
enables that option (`avail-extern-to-local`) alongside
`force-import-all` when ThinLTO is used.

With this change, ThinLTO almost works correctly on AMDGPU. The only
remaining issue is an undefined reference to `__assert_fail`, but that
falls outside the scope of this PR.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
The current module file documentation antedates the current
implementation of module files and contains many aspirational and
conditional statements, all of which can now be resolved with
descriptions of how things actually work.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…5406)

Recent work to better handle macro replacement in literal constant kind
suffixes isn't handling fixed form well, leading to a crash in Fujitsu
test 0113/0113_0073.F. The look-ahead needs to be done with the
higher-level prescanner functions that skip over fixed form comment
fields after column 72. Rework.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This patch extends the canonicalization printing policy to cover
expressions and template names, and wires that up to the template
argument printer, covering expressions, and to the expression within a
dependent decltype.

This is helpful for debugging, or if these expressions somehow end up
in diagnostics, as without this patch they can print as completely
unrelated expressions, which can be quite confusing.

This is because expressions are not uniqued, unlike types, and
when a template specialization containing an expression is the first to
be canonicalized, the expression ends up appearing in the canonical type
of subsequent equivalent specializations.

Fixes #92292

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…135416)

The logic for fixed form compiler directive line continuation has a hole
that can apply continuation for !$ even if the next line does not begin
with a fixed form comment character. Rearrange the nested if statements
to enforce that requirement for all compiler directives.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…5426)

Nearly all other compilers, though not every one, have a blanket
prohibition against the use of an INTENT(OUT) dummy argument in a
specification expression.
Some compilers, however, permit an INTENT(OUT) dummy argument to appear
in a specification expression in a BLOCK construct or inner procedure
via host association.

The argument some have put forth to accept this usage comes from a
reading of 10.1.11 (specification expressions) in Fortran 2023 that, if
followed consistently, would also require host-associated OPTIONAL dummy
arguments to be allowed. That would be dangerous for reasons that should
be obvious.

However, I can agree that a non-OPTIONAL dummy argument can't be assumed
to remain undefined on entry to a BLOCK construct or inner procedure, so
we can accept host-associated INTENT(OUT) in specification expressions
with a portability warning.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
- Change various Inst/Asm Printer functions to use a StringRef for the
Modifier parameter (instead of a const char *).
- This simplifies various string comparisons used within these
functions.
- Remove these params for print functions that do not use them.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…ured comments (#132744)

Fixes issue #132739.

CaptureAnalysis only considers captures through the def-use chain of the
provided pointer, explicitly excluding captures of underlying values or
implicit captures like those involving external globals.

The previous comment for `PointerMayBeCaptured` did not clearly state
this limitation, leading to its incorrect usage in files such as
ThreadSanitizer.cpp and SanitizerMetadata.cpp.

This PR addresses this by refining the comments for the relevant APIs
within `PointerMayBeCaptured` to explicitly document this behavior.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
OpenACC declare statements are restricted from having clauses
that reference assumed-size arrays. It should be the case that we can
implement `deviceptr` and `present` clauses for assumed-size arrays.
This is a first step towards relaxing this restriction.

Note that running flang on the following example results in an error in
lowering.
```
$ cat t.f90
subroutine vadd (a, b, c, n)
   real(8) :: a(*), b(*), c(*)
!$acc declare deviceptr(a, b, c)
!$acc parallel loop
   do i = 1,n
      c(i) = a(i) + b(i)
   enddo
end subroutine

$ flang -fopenacc -c t.f90
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":3:7): expect declare attribute on variable in declare operation
error: Lowering to LLVM IR failed
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): unsupported OpenACC operation: acc.private.recipe
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): LLVM Translation failed for operation: acc.private.recipe
error: failed to create the LLVM module
```

I would like to share this code, because others are currently working
on the implementation of `deviceptr`, but it is obviously not running
end-to-end. I think the cleanest approach to this would be to put this
exception to the rule behind some feature flag, but I am not certain
what the precedent for that is.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Create useful helper functions for the UEFI 64-bit target that can be
used in tablegen files in future changes.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
- Remove calls to pass initialization from pass constructors.
- #111767

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…DEPEND (#133892)

The OpenMP runtime needs the base address of the array section to
identify the dependency.

If we just put the vector subscript through the usual HLFIR expression
lowering, that would generate a new contiguous array representing the
values of the elements in the array which was sectioned. We cannot use
addresses from this array because these addresses would not match
dependencies on the original array. For example

```
integer :: array(1024)
integer :: indices(2)

indices(1) = 1
indices(2) = 100

!$omp task depend(out: array(1:512))
!$omp end task

!$omp task depend(in: array(indices))
!$omp end task
```

This requires taking the lowering path previously only used for ordered
assignments to get the address of the elements in the original array
which were indexed. This is done using `hlfir.elemental_addr`. e.g.
```
array(indices) = 2
```

`hlfir.elemental_addr` is awkward to use because it (by design) doesn't
return something like `!hlfir.expr<>` (like `hlfir.elemental`) and so it
can't have a generic lowering: each place it is used has to carefully
inline the contents of the operation and extract the needed address.

For this reason, `hlfir.elemental_addr` is not allowed outside of these
ordered assignments. In this commit I ignore this restriction so that I
can use `hlfir.elemental_addr` to lower the OpenMP DEPEND clause (this
works because the operation is inlined and removed before the verifier
runs).

One alternative solution would have been to provide my own more limited
re-implementation of `HlfirDesignatorBuilder` which skipped
`hlfir::elemental_addr`, instead inlining its body directly at the
current insertion point applying indices only for the first element.
This would have been difficult to maintain because designation in
Fortran is complex.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* Minor variable name alignment

Signed-off-by: Jerry Ge <[email protected]>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* simple example variable name alignment

Signed-off-by: Jerry Ge <[email protected]>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* Minor example variable name alignment

Signed-off-by: Jerry Ge <[email protected]>

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This makes __libcpp_verbose_abort unconditionally noexcept. This was
planned for the upcoming release.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
See #135401 for the full flakiness report.

It fails on stage2/asan_ubsan check with:
```
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation
JIT session error: In graph incr_module_23-jitted-objectbuffer, section .text.startup: relocation target "_ZN1AD2Ev" at address 0x79618c48d040 is out of range of Delta32 fixup at 0x75618b40d02d (<anonymous block> @ 0x75618b40d010 + 0x1d)
error: Failed to materialize symbols: { (main, { $.incr_module_23.__inits.0, __orc_init_func.incr_module_23, a2 }) }
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_23 }) }
```

The error message ("out of range of Delta32") appears similar to #102858, another Interpreter test that is flaky with ASan.

Recent test history on the x86_64-linux-fast bot:

- https://lab.llvm.org/buildbot/#/builders/169/builds/10339: fail
- 10340: buildbot logistical problem
- https://lab.llvm.org/buildbot/#/builders/169/builds/10341: fail
- https://lab.llvm.org/buildbot/#/builders/169/builds/10342: fail
- 10343: pass
- 10344: pass
- https://lab.llvm.org/buildbot/#/builders/169/builds/10345: fail
- 10346: pass
...

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This enum was not fully specified.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
This is the current version. Pin it to avoid unplanned automatic
updates in the future.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Previously `vst` and `vl` were not considered "simple" BDX stores and
loads, leading to, among other things, some opportunities for `mvc`
optimization to be missed.

This PR addresses this and updates some tests to account for additional
`mvc` instructions being emitted.

This is observed to have a neutral or slightly beneficial effect
performance-wise.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Functions with `musttail` calls can't be roots because we can't instrument their `ret` to release the context. This patch tags the `CtxRoot` field in their `FunctionData`. In compiler-rt we then know not to allow such functions to become roots, and also not to confuse `CtxRoot == 0x1` with there being a context root.

Currently we also lose the context tree under such cases. We can, in a subsequent patch, have the root detector search past these functions.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
AsmPrinter may switch the current section when e.g., emitting a jump
table for a switch. `.stack_sizes` should still be linked to the
function section. If the section is wrong, readelf emits a warning
"relocation symbol is not in the expected section".

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Fix regression to MLIR dylib support introduced in #135240. Without the
fix, the build with no static libraries fails:

```
FAILED: bin/fir-opt 
: && /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs    -Wl,-rpath-link,/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib  -Wl,--gc-sections  -Wl,--dependency-file=tools/fir-opt/CMakeFiles/fir-opt.dir/link.d tools/fir-opt/CMakeFiles/fir-opt.dir/fir-opt.cpp.o -o bin/fir-opt -L/usr/lib/llvm/21/lib64 -Wl,-rpath,"\$ORIGIN/../lib64:/usr/lib/llvm/21/lib64:"  lib/libCUFAttrs.a  lib/libCUFDialect.a  lib/libFIRDialect.a  lib/libFIRSupport.a  lib/libFIRTransforms.a  lib/libFIRCodeGen.a  lib/libFIRCodeGenDialect.a  lib/libHLFIRDialect.a  lib/libHLFIRTransforms.a  lib/libFIROpenACCSupport.a  lib/libFIROpenMPSupport.a  lib/libFlangOpenMPTransforms.a  lib/libFIRAnalysis.a  lib/libFIRTransforms.a  lib/libFIRCodeGen.a  lib/libFIROpenACCSupport.a  lib/libFIRCodeGenDialect.a  -lMLIRIR  lib/libFIROpenMPSupport.a  lib/libFIRAnalysis.a  lib/libFIRBuilder.a  lib/libCUFDialect.a  lib/libFIRSupport.a  lib/libHLFIRDialect.a  lib/libFIRDialect.a  lib/libCUFAttrs.a  lib/libFIRDialectSupport.a  lib/libFortranEvaluate.a  lib/libFortranDecimal.a  lib/libFortranParser.a  lib/libFortranSupport.a  /usr/lib/llvm/21/lib64/libMLIR.so.21.0gitfa4ac19f  /usr/lib/llvm/21/lib64/libclang-cpp.so.21.0gitfa4ac19f  /usr/lib/llvm/21/lib64/libLLVM.so.21.0gitfa4ac19f  -lquadmath && :
/usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lMLIRIR: No such file or directory
collect2: error: ld returned 1 exit status
```

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
To handle relative vftables, which are enabled with the clang option
`-fexperimental-relative-c++-abi-vtables`, we look for PC-relative
relocations whose fixup locations fall in vtable address ranges.
For such relocations, the actual target is just the virtual function
itself, and the addend records the distance between the vtable slot for
the target virtual function and the first virtual function slot in the
vtable, which matches the generated code that calls the virtual
function. So we can skip the logic for handling "function + offset" and
directly save such relocations for future fixup after the new layout is
known.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
…e issue 135612. (#135629)

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower
`dx.fdot`, and they map exactly to the DXIL ops of the same name. Using
vectors for their arguments adds unnecessary complexity and causes us to
have vector operations that are not trivial to lower post-scalarizer.

Similarly, the `dx.dot2add` intrinsic is overly generic for something
that only needs to lower to a single `dot2AddHalf` DXIL op. Update its
signature to match the operation it lowers to.

Fixes #134569.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Hi!

As the title says, this PR adds support for more than two arguments in
the `shape.broadcast` folder by sequentially calling `getBroadcastedShape`.
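
The idea of folding an N-ary broadcast by repeated pairwise broadcasting can be sketched as below. This is a simplified model that handles static shapes only, and `broadcast_two` and `broadcast_many` are hypothetical Python stand-ins, not the MLIR `getBroadcastedShape` API.

```python
from typing import List, Optional, Sequence


def broadcast_two(lhs: Sequence[int], rhs: Sequence[int]) -> Optional[List[int]]:
    """Broadcast two static shapes, or return None if they are incompatible."""
    result = []
    # Walk the dimensions from the innermost outwards, padding with 1.
    for i in range(1, max(len(lhs), len(rhs)) + 1):
        l = lhs[-i] if i <= len(lhs) else 1
        r = rhs[-i] if i <= len(rhs) else 1
        if l == r or r == 1:
            result.append(l)
        elif l == 1:
            result.append(r)
        else:
            return None  # mismatched extents, neither of which is 1
    return result[::-1]


def broadcast_many(shapes: Sequence[Sequence[int]]) -> Optional[List[int]]:
    """Fold any number of shapes by applying the pairwise rule in sequence."""
    if not shapes:
        return None
    acc: Optional[List[int]] = list(shapes[0])
    for shape in shapes[1:]:
        acc = broadcast_two(acc, shape)
        if acc is None:
            return None
    return acc


if __name__ == "__main__":
    print(broadcast_many([[2, 1], [1, 3], [4, 1, 1]]))
```

Since static-shape broadcasting is associative, accumulating left to right gives the same result as any other grouping, which is why a sequential fold is sufficient here.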
@pull pull bot merged commit df84aa8 into optimizecompile:main Apr 14, 2025
2 checks passed