forked from llvm/llvm-project
[pull] main from llvm:main #116
Merged
…()` (#134887) This PR extracts the creation of `CompilerInstance` for compiling an implicitly-discovered module out of `compileModuleImpl()` into its own separate function and passes it into `compileModuleImpl()` from the outside. This makes the instance creation logic reusable (useful for my experiments) and also simplifies the API, removing the `PreBuildStep` and `PostBuildStep` hooks from `compileModuleImpl()`.
… LoopDeletion (#134906) Fixes the case where subsequent passes were unable to find and delete the invariant loop left over by the strlen idiom conversion. Since `loop-deletion` only operates on computable loops, we can update the loop condition to something more easily picked up by `loop-deletion`, as pointed out in #134736.
`Sema::getCurFunctionDecl(AllowLambda = false)` returns nullptr when the lambda declaration is outside a function (for example, when assigning a lambda to a static constexpr variable). This triggered an assertion in `SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall`. Using `Sema::getCurFunctionDecl(AllowLambda = true)` returns the declaration of the enclosing lambda. Stumbled upon this issue when refactoring some code in CK.
…own (#135314) Relax the restriction on the device_type clause for init and shutdown directives. The clause is now allowed to appear multiple times.
This patch adds `VisitBinAssign` and `VisitBinComma` to the ClangIR `ScalarExprEmitter` to enable assignments and the comma operator. --------- Co-authored-by: Morris Hafner <[email protected]>
…s, NFC.
This reverts commit 2fd860c as this is causing an EXC_BAD_ACCESS on Darwin: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/23807/ https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/11255/
The calleeDecl var will be used in the near future, so I left it. At least for clang, the [[maybe_unused]] attribute takes care of the warnings related to that variable. The other warning was a simple lack of return after errorNYI.
…lysis Use the original scalar type when building the reduction, and the scalar type when performing casting, to avoid a compiler crash.
…checks.py (#134327) A few additions:
- Lines with `{{`: These can show up if serializing non-MLIR info into string attrs `my.attr = {{proto}, {...}}`. String escape the opening `{{`; given that check lines are generated, this has no effect on `{{.*}}` etc. in generated lines.
- File split line: Normally these are skipped because of their indent level, but if using `--starts_from_scope=0` to generate checks for the `module {...} {` line, and since MLIR opt tools emit file split lines by default, some `CHECK: // -----` lines were emitted.
- (edit: removed this, fixed by #134364) AttrAliases: I'm not sure if I'm missing something for the attribute parser to work correctly, but I was getting many `#[[?]]` for all dialect attrs. Only use the attr aliasing if there's a match.
Like #135202, this fixes another issue after the Xcode update.
… analysis, NFC
This was on request from other maintainers, given that I've been de-facto acting as maintainer of the baremetal allocator stuff.
At the moment the FTM macro for __cpp_lib_to_chars will have the following values:

    standard_ftms: {
        "c++17": "201611L",
        "c++20": "201611L",
        "c++23": "201611L",
        "c++26": "201611L",
    }

    implemented_ftms: {
        "c++17": None,
    }

This is an issue for the test of whether the FTM is implemented, which does:

    self.implemented_ftms[ftm][std] == self.standard_ftms[ftm][std]

This will fail in C++20 since implemented_ftms[ftm] does not have the key c++20. This adds a new helper function and removes the None entries when an FTM is not implemented.
---------
Co-authored-by: Louis Dionne <[email protected]>
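The lookup problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the module-level dictionaries and the helper name `is_implemented` are hypothetical and are not taken from the actual libc++ generator script. The sketch assumes that, after the change, an unimplemented FTM simply has no entry for a given standard.

```python
# Hypothetical data mirroring the commit message; not the real generator's tables.
standard_ftms = {
    "__cpp_lib_to_chars": {
        "c++17": "201611L",
        "c++20": "201611L",
        "c++23": "201611L",
        "c++26": "201611L",
    },
}

# Assumption: unimplemented FTMs have no per-standard entries (no None values).
implemented_ftms = {
    "__cpp_lib_to_chars": {},
}


def is_implemented(ftm: str, std: str) -> bool:
    """Return True only if the implemented value equals the standard value.

    A missing key is treated as "not implemented" instead of raising KeyError,
    which is what a direct implemented_ftms[ftm][std] lookup would do.
    """
    implemented = implemented_ftms.get(ftm, {})
    if std not in implemented:
        return False
    return implemented[std] == standard_ftms[ftm][std]
```

With this shape, querying `"c++20"` returns `False` rather than failing on a missing key.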
Fixes #135285. This change implements the `usub.sat` intrinsic to perform an unsigned saturating subtraction on its two arguments. The minimum value this operation is clamped to is 0.
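For readers unfamiliar with saturating arithmetic, here is a minimal Python model of unsigned saturating subtraction. The function name `usub_sat` and the `bits` parameter are illustrative assumptions; this models the arithmetic only and says nothing about how the intrinsic is lowered.

```python
def usub_sat(a: int, b: int, bits: int = 32) -> int:
    """Unsigned saturating subtraction on `bits`-wide operands.

    Both inputs are first reduced to the unsigned range [0, 2**bits - 1].
    The result is a - b when that is non-negative and 0 otherwise, so the
    operation never wraps around.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    return a - b if a > b else 0
```

For instance, `usub_sat(3, 5)` yields 0, whereas a wrapping 32-bit subtraction would yield 4294967294.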
Apparently we used the 'end location' instead of 'start' in a few places.
If the buildvector node has a cast-to-float user, it cannot be considered safe for truncation; the original bitwidth must be used here. Fixes #135410
…:Clear (#134397)" (#135296) This reapplies commit 232525f. The original commit triggered a sanitizer failure when `Target` was destroyed. In `Target::Destroy`, `DeleteCurrentProcess` was called, but it did not destroy the thread creation breakpoints for the underlying `ProcessGDBRemote` because `ProcessGDBRemote::Clear` was not called in that path. `Target` then proceeded to destroy its breakpoints, which resulted in a call to the destructor of a `std::vector` containing the breakpoints. Through a sequence of complicated events, destroying breakpoints caused the reference count of the underlying `ProcessGDBRemote` to finally reach zero. This, in turn, called `ProcessGDBRemote::Clear`, which attempted to destroy the breakpoints. To do that, it would go back into the Target's vector of breakpoints, which we are in the middle of destroying. We solve this by moving the breakpoint deletion into `Process::DoDestroy`, which is a virtual Process method that will be called much earlier.
This change refines the verifier for `vector.load` and `vector.store` to disallow the use of vectors with higher rank than the source or destination memref. For example, the following is now rejected:
```mlir
%0 = vector.load %src[%c0] : memref<?xi8>, vector<16x16xi8>
vector.store %vec, %dest[%c0] : memref<?xi8>, vector<16x16xi8>
```
This pattern was previously used in SME end-to-end tests and "happened" to work by implicitly assuming row-major memory layout. However, there is no guarantee that such an assumption will always hold, and we should avoid relying on it unless it can be enforced deterministically. Notably, production ArmSME lowering pipelines do not rely on this behavior. Instead, the expected usage (illustrated here with scalable vector syntax) would be:
```mlir
%0 = vector.load %src[%c0, %c0] : memref<?x?xi8>, vector<[16]x[16]xi8>
```
This PR updates the verifier accordingly and adjusts all affected tests. These tests are either removed (if no longer relevant) or updated to use memrefs with appropriately matching rank.
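The rule the verifier now enforces can be stated as a one-line rank comparison. The Python below is a sketch of that rule only, not the MLIR verifier itself; the function name and the use of plain shape tuples (with `None` standing in for a dynamic `?` dimension) are assumptions made for illustration.

```python
from typing import Optional, Sequence

# A dimension is an int, or None for a dynamic ("?") extent.
Shape = Sequence[Optional[int]]


def vector_rank_allowed(memref_shape: Shape, vector_shape: Shape) -> bool:
    """Accept a load/store only if the vector rank does not exceed the memref rank."""
    return len(vector_shape) <= len(memref_shape)
```

Under this model, a `vector<16x16xi8>` access on `memref<?xi8>` is `vector_rank_allowed((None,), (16, 16))`, which is rejected, while the rank-2 memref form is accepted.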
This is a follow on to #130946 to use the same codesize cost override in getScalarizationOverhead for vector instructions.
Fixes #118879
… verifier (#134910)
…ut (#135417) My recent change to speed up formatted integer input has a bug on big-endian targets that has shown up on ppc64 AIX build bots. Fix.
…34476) In HIP, the Clang driver already sets `force-import-all` when ThinLTO is enabled. As a result, all imported functions get the `available_externally` linkage. However, these functions are later removed by the `EliminateAvailableExternallyPass`, effectively undoing the forced import and eventually leading to link errors. The `EliminateAvailableExternallyPass` provides an option to convert `available_externally` functions into local functions, renaming them to avoid conflicts. This behavior is exactly what we need for HIP. This PR enables that option (`avail-extern-to-local`) alongside `force-import-all` when ThinLTO is used. With this change, ThinLTO almost works correctly on AMDGPU. The only remaining issue is an undefined reference to `__assert_fail`, but that falls outside the scope of this PR.
The current module file documentation antedates the current implementation of module files and contains many aspirational and conditional statements, all of which can now be resolved with descriptions of how things actually work.
…5406) Recent work to better handle macro replacement in literal constant kind suffixes isn't handling fixed form well, leading to a crash in Fujitsu test 0113/0113_0073.F. The look-ahead needs to be done with the higher-level prescanner functions that skip over fixed form comment fields after column 72. Rework.
This patch extends the canonicalization printing policy to cover expressions and template names, and wires that up to the template argument printer, covering expressions, and to the expression within a dependent decltype. This is helpful for debugging, or if these expressions somehow end up in diagnostics, as without this patch they can print as completely unrelated expressions, which can be quite confusing. This is because expressions are not uniqued, unlike types, and when a template specialization containing an expression is the first to be canonicalized, the expression ends up appearing in the canonical type of subsequent equivalent specializations. Fixes #92292
…135416) The logic for fixed form compiler directive line continuation has a hole that can apply continuation for !$ even if the next line does not begin with a fixed form comment character. Rearrange the nested if statements to enforce that requirement for all compiler directives.
…5426) Nearly all, but not all, other compilers have a blanket prohibition against the use of an INTENT(OUT) dummy argument in a specification expression. Some compilers, however, permit an INTENT(OUT) dummy argument to appear in a specification expression in a BLOCK construct or inner procedure via host association. The argument some have put forth to accept this usage comes from a reading of 10.1.11 (specification expressions) in Fortran 2023 that, if followed consistently, would also require host-associated OPTIONAL dummy arguments to be allowed. That would be dangerous for reasons that should be obvious. However, I can agree that a non-OPTIONAL dummy argument can't be assumed to remain undefined on entry to a BLOCK construct or inner procedure, so we can accept host-associated INTENT(OUT) in specification expressions with a portability warning.
- Change various Inst/Asm Printer functions to use a StringRef for the Modifier parameter (instead of a const char *).
- This simplifies various string comparisons used within these functions.
- Remove these params for print functions that do not use them.
…ured comments (#132744) Fixes issue #132739. CaptureAnalysis only considers captures through the def-use chain of the provided pointer, explicitly excluding captures of underlying values or implicit captures like those involving external globals. The previous comment for `PointerMayBeCaptured` did not clearly state this limitation, leading to its incorrect usage in files such as ThreadSanitizer.cpp and SanitizerMetadata.cpp. This PR addresses this by refining the comments for the relevant APIs within `PointerMayBeCaptured` to explicitly document this behavior.
OpenACC declare statements are restricted from having clauses that reference assumed-size arrays. It should be the case that we can implement `deviceptr` and `present` clauses for assumed-size arrays. This is a first step towards relaxing this restriction. Note that running flang on the following example results in an error in lowering.
```
$ cat t.f90
subroutine vadd (a, b, c, n)
real(8) :: a(*), b(*), c(*)
!$acc declare deviceptr(a, b, c)
!$acc parallel loop
do i = 1,n
c(i) = a(i) + b(i)
enddo
end subroutine
$ flang -fopenacc -c t.f90
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":3:7): expect declare attribute on variable in declare operation
error: Lowering to LLVM IR failed
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): unsupported OpenACC operation: acc.private.recipe
error: loc("/home/akuhlenschmi/work/p4/ta/tests/openacc/src/t.f90":4:7): LLVM Translation failed for operation: acc.private.recipe
error: failed to create the LLVM module
```
I would like to share this code, because others are currently working on the implementation of `deviceptr`, but it is obviously not running end-to-end. I think the cleanest approach to this would be to put this exception to the rule behind some feature flag, but I am not certain what the precedent for that is.
Create useful helper functions for UEFI 64 bit target that can be used in tablegen files in future changes.
- Remove calls to pass initialization from pass constructors. - #111767
…DEPEND (#133892) The OpenMP runtime needs the base address of the array section to identify the dependency. If we just put the vector subscript through the usual HLFIR expression lowering, that would generate a new contiguous array representing the values of the elements in the array which was sectioned. We cannot use addresses from this array because these addresses would not match dependencies on the original array. For example:
```
integer :: array(1024)
integer :: indices(2)
indices(1) = 1
indices(2) = 100
!$omp task depend(out: array(1:512))
!$omp end task
!$omp task depend(in: array(indices))
!$omp end task
```
This requires taking the lowering path previously only used for ordered assignments to get the address of the elements in the original array which were indexed. This is done using `hlfir.elemental_addr`. e.g.
```
array(indices) = 2
```
`hlfir.elemental_addr` is awkward to use because it (by design) doesn't return something like `!hlfir.expr<>` (like `hlfir.elemental`) and so it can't have a generic lowering: each place it is used has to carefully inline the contents of the operation and extract the needed address. For this reason, `hlfir.elemental_addr` is not allowed outside of these ordered assignments. In this commit I ignore this restriction so that I can use `hlfir.elemental_addr` to lower the OpenMP DEPEND clause (this works because the operation is inlined and removed before the verifier runs). One alternative solution would have been to provide my own more limited re-implementation of `HlfirDesignatorBuilder` which skipped `hlfir::elemental_addr`, instead inlining its body directly at the current insertion point applying indices only for the first element. This would have been difficult to maintain because designation in Fortran is complex.
* Minor variable name alignment Signed-off-by: Jerry Ge <[email protected]>
* simple example variable name alignment Signed-off-by: Jerry Ge <[email protected]>
* Minor example variable name alignment Signed-off-by: Jerry Ge <[email protected]>
This makes __libcpp_verbose_abort unconditionally noexcept. This was planned for the upcoming release.
See #135401 for the full flakiness report. It fails on stage2/asan_ubsan check with:
```
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation
JIT session error: In graph incr_module_23-jitted-objectbuffer, section .text.startup: relocation target "_ZN1AD2Ev" at address 0x79618c48d040 is out of range of Delta32 fixup at 0x75618b40d02d (<anonymous block> @ 0x75618b40d010 + 0x1d)
error: Failed to materialize symbols: { (main, { $.incr_module_23.__inits.0, __orc_init_func.incr_module_23, a2 }) }
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_23 }) }
```
The error message ("out of range of Delta32") appears similar to #102858, another Interpreter test that is flaky with ASan.

Recent test history on the x86_64-linux-fast bot:
- https://lab.llvm.org/buildbot/#/builders/169/builds/10339: fail
- 10340: buildbot logistical problem
- https://lab.llvm.org/buildbot/#/builders/169/builds/10341: fail
- https://lab.llvm.org/buildbot/#/builders/169/builds/10342: fail
- 10343: pass
- 10344: pass
- https://lab.llvm.org/buildbot/#/builders/169/builds/10345: fail
- 10346: pass
...
This enum was not fully specified.
This is the current version; pin it to avoid unplanned automatic updates in the future.
Previously `vst` and `vl` were not considered "simple" BDX stores and loads, leading to, among other things, some opportunities for `mvc` optimization to be missed. This PR addresses this and updates some tests to account for additional `mvc` instructions being emitted. This is observed to have a neutral or slightly beneficial effect performance-wise.
Functions with `musttail` calls can't be roots because we can't instrument their `ret` to release the context. This patch tags their `CtxRoot` field in their `FunctionData`. In compiler-rt we then know not to allow such functions to become roots, and also not to confuse `CtxRoot == 0x1` with there being a context root. Currently we also lose the context tree under such cases. We can, in a subsequent patch, have the root detector search past these functions.
AsmPrinter may switch the current section when e.g., emitting a jump table for a switch. `.stack_sizes` should still be linked to the function section. If the section is wrong, readelf emits a warning "relocation symbol is not in the expected section".
Fix regression to MLIR dylib support introduced in #135240. Without the fix, the build with no static libraries fails: ``` FAILED: bin/fir-opt : && /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -O2 -pipe -march=native -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wno-deprecated-copy -Wno-ctad-maybe-unsupported -fno-strict-aliasing -fno-semantic-interposition -Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs -Wl,-rpath-link,/var/tmp/portage/llvm-core/flang-21.0.0.9999/work/flang_build/lib -Wl,--gc-sections -Wl,--dependency-file=tools/fir-opt/CMakeFiles/fir-opt.dir/link.d tools/fir-opt/CMakeFiles/fir-opt.dir/fir-opt.cpp.o -o bin/fir-opt -L/usr/lib/llvm/21/lib64 -Wl,-rpath,"\$ORIGIN/../lib64:/usr/lib/llvm/21/lib64:" lib/libCUFAttrs.a lib/libCUFDialect.a lib/libFIRDialect.a lib/libFIRSupport.a lib/libFIRTransforms.a lib/libFIRCodeGen.a lib/libFIRCodeGenDialect.a lib/libHLFIRDialect.a lib/libHLFIRTransforms.a lib/libFIROpenACCSupport.a lib/libFIROpenMPSupport.a lib/libFlangOpenMPTransforms.a lib/libFIRAnalysis.a lib/libFIRTransforms.a lib/libFIRCodeGen.a lib/libFIROpenACCSupport.a lib/libFIRCodeGenDialect.a -lMLIRIR lib/libFIROpenMPSupport.a lib/libFIRAnalysis.a lib/libFIRBuilder.a lib/libCUFDialect.a lib/libFIRSupport.a lib/libHLFIRDialect.a lib/libFIRDialect.a lib/libCUFAttrs.a lib/libFIRDialectSupport.a lib/libFortranEvaluate.a lib/libFortranDecimal.a lib/libFortranParser.a lib/libFortranSupport.a /usr/lib/llvm/21/lib64/libMLIR.so.21.0gitfa4ac19f /usr/lib/llvm/21/lib64/libclang-cpp.so.21.0gitfa4ac19f 
/usr/lib/llvm/21/lib64/libLLVM.so.21.0gitfa4ac19f -lquadmath && : /usr/lib/gcc/x86_64-pc-linux-gnu/14/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lMLIRIR: No such file or directory collect2: error: ld returned 1 exit status ```
To handle relative vftables, which are enabled with the clang option `-fexperimental-relative-c++-abi-vtables`, we look for PC-relative relocations whose fixup locations fall in vtable address ranges. For such relocations, the actual target is just the virtual function itself, and the addend records the distance between the vtable slot for the target virtual function and the first virtual function slot in the vtable, which matches the generated code that calls the virtual function. So we can skip the logic of handling "function + offset" and directly save such relocations for future fixup after the new layout is known.
…e issue 135612. (#135629)
The `dx.dot2`, `dot3`, and `dot4` intrinsics exist purely to lower `dx.fdot`, and they map exactly to the DXIL ops of the same name. Using vectors for their arguments adds unnecessary complexity and causes us to have vector operations that are not trivial to lower post-scalarizer. Similarly, the `dx.dot2add` intrinsic is overly generic for something that only needs to lower to a single `dot2AddHalf` DXIL op. Update its signature to match the operation it lowers to. Fixes #134569.
Hi! As the title says, this PR adds support for >2 arguments in the `shape.broadcast` folder by sequentially calling `getBroadcastedShape`.
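The idea of folding more than two operands by applying a pairwise broadcast repeatedly can be sketched as follows. This is a Python stand-in, not the MLIR code: `broadcast_two` is a hypothetical helper that plays the role `getBroadcastedShape` plays in the PR, and the sketch assumes fully static shapes with NumPy-style trailing-dimension alignment.

```python
from typing import List, Optional, Sequence


def broadcast_two(a: Sequence[int], b: Sequence[int]) -> Optional[List[int]]:
    """Broadcast two static shapes, aligning trailing dimensions.

    Returns None when a pair of dimensions differs and neither is 1.
    """
    result = []
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1  # missing leading dims act as 1
        db = b[-i] if i <= len(b) else 1
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            return None
    return result[::-1]


def broadcast_many(shapes: Sequence[Sequence[int]]) -> Optional[List[int]]:
    """Fold any number of shapes by calling the pairwise broadcast sequentially."""
    if not shapes:
        return []
    acc: Optional[List[int]] = list(shapes[0])
    for shape in shapes[1:]:
        acc = broadcast_two(acc, shape)
        if acc is None:
            return None  # one incompatible pair makes the whole fold fail
    return acc
```

Because broadcasting compatible static shapes is associative, accumulating left to right gives the same result as any other grouping, which is what makes the sequential approach sound.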
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.1)
Can you help keep this open source service alive? 💖 Please sponsor : )