Skip to content

[LLDB] Don't cache module sp when Activate() fails. #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 125 commits into from

Conversation

thetruestblue
Copy link
Owner

Currently, the instrumentation runtime is caching a library the first time it sees it in the module list. However, in some rare cases on Darwin, the cached pre-run unloaded modules are different from the runtime module that is loaded at runtime. This patch removes the cached module if the plugin fails to activate, ensuring that on subsequent calls we don't try to activate using the unloaded cached module.

There are a few related bugs to fix in a follow up: CheckIfRuntimeValid should have a stronger check to ensure the module is loaded and can be activated. Further investigation in UpdateSpecialBinariesFromNewImageInfos calling ModulesDidLoad when the module list may have unloaded modules.

I have not included a test for the following reasons:

  1. This is an incredibly rare occurance and is only observed in a specific circumstance on Darwin. It is tied to behavior in the DynamicLoader thai is not commonly encountered.

  2. It is difficult to reproduce -- this bug requires precise conditions on darwin and it is unclear how we'd reproduce that in a controlled testing environment.

rdar://128971453

klausler and others added 30 commits June 13, 2024 15:39
llvm#95481)

My recent change that distinguishes pass-by-reference from pass-by-value
reduction operation functions missed the "CppReduceComplex" cases, and
also broke the shared library build-bots. Fix.
Use the packaging [1] module for parsing version numbers, instead of
pkg_resources which is distributed with setuptools. I recently switched
over to using the latter, knowing it was deprecated (in favor of the
packaging module) because it comes with Python out of the box. Newer
versions of setuptools have removed `pkg_resources` so we have to use
packaging.

[1] https://pypi.org/project/packaging/
…vm#95477)

Note that the version of getValueProfDataFromInst that returns bool
has been "deprecated" since:

  commit 1e15371
  Author: Mingming Liu <[email protected]>
  Date:   Mon Apr 1 15:14:49 2024 -0700
This was reverted in llvm#95435
because it broke Android static hwasan binaries. This reland limits the
change to !SANITIZER_ANDROID.

Original commit message:
When set to non-zero, the HWASan runtime will map the shadow base at the
specified constant address.

This is particularly useful in conjunction with the existing compiler
option 'hwasan-mapping-offset', which bakes a hardcoded constant address
into the instrumentation.

---------

Co-authored-by: Vitaly Buka <[email protected]>
…60 (llvm#94004)

lowerInvokeable wasn't updating the returned chain after emitting the
lowerEndEH, which caused SwiftErrorVal-handling code to re-set the DAG
root, and thus accidentally skip the EH_LABEL node it was supposed to
have addeed. After fixing that, a few places needed to be adjusted that
assume the specific shape of the returned DAG.

Fixes: llvm#64826
Fixes: rdar://113994760
…lvm#95275)

MacOS 15.0 and iOS 18.0 added a new sysctl to fetch a bitvector of all
the hw.optional.arm.FEAT_*'s in one go. Using this has a perf advantage
over doing multiple round-trips to the kernel and back, but since it's
not present in older oses, we still need the slow fallback.
…ruction, NFC

And VEXEncoding_* are renamed to OpcodePrefix_*.

This is in preparation for the coming pseudo rex/rex2 prefixes support.
…vector.

The instructions are only defined to operator f16 data. If the
scalar FPR register isn't properly nan-boxed, these instructions
will create a fp16 nan not a bf16 nan in the vector register.
…vm#95485)

Note that the version of getValueProfDataFromInst that returns bool
has been "deprecated" since:

  commit 1e15371
  Author: Mingming Liu <[email protected]>
  Date:   Mon Apr 1 15:14:49 2024 -0700
In MachineBlockPlacement, the function getFirstUnplacedBlock is
inefficient because in most cases (for usual loop CFG), this function
fails to find a candidate, and its complexity becomes O(#(loops in
function) * #(blocks in function)). This makes the compilation of very
long functions slow. This update reduces it to O(k * #(blocks in
function)) where k is the maximum loop nesting depth, by iterating
through the BlockFilter instead.
The bf16 test cases were copied to other files without the Zvfh/Zfvhmin
options. Remove the duplication by adding a few Zvfh command lines to
the bf16 files and deleting the bf16 tests from the test files for f16/f32/f64.
Reverts llvm#95419 and Reland llvm#95358.

This PR is full of temporal fixes. After a discussion with @lntue, it is
better to avoid further changes to the cmake infrastructure for now as a
rework to the cmake utilities will be landed in the future.
…able util functions (llvm#94429)

Also adjusted `LoopParams` to use OpFoldResult instead of Value.
…lvm#95245)

This commit adds support for `gpu.cluster_dim_blocks` and
`gpu.cluster_block_id` Ops to represent number of blocks per cluster and
block id inside a cluster respectively. Also, fixed the description of
`gpu.cluster_dim` Op and updated the `cga_cluster.mlir` test file to use
`gpu.cluster_dim_blocks`

Co-authored-by: pradeepku <[email protected]>
Co-authored-by: Guray Ozen <[email protected]>
We had specific patterns for riscv_vfmv_v_f_vl in both RISCVInstrInfoVVLPatterns.td
and RISCVInstrInfoVSDPatterns.td.

The RISCVInstrInfoVSDPatterns.td patterns could only match if the
RISCVInstrInfoVVLPatterns.td failed. As far as I can tell this
would only happen if the predicate didn't match. Tweak the predicate
so the RISCVInstrInfoVVLPatterns.td can match in more cases.
These patterns are no longer used because we don't generate bf16
to vector splats except for constants that can be handled with
vmerge.vi.
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.

(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)

The reported issue at upstream: llvm#90501
rdar://124035402
This is similar to baremetal printf that was implemented in llvm#94078.
)

We now have baremetal implementations of these entrypoints.
…types out of GetCompleteQualType (llvm#95402)

This patch factors out the completion logic for individual clang::Type's
into their own helper functions.

During the process I cleaned up a few assumptions (e.g., unnecessary
if-guards that could be asserts because these conditions are guaranteed
by the `clang::Type::TypeClass` switch in `GetCompleteQualType`).

This is mainly motivated by the type-completion rework proposed in
llvm#95100.
…m#94221)

PressureDiff is reliable most of the time, and it's pretty much free
compared to RPTracker. We can use it whenever there is no subregister
definitions, or physregs invovled. No subregs because PDiff doesn't take
into account lane liveness, and no Physreg because it seems to get
PhysReg liveness completely wrong. Sometimes it adds a diff, sometimes
itt doesn't - I didn't look at that one for long so maybe there is
something we can eventually do to make it better.

This allows us to save a ton of calls to RPTracker and LIS too. On a
huge IR module (100+MB), it went from about 20M calls to RPTracker in this
function down to 3.4, with the rest being PressureDiffs.

I also added an expensive check to verify correctness of PressureDiff.
…vm#95499)

This aligns Fuchsia targets with other similar OS targets such as
Linux.  Fuchsia's libc already uses unsigned rather than the
compiler-provided __WINT_TYPE__ macro for its wint_t typedef, so
this just makes the compiler consistent with the OS's actual ABI.
The only known manifestation of the mismatch is -Wformat warnings
for %lc no matching wint_t arguments.

The closest thing I could see to existing tests for each target's
wint_t type setting was the predefine tests that check various
macros including __WINT_TYPE__ on a per-machine and/or per-OS
basis.  While the setting is done per-OS in most of the target
implementations rather than actually varying by machine, the only
existing tests for __WINT_TYPE__ are in per-machine checks that
are also wholly or partly tagged as per-OS.  x86_64 and riscv64
tests for respective *-linux-gnu targets now check for the same
definitions in the respective *-fuchsia targets.  __WINT_TYPE__
is not among the type checked in the aarch64 tests and those lack
a section that's specifically tested for aarch64-linux-gnu; if
such is added then it can similarly be made to check for most or
all of the same value on aarch64-fuchsia as aarch64-linux-gnu.
But since the actual implementation of choosing the type is done
per-OS and not per-machine for the three machines with Fuchsia
target support, the x86 and riscv64 tests are already redundantly
testing that same code and seem sufficient.
LoongArch does not yet implement transition from TLSDESC to LE/IE,
so TLSDESC dynamic relocation needs to be generated for each desc,
which is ultimately handled by the dynamic linker.

The test cases reference RISC-V: llvm#79239

Reviewed By: MaskRay, SixWeining

Pull Request: llvm#94451
PiJoules and others added 3 commits June 14, 2024 12:11
This refactors some of the FreeListHeap, FreeList, and Block classes to
have constexpr ctors so we can constinit a global allocator that does
not require running some global function or global ctor to initialize.
This is needed to prevent worrying about initialization order and any
other module-ctor can invoke malloc without worry.
thetruestblue pushed a commit that referenced this pull request Jun 24, 2024
…on (llvm#94752)

Fixes llvm#62925.

The following code:
```cpp
#include <map>

int main() {
   std::map m1 = {std::pair{"foo", 2}, {"bar", 3}}; // guide llvm#2
   std::map m2(m1.begin(), m1.end()); // guide #1
}
```
Is rejected by clang, but accepted by both gcc and msvc:
https://godbolt.org/z/6v4fvabb5 .

So basically CTAD with copy-list-initialization is rejected.

Note that this exact code is also used in a cppreference article:
https://en.cppreference.com/w/cpp/container/map/deduction_guides

I checked the C++11 and C++20 standard drafts to see whether suppressing
user conversion is the correct thing to do for user conversions. Based
on the standard I don't think that it is correct.

```
13.3.1.4 Copy-initialization of class by user-defined conversion [over.match.copy]
Under the conditions specified in 8.5, as part of a copy-initialization of an object of class type, a user-defined
conversion can be invoked to convert an initializer expression to the type of the object being initialized.
Overload resolution is used to select the user-defined conversion to be invoked
```
So we could use user defined conversions according to the standard.

```
If a narrowing conversion is required to initialize any of the elements, the
program is ill-formed.
```
We should not do narrowing.

```
In copy-list-initialization, if an explicit constructor is chosen, the initialization is ill-formed.
```
We should not use explicit constructors.
thetruestblue pushed a commit that referenced this pull request Jun 24, 2024
`rethrow` instruction is a terminator, but when when its DAG is built in
`SelectionDAGBuilder` in a custom routine, it was NOT treated as such.

```ll
rethrow:                                          ; preds = %catch.start
  invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
          to label %unreachable unwind label %ehcleanup

ehcleanup:                                        ; preds = %rethrow, %catch.dispatch
  %tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ]
  ...
```

In this bitcode, because of the `phi`, a `CONST_I32` will be created in
the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks
like this:
```
  t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
      t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161>
    t6: ch = TokenFactor t3, t5
  t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50>
```
Note that `CopyToReg` and `llvm.wasm.rethrow` don't have dependence so
either can come first in the selected code, which can result in the code
like
```mir
bb.3.rethrow:
  RETHROW 0, implicit-def dead $arguments
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

After this patch, `llvm.wasm.rethrow` is treated as a terminator, and
the DAG will look like
```
        t0: ch,glue = EntryToken
      t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
    t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161>
  t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70>
```
Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has
to come after `CopyToReg`. And the resulting code will be
```mir
bb.3.rethrow:
  %9:i32 = CONST_I32 20, implicit-def dead $arguments
  RETHROW 0, implicit-def dead $arguments
  BR %bb.6, implicit-def dead $arguments
```

I'm not very familiar with the internals of `getRoot` vs.
`getControlRoot`, but other terminator instructions seem to use the
latter, and using it for `rethrow` too worked.
vitalybuka pushed a commit that referenced this pull request Dec 11, 2024
…ne symbol size as symbols are created (llvm#117079)"

This reverts commit ba668eb.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame llvm#2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame llvm#2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame llvm#3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame llvm#4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame llvm#5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
vitalybuka pushed a commit that referenced this pull request Dec 11, 2024
## Description

This PR fixes a segmentation fault that occurs when passing options
requiring arguments via `-Xopenmp-target=<triple>`. The issue was that
the function `Driver::getOffloadArchs` did not properly parse the
extracted option, but instead assumed it was valid, leading to a crash
when incomplete arguments were provided.

## Backtrace

```sh
llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: llvm-project/build/bin/clang++ main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -Xopenmp-target=powerpc64le-ibm-linux-gnu -o
1.      Compilation construction
2.      Building compilation actions
 #0 0x0000562fb21c363b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (llvm-project/build/bin/clang+++0x392f63b)
 #1 0x0000562fb21c0e3c SignalHandler(int) Signals.cpp:0:0
 llvm#2 0x00007fcbf6c81420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 llvm#3 0x0000562fb1fa5d70 llvm::opt::Option::matches(llvm::opt::OptSpecifier) const (llvm-project/build/bin/clang+++0x3711d70)
 llvm#4 0x0000562fb2a78e7d clang::driver::Driver::getOffloadArchs(clang::driver::Compilation&, llvm::opt::DerivedArgList const&, clang::driver::Action::OffloadKind, clang::driver::ToolChain const*, bool) const (llvm-project/build/bin/clang+++0x41e4e7d)
 llvm#5 0x0000562fb2a7a9aa clang::driver::Driver::BuildOffloadingActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, std::pair<clang::driver::types::ID, llvm::opt::Arg const*> const&, clang::driver::Action*) const (.part.1164) Driver.cpp:0:0
 llvm#6 0x0000562fb2a7c093 clang::driver::Driver::BuildActions(clang::driver::Compilation&, llvm::opt::DerivedArgList&, llvm::SmallVector<std::pair<clang::driver::types::ID, llvm::opt::Arg const*>, 16u> const&, llvm::SmallVector<clang::driver::Action*, 3u>&) const (llvm-project/build/bin/clang+++0x41e8093)
 llvm#7 0x0000562fb2a8395d clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) (llvm-project/build/bin/clang+++0x41ef95d)
 llvm#8 0x0000562faf92684c clang_main(int, char**, llvm::ToolContext const&) (llvm-project/build/bin/clang+++0x109284c)
 llvm#9 0x0000562faf826cc6 main (llvm-project/build/bin/clang+++0xf92cc6)
llvm#10 0x00007fcbf6699083 __libc_start_main /build/glibc-LcI20x/glibc-2.31/csu/../csu/libc-start.c:342:3
llvm#11 0x0000562faf923a5e _start (llvm-project/build/bin/clang+++0x108fa5e)
[1]    2628042 segmentation fault (core dumped)   main.cpp -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu  -o
```
vitalybuka pushed a commit that referenced this pull request Dec 11, 2024
llvm#118923)

…d reentry.

These utilities provide new, more generic and easier to use support for
lazy compilation in ORC.

LazyReexportsManager is an alternative to LazyCallThroughManager. It
takes requests for lazy re-entry points in the form of an alias map:
lazy-reexports = {
  ( <entry point symbol #1>, <implementation symbol #1> ),
  ( <entry point symbol llvm#2>, <implementation symbol llvm#2> ),
  ...
  ( <entry point symbol #n>, <implementation symbol #n> )
}

LazyReexportsManager then:
1. binds the entry points to the implementation names in an internal
table.
2. creates a JIT re-entry trampoline for each entry point.
3. creates a redirectable symbol for each of the entry point name and
binds redirectable symbol to the corresponding reentry trampoline.

When an entry point symbol is first called at runtime (which may be on
any thread of the JIT'd program) it will re-enter the JIT via the
trampoline and trigger a lookup for the implementation symbol stored in
LazyReexportsManager's internal table. When the lookup completes the
entry point symbol will be updated (via the RedirectableSymbolManager)
to point at the implementation symbol, and execution will proceed to the
implementation symbol.

Actual construction of the re-entry trampolines and redirectable symbols
is delegated to an EmitTrampolines functor and the
RedirectableSymbolsManager respectively.

JITLinkReentryTrampolines.h provides a JITLink-based implementation of
the EmitTrampolines functor. (AArch64 only in this patch, but other
architectures will be added in the near future).

Register state save and reentry functionality is added to the ORC
runtime in the __orc_rt_sysv_resolve and __orc_rt_resolve_implementation
functions (the latter is generic, the former will need custom
implementations for each ABI and architecture to be supported, however
this should be much less effort than the existing OrcABISupport
approach, since the ORC runtime allows this code to be written as native
assembly).

The resulting system:
1. Works equally well for in-process and out-of-process JIT'd code.
2. Requires less boilerplate to set up.

Given an ObjectLinkingLayer and PlatformJD (JITDylib containing the ORC
runtime), setup is just:

```c++
auto RSMgr = JITLinkRedirectableSymbolManager::Create(OLL);
if (!RSMgr)
  return RSMgr.takeError();

auto LRMgr = createJITLinkLazyReexportsManager(OLL, **RSMgr, PlatformJD);
if (!LRMgr)
  return LRMgr.takeError();
```

after which lazy reexports can be introduced with:

```c++
JD.define(lazyReexports(LRMgr, <alias map>));
```

LazyObectLinkingLayer is updated to use this new method, but the LLVM-IR
level CompileOnDemandLayer will continue to use LazyCallThroughManager
and OrcABISupport until the new system supports a wider range of
architectures and ABIs.

The llvm-jitlink utility's -lazy option now uses the new scheme. Since
it depends on the ORC runtime, the lazy-link.ll testcase and associated
helpers are moved to the ORC runtime.
thetruestblue pushed a commit that referenced this pull request Jan 6, 2025
According to the documentation described at
https://github.com/loongson/la-abi-specs/blob/release/ladwarf.adoc, the
dwarf numbers for floating-point registers range from 32 to 63.

An incorrect dwarf number will prevent the register values from being
properly restored during unwinding.

This test reflects this problem:

```
loongson@linux:~$ cat test.c

void foo() {
  asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
}
int main() {
  asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
  foo();
  return 0;
}

loongson@linux:~$ clang -g test.c  -o test

```
Without this patch:
```
loongson@linux:~$ ./_build/bin/lldb ./t
(lldb) target create "./t"
Current executable set to
'/home/loongson/llvm-project/_build_lldb/t' (loongarch64).
(lldb) b foo
Breakpoint 1: where = t`foo + 20 at test.c:4:1, address =
0x0000000000000714
(lldb) r
Process 2455626 launched: '/home/loongson/llvm-project/_build_lldb/t' (loongarch64)
Process 2455626 stopped
* thread #1, name = 't', stop reason = breakpoint 1.1
    frame #0: 0x0000555555554714 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) si
Process 2455626 stopped
* thread #1, name = 't', stop reason = instruction step into
    frame #0: 0x0000555555554718 t`foo at test.c:4:1
   1    #include <stdio.h>
   2
   3    void foo() {
-> 4    asm volatile ("movgr2fr.d $fs2, $ra":::"$fs2");
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
(lldb) f 1
frame #1: 0x0000555555554768 t`main at test.c:8:1
   5    }
   6    int main() {
   7    asm volatile ("movgr2fr.d $fs2, $sp":::"$fs2");
-> 8    foo();
   9    return 0;
   10   }
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f13 = 0x00007ffffffef780 !!!!! wrong register
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x0000555555554768  t`main + 40 at test.c:8:1
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
32 registers were unavailable.
```
With this patch:
```
The previous operations are the same.
(lldb) register read -a
General Purpose Registers:
        r1 = 0x0000555555554768  t`main + 40 at test.c:8:1
        r3 = 0x00007ffffffef780
       r22 = 0x00007ffffffef7b0
       r23 = 0x00007ffffffef918
       r24 = 0x0000000000000001
       r25 = 0x0000000000000000
       r26 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
       r27 = 0x0000555555554740  t`main at test.c:6
       r28 = 0x00007ffffffef928
       r29 = 0x00007ffff7febc88  ld-linux-loongarch-lp64d.so.1`_rtld_global_ro
       r30 = 0x000055555555be08  t`__do_global_dtors_aux_fini_array_entry
        pc = 0x0000555555554768  t`main + 40 at test.c:8:1
33 registers were unavailable.

Floating Point Registers:
       f24 = 0xffffffffffffffff
       f25 = 0xffffffffffffffff
       f26 = 0x00007ffffffef780
       f27 = 0xffffffffffffffff
       f28 = 0xffffffffffffffff
       f29 = 0xffffffffffffffff
       f30 = 0xffffffffffffffff
       f31 = 0xffffffffffffffff
33 registers were unavailable.
```

Reviewed By: SixWeining

Pull Request: llvm#120391
thetruestblue pushed a commit that referenced this pull request Jan 22, 2025
This will be sent by Arm's Guarded Control Stack extension when an
invalid return is executed.

The signal does have an address we could show, but it's the PC at which
the fault occured. The debugger has plenty of ways to show you that
already, so I've left it out.

```
(lldb) c
Process 460 resuming
Process 460 stopped
* thread #1, name = 'test', stop reason = signal SIGSEGV: control protection fault
    frame #0: 0x0000000000400784 test`main at main.c:57:1
   54  	  afunc();
   55  	  printf("return from main\n");
   56  	  return 0;
-> 57  	}
(lldb) dis
<...>
->  0x400784 <+100>: ret
```

The new test case generates the signal by corrupting the link register
then attempting to return. This will work whether we manually enable GCS
or the C library does it for us.

(in the former case you could just return from main and it would fault)
thetruestblue pushed a commit that referenced this pull request Jan 22, 2025
llvm#123877)

Reverts llvm#122811 due to buildbot breakage e.g.,
https://lab.llvm.org/buildbot/#/builders/52/builds/5421/steps/11/logs/stdio

ASan output from local re-run:
```
==2780289==ERROR: AddressSanitizer: use-after-poison on address 0x7e0b87e28d28 at pc 0x55a979a99e7e bp 0x7ffe4b18f0b0 sp 0x7ffe4b18f0a8
READ of size 1 at 0x7e0b87e28d28 thread T0
    #0 0x55a979a99e7d in getStorageClass /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:344
    #1 0x55a979a99e7d in isSectionDefinition /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/Object/COFF.h:429:9
    llvm#2 0x55a979a99e7d in getSymbols /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:54:42
    llvm#3 0x55a979a99e7d in lld::coff::writeLLDMapFile(lld::coff::COFFLinkerContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/LLDMapFile.cpp:103:40
    llvm#4 0x55a979a16879 in (anonymous namespace)::Writer::run() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:810:3
    llvm#5 0x55a979a00aac in lld::coff::writeResult(lld::coff::COFFLinkerContext&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Writer.cpp:354:15
    llvm#6 0x55a97985f7ed in lld::coff::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:2826:3
    llvm#7 0x55a97984cdd3 in lld::coff::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/COFF/Driver.cpp:97:15
    llvm#8 0x55a9797f9793 in lld::unsafeLldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>, bool) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:163:12
    llvm#9 0x55a9797fa3b6 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:188:15
    llvm#10 0x55a9797fa3b6 in void llvm::function_ref<void ()>::callback_fn<lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>)::$_0>(long) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
    llvm#11 0x55a97966cb93 in operator() /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:69:12
    llvm#12 0x55a97966cb93 in llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:426:3
    llvm#13 0x55a9797f9dc3 in lld::lldMain(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, llvm::ArrayRef<lld::DriverDef>) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/Common/DriverDispatcher.cpp:187:14
    llvm#14 0x55a979627512 in lld_main(int, char**, llvm::ToolContext const&) /usr/local/google/home/thurston/buildbot_repro/llvm-project/lld/tools/lld/lld.cpp:103:14
    llvm#15 0x55a979628731 in main /usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/tools/lld/tools/lld/lld-driver.cpp:17:10
    llvm#16 0x7ffb8b202c89 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    llvm#17 0x7ffb8b202d44 in __libc_start_main csu/../csu/libc-start.c:360:3
    llvm#18 0x55a97953ef60 in _start (/usr/local/google/home/thurston/buildbot_repro/llvm_build_asan/bin/lld+0x8fd1f60)
```
thetruestblue pushed a commit that referenced this pull request Jan 23, 2025
Prevents avoidable memory leaks.

Looks like exchange added in aa1333a
didn't take "continue" into account.

```
==llc==2150782==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x5f1b0f9ac14a in strdup llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:593:3
    #1 0x5f1b1768428d in FileToRemoveList llvm-project/llvm/lib/Support/Unix/Signals.inc:105:55
```
thetruestblue pushed a commit that referenced this pull request Mar 13, 2025
When compiling VLS SVE, the compiler often replaces VL-based offsets
with immediate-based ones. This leads to a mismatch in the allowed
addressing modes due to SVE loads/stores generally expecting immediate
offsets relative to VL. For example, given:
```c

svfloat64_t foo(const double *x) {
  svbool_t pg = svptrue_b64();
  return svld1_f64(pg, x+svcntd());
}
```

When compiled with `-msve-vector-bits=128`, we currently generate:
```gas
foo:
        ptrue   p0.d
        mov     x8, llvm#2
        ld1d    { z0.d }, p0/z, [x0, x8, lsl llvm#3]
        ret
```

Instead, we could be generating:
```gas
foo:
        ldr     z0, [x0, #1, mul vl]
        ret
```

Likewise for other types, stores, and other VLS lengths.

This patch achieves the above by extending `SelectAddrModeIndexedSVE`
to let constants through when `vscale` is known.
thetruestblue pushed a commit that referenced this pull request Mar 13, 2025
`TestReportData.py` is failing on the macOS CI with:
```
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 1784, in test_method
    return attrvalue(self)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 148, in wrapper
    return func(*args, **kwargs)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/API/functionalities/asan/TestReportData.py", line 28, in test_libsanitizers_asan
    self.asan_tests(libsanitizers=True)
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/API/functionalities/asan/TestReportData.py", line 60, in asan_tests
    self.expect(
  File "/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2490, in expect
    self.fail(log_msg)
AssertionError: Ran command:
"thread list"

Got output:
Process 3474 stopped
* thread #1: tid = 0x38b5e9, 0x00007ff80f563b52 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT

Expecting sub string: "stopped" (was found)
Expecting sub string: "stop reason = Use of deallocated memory" (was not found)
Process should be stopped due to ASan report
```

There isn't much to go off of in the log, so adding more to help us debug this.
thetruestblue pushed a commit that referenced this pull request Mar 13, 2025
These are macOS tests only and are currently failing on the x86_64 CI
and on arm64 on recent versions of macOS/Xcode.

The tests are failing because we're stopping in:
```
Process 17458 stopped
* thread #1: tid = 0xbda69a, 0x00000002735bd000
  libsystem_malloc.dylib`purgeable_print_self.cold.1, stop reason = EXC_BREAKPOINT (code=1, subcode=0x2735bd000)
```
instead of the libsanitizers library. This seems to be related to
`-fsanitize-trivial-abi` support

Skip these for now until we figure out the root cause.
thetruestblue pushed a commit that referenced this pull request Mar 24, 2025
… pointers (llvm#132261)

Currently, the helpers to get fir::ExtendedValue out of hlfir::Entity
use hlfir.declare second result (`#1`) in most cases. This is because
this result is the same as the input and matches what FIR was getting
before lowering to HLFIR.

But this creates odd situations when both hlfir.declare are raw pointers
and either result ends-up being used in the IR depending on whether the
code was generated by a helper using fir::ExtendedValue, or via "pure
HLFIR" helpers using the first result.

This will typically prevent simple CSE and easy identification that two
operation (e.g load/store) are touching the exact same memory location
without using alias analysis or "manual detection" (looking for common
hlfir.declare defining op).

Hence, when hlfir.declare results are both raw pointers, use `#0` when
producing `fir::ExtendedValue`.
When `#0` is a fir.box, keep using `#1` because these are not the same. 
The only code change is in HLFIRTools.cpp and is pretty small, but there
is a big test fallout of `#1` to `#0`.
thetruestblue pushed a commit that referenced this pull request Mar 24, 2025
…too. (llvm#132267)

Observed in Wine when trying to intercept `ExitThread`, which forwards
to `ntdll.RtlExitUserThread`.

`gdb` interprets it as `xchg %ax,%ax`.
`llvm-mc` outputs simply `nop`.

```
==Asan-i386-calls-Dynamic-Test.exe==964==interception_win: unhandled instruction at 0x7be27cf0: 66 90 55 89 e5 56 50 8b
```

```
Wine-gdb> bt
#0  0x789a1766 in __interception::GetInstructionSize (address=<optimized out>, rel_offset=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:983
#1  0x789ab480 in __sanitizer::SharedPrintfCode(bool, char const*, char*) () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_printf.cpp:311
llvm#2  0x789a18e7 in __interception::OverrideFunctionWithHotPatch (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1118
llvm#3  0x789a1f34 in __interception::OverrideFunction (old_func=2078440688, new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1224
llvm#4  0x789a24ce in __interception::OverrideFunction (func_name=0x78a0bc43 <vtable for __asan::AsanThreadContext+1163> "ExitThread", new_func=2023702608, orig_old_func=warning: (Internal error: pc 0x792f1a2c in read in CU, but not in symtab.)warning: (Error: pc 0x792f1a2c in address map, but not in symtab.)0x792f1a2c)    at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/interception/interception_win.cpp:1369
llvm#5  0x789f40ef in __asan::InitializePlatformInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_win.cpp:190
llvm#6  0x789e0c3c in __asan::InitializeAsanInterceptors () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_interceptors.cpp:802
llvm#7  0x789ee6b5 in __asan::AsanInitInternal () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:442
llvm#8  0x789eefb0 in __asan::AsanInitFromRtl () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:522
llvm#9  __asan::AsanInitializer::AsanInitializer (this=<optimized out>) at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:542
llvm#10 __cxx_global_var_init () at C:/llvm-mingw/llvm-mingw/llvm-project/compiler-rt/lib/asan/asan_rtl.cpp:546
...
Wine-gdb> disassemble /r 2078440688,2078440688+20
Dump of assembler code from 0x7be27cf0 to 0x7be27d04:
   0x7be27cf0 <_RtlExitUserThread@4+0>: 66 90                   xchg   %ax,%ax
...
```
thetruestblue pushed a commit that referenced this pull request Apr 14, 2025
…d A520 (llvm#132246)

Inefficient SVE codegen occurs on at least two in-order cores,
those being Cortex-A510 and Cortex-A520. For example a simple vector
add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence
of instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, llvm#2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
thetruestblue pushed a commit that referenced this pull request Apr 14, 2025
… A510/A520 (llvm#134606)

Recommit. This work was done by llvm#132246 but failed buildbots due to the
test introduced needing updates

Inefficient SVE codegen occurs on at least two in-order cores, those
being Cortex-A510 and Cortex-A520. For example a simple vector add

```
void foo(float a, float b, float dst, unsigned n) {
    for (unsigned i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Vectorizes the inner loop into the following interleaved sequence of
instructions.

```
        add     x12, x1, x10
        ld1b    { z0.b }, p0/z, [x1, x10]
        add     x13, x2, x10
        ld1b    { z1.b }, p0/z, [x2, x10]
        ldr     z2, [x12, #1, mul vl]
        ldr     z3, [x13, #1, mul vl]
        dech    x11
        add     x12, x0, x10
        fadd    z0.s, z1.s, z0.s
        fadd    z1.s, z3.s, z2.s
        st1b    { z0.b }, p0, [x0, x10]
        addvl   x10, x10, llvm#2
        str     z1, [x12, #1, mul vl]
```

By adjusting the target features to prefer fixed over scalable if the
cost is equal we get the following vectorized loop.

```
         ldp q0, q3, [x11, #-16]
         subs    x13, x13, llvm#8
         ldp q1, q2, [x10, #-16]
         add x10, x10, llvm#32
         add x11, x11, llvm#32
         fadd    v0.4s, v1.4s, v0.4s
         fadd    v1.4s, v2.4s, v3.4s
         stp q0, q1, [x12, #-16]
         add x12, x12, llvm#32
```

Which is more efficient.
thetruestblue pushed a commit that referenced this pull request Apr 25, 2025
Currently, given:
```cpp
uint64_t incb(uint64_t x) {
  return x+svcntb();
}
```
LLVM generates:
```gas
incb:
        addvl   x0, x0, #1
        ret
```
Which is equivalent to:
```gas
incb:
        incb    x0
        ret
```

However, on microarchitectures like the Neoverse V2 and Neoverse V3,
the second form (with INCB) can have significantly better latency and
throughput (according to their SWOG). On the Neoverse V2, for example,
ADDVL has a latency and throughput of 2, whereas some forms of INCB
have a latency of 1 and a throughput of 4. The same applies to DECB.
This patch adds patterns to prefer the cheaper INCB/DECB forms over
ADDVL where applicable.
thetruestblue pushed a commit that referenced this pull request Apr 25, 2025
- Avoid dereferencing the end() iterator to get the end pointer, instead
calculate it explicitly
- Fixes a regression introduced in
llvm#136220.
- The windows build failure shows the following call stack:

```
 | Exception Code: 0x80000003
 |  #0 0x00007ff74bc05897 std::_Vector_const_iterator<class std::_Vector_val<struct std::_Simple_types<unsigned char>>>::operator*(void) const C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.37.32822\include\vector:52:0
 |  #1 0x00007ff74bbd3d64 `anonymous namespace'::DecoderEmitter::emitTable D:\buildbot\llvm-worker\clang-cmake-x86_64-avx512-win\llvm\llvm\utils\TableGen\DecoderEmitter.cpp:852:0
```
thetruestblue pushed a commit that referenced this pull request Apr 25, 2025
… collection (llvm#136795)

Fix a [test
failure](llvm#136236 (comment))
in llvm#136236, apply a minor renaming of statistics, and remerge. See
details below.

# Changes in llvm#136236

Currently, `DebuggerStats::ReportStatistics()` calls
`Module::GetSymtab(/*can_create=*/false)`, but then the latter calls
`SymbolFile::GetSymtab()`. This will load symbols if haven't yet. See
stacktrace below.

The problem is that `DebuggerStats::ReportStatistics` should be
read-only. This is especially important because it reports stats for
symtab parsing/indexing time, which could be affected by the reporting
itself if it's not read-only.

This patch fixes this problem by adding an optional parameter
`SymbolFile::GetSymtab(bool can_create = true)` and receiving the
`false` value passed down from `Module::GetSymtab(/*can_create=*/false)`
when the call is initiated from `DebuggerStats::ReportStatistics()`.

---

Notes about the following stacktrace:
1. This can be reproduced. Create a helloworld program on **macOS** with
dSYM, add `settings set target.preload-symbols false` to `~/.lldbinit`,
do `lldb a.out`, then `statistics dump`.
2. `ObjectFile::GetSymtab` has `llvm::call_once`. So the fact that it
called into `ObjectFileMachO::ParseSymtab` means that the symbol table
is actually being parsed.

```
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x0000000124c4d5a0 LLDB`ObjectFileMachO::ParseSymtab(this=0x0000000111504e40, symtab=0x0000600000a05e00) at ObjectFileMachO.cpp:2259:44
  * frame #1: 0x0000000124fc50a0 LLDB`lldb_private::ObjectFile::GetSymtab()::$_0::operator()(this=0x000000016d35c858) const at ObjectFile.cpp:761:9
    frame llvm#5: 0x0000000124fc4e68 LLDB`void std::__1::__call_once_proxy[abi:v160006]<std::__1::tuple<lldb_private::ObjectFile::GetSymtab()::$_0&&>>(__vp=0x000000016d35c7f0) at mutex:652:5
    frame llvm#6: 0x0000000198afb99c libc++.1.dylib`std::__1::__call_once(unsigned long volatile&, void*, void (*)(void*)) + 196
    frame llvm#7: 0x0000000124fc4dd0 LLDB`void std::__1::call_once[abi:v160006]<lldb_private::ObjectFile::GetSymtab()::$_0>(__flag=0x0000600003920080, __func=0x000000016d35c858) at mutex:670:9
    frame llvm#8: 0x0000000124fc3cb0 LLDB`void llvm::call_once<lldb_private::ObjectFile::GetSymtab()::$_0>(flag=0x0000600003920080, F=0x000000016d35c858) at Threading.h:88:5
    frame llvm#9: 0x0000000124fc2bc4 LLDB`lldb_private::ObjectFile::GetSymtab(this=0x0000000111504e40) at ObjectFile.cpp:755:5
    frame llvm#10: 0x0000000124fe0a28 LLDB`lldb_private::SymbolFileCommon::GetSymtab(this=0x0000000104865200) at SymbolFile.cpp:158:39
    frame llvm#11: 0x0000000124d8fedc LLDB`lldb_private::Module::GetSymtab(this=0x00000001113041a8, can_create=false) at Module.cpp:1027:21
    frame llvm#12: 0x0000000125125bdc LLDB`lldb_private::DebuggerStats::ReportStatistics(debugger=0x000000014284d400, target=0x0000000115808200, options=0x000000014195d6d1) at Statistics.cpp:329:30
    frame llvm#13: 0x0000000125672978 LLDB`CommandObjectStatsDump::DoExecute(this=0x000000014195d540, command=0x000000016d35d820, result=0x000000016d35e150) at CommandObjectStats.cpp:144:18
    frame llvm#14: 0x0000000124f29b40 LLDB`lldb_private::CommandObjectParsed::Execute(this=0x000000014195d540, args_string="", result=0x000000016d35e150) at CommandObject.cpp:832:9
    frame llvm#15: 0x0000000124efbd70 LLDB`lldb_private::CommandInterpreter::HandleCommand(this=0x0000000141b22f30, command_line="statistics dump", lazy_add_to_history=eLazyBoolCalculate, result=0x000000016d35e150, force_repeat_command=false) at CommandInterpreter.cpp:2134:14
    frame llvm#16: 0x0000000124f007f4 LLDB`lldb_private::CommandInterpreter::IOHandlerInputComplete(this=0x0000000141b22f30, io_handler=0x00000001419b2aa8, line="statistics dump") at CommandInterpreter.cpp:3251:3
    frame llvm#17: 0x0000000124d7b5ec LLDB`lldb_private::IOHandlerEditline::Run(this=0x00000001419b2aa8) at IOHandler.cpp:588:22
    frame llvm#18: 0x0000000124d1e8fc LLDB`lldb_private::Debugger::RunIOHandlers(this=0x000000014284d400) at Debugger.cpp:1225:16
    frame llvm#19: 0x0000000124f01f74 LLDB`lldb_private::CommandInterpreter::RunCommandInterpreter(this=0x0000000141b22f30, options=0x000000016d35e63c) at CommandInterpreter.cpp:3543:16
    frame llvm#20: 0x0000000122840294 LLDB`lldb::SBDebugger::RunCommandInterpreter(this=0x000000016d35ebd8, auto_handle_events=true, spawn_thread=false) at SBDebugger.cpp:1212:42
    frame llvm#21: 0x0000000102aa6d28 lldb`Driver::MainLoop(this=0x000000016d35ebb8) at Driver.cpp:621:18
    frame llvm#22: 0x0000000102aa75b0 lldb`main(argc=1, argv=0x000000016d35f548) at Driver.cpp:829:26
    frame llvm#23: 0x0000000198858274 dyld`start + 2840
```

# Changes in this PR top of the above

Fix a [test
failure](llvm#136236 (comment))
in `TestStats.py`. The original version of the added test checks that
all modules have symbol count zero when `target.preload-symbols ==
false`. The test failed on macOS. Due to various reasons, on macOS,
symbols can be loaded for dylibs even with that setting, but not for the
main module. For now, the fix of the test is to limit the assertion to
only the main module. The test now passes on macOS. In the future, when
we have a way to control a specific list of plug-ins to be loaded, there
may be a configuration that this test can use to assert that all modules
have symbol count zero.

Apply a minor renaming of statistics, per the
[suggestion](llvm#136226 (comment))
in llvm#136226 after merge.
thetruestblue pushed a commit that referenced this pull request Apr 25, 2025
…mbolConjured" (llvm#137304)

Reverts llvm#128251

ASAN bots reported some errors:
https://lab.llvm.org/buildbot/#/builders/55/builds/10398
Reverting for investigation.

```
Failed Tests (6):
  Clang :: Analysis/loop-widening-ignore-static-methods.cpp
  Clang :: Analysis/loop-widening-notes.cpp
  Clang :: Analysis/loop-widening-preserve-reference-type.cpp
  Clang :: Analysis/loop-widening.c
  Clang :: Analysis/loop-widening.cpp
  Clang :: Analysis/this-pointer.cpp
Testing Time: 411.55s
Total Discovered Tests: 118563
  Skipped          :     33 (0.03%)
  Unsupported      :   2015 (1.70%)
  Passed           : 116291 (98.08%)
  Expectedly Failed:    218 (0.18%)
  Failed           :      6 (0.01%)
FAILED: CMakeFiles/check-all /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/CMakeFiles/check-all 
cd /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan && /usr/bin/python3 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/./bin/llvm-lit -sv --param USE_Z3_SOLVER=0 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/mlgo-utils /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/lld/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/mlir/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/utils/lit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/test
ninja: build stopped: subcommand failed.
```

```
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c # RUN: at line 1
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
1.	<eof> parser at end of file
2.	While analyzing stack: 
	#0 Calling nested_loop_inner_widen
 #0 0x0000c894cca289cc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
 #1 0x0000c894cca23324 llvm::sys::RunSignalHandlers() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 llvm#2 0x0000c894cca29bbc SignalHandler(int, siginfo_t*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 llvm#3 0x0000f6898da4a8f8 (linux-vdso.so.1+0x8f8)
 llvm#4 0x0000f6898d377608 (/lib/aarch64-linux-gnu/libc.so.6+0x87608)
 llvm#5 0x0000f6898d32cb3c raise (/lib/aarch64-linux-gnu/libc.so.6+0x3cb3c)
 llvm#6 0x0000f6898d317e00 abort (/lib/aarch64-linux-gnu/libc.so.6+0x27e00)
 llvm#7 0x0000c894c5e77fec __sanitizer::Atexit(void (*)()) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix_libcdep.cpp:168:10
 llvm#8 0x0000c894c5e76680 __sanitizer::Die() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:52:5
 llvm#9 0x0000c894c5e69650 Unlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:250:16
llvm#10 0x0000c894c5e69650 ~GenericScopedLock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_mutex.h:386:51
llvm#11 0x0000c894c5e69650 __hwasan::ScopedReport::~ScopedReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:54:5
llvm#12 0x0000c894c5e68de0 __hwasan::(anonymous namespace)::BaseReport::~BaseReport() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:476:7
llvm#13 0x0000c894c5e66b74 __hwasan::ReportTagMismatch(__sanitizer::StackTrace*, unsigned long, unsigned long, bool, bool, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan_report.cpp:1091:1
llvm#14 0x0000c894c5e52cf8 Destroy /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:532:31
llvm#15 0x0000c894c5e52cf8 ~InternalMmapVector /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/../sanitizer_common/sanitizer_common.h:642:56
llvm#16 0x0000c894c5e52cf8 __hwasan::HandleTagMismatch(__hwasan::AccessInfo, unsigned long, unsigned long, void*, unsigned long*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:245:1
llvm#17 0x0000c894c5e551c8 __hwasan_tag_mismatch4 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/hwasan/hwasan.cpp:764:1
llvm#18 0x0000c894c5e6a2f8 __interception::InterceptFunction(char const*, unsigned long*, unsigned long, unsigned long) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/compiler-rt/lib/interception/interception_linux.cpp:60:0
llvm#19 0x0000c894d166f664 getBlock /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:217:45
llvm#20 0x0000c894d166f664 getCFGElementRef /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:230:59
llvm#21 0x0000c894d166f664 clang::ento::ExprEngine::processCFGBlockEntrance(clang::BlockEdge const&, clang::ento::NodeBuilderWithSinks&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp:2570:45
llvm#22 0x0000c894d15f3a1c hasGeneratedNodes /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CoreEngine.h:333:37
llvm#23 0x0000c894d15f3a1c clang::ento::CoreEngine::HandleBlockEdge(clang::BlockEdge const&, clang::ento::ExplodedNode*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:319:20
llvm#24 0x0000c894d15f2c34 clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:220:7
llvm#25 0x0000c894d15f2398 operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/unique_ptr.h:267:101
llvm#26 0x0000c894d15f2398 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>)::$_0::operator()(unsigned int) const /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:140:12
llvm#27 0x0000c894d15f14b4 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp:165:7
llvm#28 0x0000c894d0ebb9dc release /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:232:9
llvm#29 0x0000c894d0ebb9dc ~IntrusiveRefCntPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:196:27
llvm#30 0x0000c894d0ebb9dc ExecuteWorkList /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h:192:5
llvm#31 0x0000c894d0ebb9dc RunPathSensitiveChecks /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:772:7
llvm#32 0x0000c894d0ebb9dc (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void>>*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:741:5
llvm#33 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseMap.h:0:0
llvm#34 0x0000c894d0eb6ee4 begin /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/ADT/DenseSet.h:187:45
llvm#35 0x0000c894d0eb6ee4 HandleDeclsCallGraph /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:516:29
llvm#36 0x0000c894d0eb6ee4 runAnalysisOnTranslationUnit /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:584:5
llvm#37 0x0000c894d0eb6ee4 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp:647:3
llvm#38 0x0000c894d18a7a38 clang::ParseAST(clang::Sema&, bool, bool) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Parse/ParseAST.cpp:0:13
llvm#39 0x0000c894ce81ed70 clang::FrontendAction::Execute() /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1231:10
llvm#40 0x0000c894ce6f2144 getPtr /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:278:42
llvm#41 0x0000c894ce6f2144 operator bool /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/include/llvm/Support/Error.h:241:16
llvm#42 0x0000c894ce6f2144 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1058:23
llvm#43 0x0000c894cea718cc operator-> /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/libcxx_install_hwasan/include/c++/v1/__memory/shared_ptr.h:635:12
llvm#44 0x0000c894cea718cc getFrontendOpts /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/include/clang/Frontend/CompilerInstance.h:307:12
llvm#45 0x0000c894cea718cc clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:301:14
llvm#46 0x0000c894c5e9cf28 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/cc1_main.cpp:294:15
llvm#47 0x0000c894c5e92a9c ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:223:12
llvm#48 0x0000c894c5e902ac clang_main(int, char**, llvm::ToolContext const&) /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/tools/driver/driver.cpp:0:12
llvm#49 0x0000c894c5eb2e34 main /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/tools/driver/clang-driver.cpp:17:3
llvm#50 0x0000f6898d3184c4 (/lib/aarch64-linux-gnu/libc.so.6+0x284c4)
llvm#51 0x0000f6898d318598 __libc_start_main (/lib/aarch64-linux-gnu/libc.so.6+0x28598)
llvm#52 0x0000c894c5e52a30 _start (/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang+0x6512a30)
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/tools/clang/test/Analysis/Output/loop-widening.c.script: line 2: 2870204 Aborted                 /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/clang -cc1 -internal-isystem /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/lib/clang/21/include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core,unix.Malloc,debug.ExprInspection -analyzer-max-loop 4 -analyzer-config widen-loops=true -verify -analyzer-config eagerly-assume=false /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/clang/test/Analysis/loop-widening.c
```
thetruestblue pushed a commit that referenced this pull request Apr 25, 2025
…ible (llvm#123752)

This patch adds a new option `-aarch64-enable-zpr-predicate-spills`
(which is disabled by default), this option replaces predicate spills
with vector spills in streaming[-compatible] functions.

For example:

```
str	p8, [sp, llvm#7, mul vl]            // 2-byte Folded Spill
// ...
ldr	p8, [sp, llvm#7, mul vl]            // 2-byte Folded Reload
```

Becomes:

```
mov	z0.b, p8/z, #1
str	z0, [sp]                        // 16-byte Folded Spill
// ...
ldr	z0, [sp]                        // 16-byte Folded Reload
ptrue	p4.b
cmpne	p8.b, p4/z, z0.b, #0
```

This is done to avoid streaming memory hazards between FPR/vector and
predicate spills, which currently occupy the same stack area even when
the `-aarch64-stack-hazard-size` flag is set.

This is implemented with two new pseudos SPILL_PPR_TO_ZPR_SLOT_PSEUDO
and FILL_PPR_FROM_ZPR_SLOT_PSEUDO. The expansion of these pseudos
handles scavenging the required registers (z0 in the above example) and,
in the worst case spilling a register to an emergency stack slot in the
expansion. The condition flags are also preserved around the `cmpne` in
case they are live at the expansion point.
thetruestblue pushed a commit that referenced this pull request Apr 29, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault,
instead of reporting a clean error
```
(base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda
#0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc)
#1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc)
llvm#2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628)
llvm#3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4)
llvm#4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8)
llvm#7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8)
llvm#8 0x000000019ae8c274 
Segmentation fault: 11
```


The underlying issue was that the `DeviceCompilerInstance` (used for
device-side CUDA compilation) was never initialized with a `Sema`, which
is required before constructing the `IncrementalCUDADeviceParser`.


https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32


https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31

Unlike the host-side `CompilerInstance` which runs `ExecuteAction`
inside the Interpreter constructor (thereby setting up Sema), the
device-side CI was passed into the parser uninitialized, leading to an
assertion or crash when accessing its internals.

To fix this, I refactored the `Interpreter::create` method to include an
optional `DeviceCI` parameter. If provided, we know we need to take care
of this instance too. Only then do we construct the
`IncrementalCUDADeviceParser`.
thetruestblue pushed a commit that referenced this pull request Apr 29, 2025
This were failing on Windows CI with errors like:
```
22: (lldb) bt
23: * thread #1, stop reason = breakpoint 1.1
24:  frame #0: 0x00007ff7c5e41000 TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`func at main.m:2
25:  frame #1: 0x00007ff7c5e4101c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`bar + 12 at main.m:3
26:  frame llvm#2: 0x00007ff7c5e4103c TestFrameFormatFunctionFormattedArgumentsObjC.test.tmp.objc.out`main + 16 at main.m:5
27:  custom-frame '()'
     !~~~~~~~~~~~       error: no match expected
28:  custom-frame '(__formal=<unavailable>)'
```
thetruestblue pushed a commit that referenced this pull request Apr 29, 2025
… without debug-info" (llvm#137757)

Reverts llvm#137408

This change broke `lldb/test/Shell/Unwind/split-machine-functions.test`.

The test binary has a symbol named `_Z3foov.cold` and the test expects
the backtrace to print the name of the cold part of the function like
this:

```
# SPLIT: frame #1: {{.*}}`foo() (.cold) +
```
but now it gets

```
frame #1: 0x000055555555514f split-machine-functions.test.tmp`foo() + 12
```
tstellar pushed a commit that referenced this pull request May 10, 2025
`clang-repl --cuda` was previously crashing with a segmentation fault,
instead of reporting a clean error
```
(base) anutosh491@Anutoshs-MacBook-Air bin % ./clang-repl --cuda
#0 0x0000000111da4fbc llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x150fbc)
#1 0x0000000111da31dc llvm::sys::RunSignalHandlers() (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x14f1dc)
llvm#2 0x0000000111da5628 SignalHandler(int) (/opt/local/libexec/llvm-20/lib/libLLVM.dylib+0x151628)
llvm#3 0x000000019b242de4 (/usr/lib/system/libsystem_platform.dylib+0x180482de4)
llvm#4 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#5 0x0000000107f638d0 clang::IncrementalCUDADeviceParser::IncrementalCUDADeviceParser(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, clang::CompilerInstance&, llvm::IntrusiveRefCntPtr<llvm::vfs::InMemoryFileSystem>, llvm::Error&, std::__1::list<clang::PartialTranslationUnit, std::__1::allocator<clang::PartialTranslationUnit>> const&) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x216b8d0)
llvm#6 0x0000000107f6bac8 clang::Interpreter::createWithCUDA(std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>, std::__1::unique_ptr<clang::CompilerInstance, std::__1::default_delete<clang::CompilerInstance>>) (/opt/local/libexec/llvm-20/lib/libclang-cpp.dylib+0x2173ac8)
llvm#7 0x000000010206f8a8 main (/opt/local/libexec/llvm-20/bin/clang-repl+0x1000038a8)
llvm#8 0x000000019ae8c274
Segmentation fault: 11
```

The underlying issue was that the `DeviceCompilerInstance` (used for
device-side CUDA compilation) was never initialized with a `Sema`, which
is required before constructing the `IncrementalCUDADeviceParser`.

https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/DeviceOffload.cpp#L32

https://github.com/llvm/llvm-project/blob/89687e6f383b742a3c6542dc673a84d9f82d02de/clang/lib/Interpreter/IncrementalParser.cpp#L31

Unlike the host-side `CompilerInstance` which runs `ExecuteAction`
inside the Interpreter constructor (thereby setting up Sema), the
device-side CI was passed into the parser uninitialized, leading to an
assertion or crash when accessing its internals.

To fix this, I refactored the `Interpreter::create` method to include an
optional `DeviceCI` parameter. If provided, we know we need to take care
of this instance too. Only then do we construct the
`IncrementalCUDADeviceParser`.

(cherry picked from commit 21fb19f)
tstellar pushed a commit that referenced this pull request May 10, 2025
llvm#138091)

Check this error for more context
(https://github.com/compiler-research/CppInterOp/actions/runs/14749797085/job/41407625681?pr=491#step:10:531)

This fails with
```
* thread #1, name = 'CppInterOpTests', stop reason = signal SIGSEGV: address not mapped to object (fault address: 0x55500356d6d3)
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830
    frame llvm#2: 0x00007fffee20917a libclangCppInterOp.so.21.0gitstd::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() + 58
    frame llvm#3: 0x00007fffee224796 libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 838
    frame llvm#4: 0x00007fffee22494d libclangCppInterOp.so.21.0gitclang::CompilerInstance::~CompilerInstance() + 13
    frame llvm#5: 0x00007fffed95ec62 libclangCppInterOp.so.21.0gitclang::IncrementalCUDADeviceParser::~IncrementalCUDADeviceParser() + 98
    frame llvm#6: 0x00007fffed9551b6 libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 102
    frame llvm#7: 0x00007fffed95598d libclangCppInterOp.so.21.0gitclang::Interpreter::~Interpreter() + 13
    frame llvm#8: 0x00007fffed9181e7 libclangCppInterOp.so.21.0gitcompat::createClangInterpreter(std::vector<char const*, std::allocator<char const*>>&) + 2919
```

Problem :

1) The destructor currently handles no clearance for the DeviceParser
and the DeviceAct. We currently only have this

https://github.com/llvm/llvm-project/blob/976493822443c52a71ed3c67aaca9a555b20c55d/clang/lib/Interpreter/Interpreter.cpp#L416-L419

2) The ownership for DeviceCI currently is present in
IncrementalCudaDeviceParser. But this should be similar to how the
combination for hostCI, hostAction and hostParser are managed by the
Interpreter. As on master the DeviceAct and DeviceParser are managed by
the Interpreter but not DeviceCI. This is problematic because :
IncrementalParser holds a Sema& which points into the DeviceCI. On
master, DeviceCI is destroyed before the base class ~IncrementalParser()
runs, causing Parser::reset() to access a dangling Sema (and as Sema
holds a reference to Preprocessor which owns PragmaNamespace) we see
this
```
  * frame #0: 0x00007fffee41cfe3 libclangCppInterOp.so.21.0gitclang::PragmaNamespace::~PragmaNamespace() + 99
    frame #1: 0x00007fffee435666 libclangCppInterOp.so.21.0gitclang::Preprocessor::~Preprocessor() + 3830

```

(cherry picked from commit 529b6fc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.