Skip to content

[X86] Fix RegAlloc issue by implementing TRI::getCustomEHPadPreservedMask #135518

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

s-barannikov
Copy link
Contributor

@s-barannikov s-barannikov commented Apr 13, 2025

This method needs to be implemented so that the register allocator can
correctly split live ranges of loop invariant virtual registers.
This default return value is correct for DWARF unwinder, which preserves
all registers (except exception pointer/selector), but not for SjLj,
which only preserves sp/fp and possibly bp (they should be reserved).

In the added test rcx on the entry to the loop contains the argument
to foo. It survives the call due to preserve_allcc, but not the
unwind edge entering LBB0_3, and so must be recomputed before going to
the next iteration.
Similarly, rdx holds the address of a jump table and is recomputed on
every iteration rather than once before the loop as was before regalloc.

This method needs to be implemented so that the register allocator can
correctly split live ranges of loop invariant virtual registers.
This default return value is correct for DWARF unwinder, which preserves
all registers (except exception pointer/selector), but not for SjLj,
which only preserves sp/fp and possibly bp (they should be reserved).

In the added test `rcx` on the entry to the loop contains the argument
to `foo`. It survives the call due to `preserve_allcc`, but not the
unwind edge entering LBB0_3, and so must be recomputed before going to
the next iteration.
Similarly, `rdx` holds the address of a jump table and is recomputed on
every iteration rather than once before the loop.
@llvmbot
Copy link
Member

llvmbot commented Apr 13, 2025

@llvm/pr-subscribers-backend-x86

Author: Sergei Barannikov (s-barannikov)

Changes

This method needs to be implemented so that the register allocator can
correctly split live ranges of loop invariant virtual registers.
This default return value is correct for DWARF unwinder, which preserves
all registers (except exception pointer/selector), but not for SjLj,
which only preserves sp/fp and possibly bp (they should be reserved).

In the added test rcx on the entry to the loop contains the argument
to foo. It survives the call due to preserve_allcc, but not the
unwind edge entering LBB0_3, and so must be recomputed before going to
the next iteration.
Similarly, rdx holds the address of a jump table and is recomputed on
every iteration rather than once before the loop.


Full diff: https://github.com/llvm/llvm-project/pull/135518.diff

3 Files Affected:

  • (modified) llvm/lib/Target/X86/X86RegisterInfo.cpp (+7)
  • (modified) llvm/lib/Target/X86/X86RegisterInfo.h (+2)
  • (added) llvm/test/CodeGen/X86/sjlj-unwind-clobber.ll (+76)
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index 1c4114f8cc9c6..68c227cb987de 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -525,6 +525,13 @@ X86RegisterInfo::getNoPreservedMask() const {
   return CSR_NoRegs_RegMask;
 }
 
+const uint32_t *
+X86RegisterInfo::getCustomEHPadPreservedMask(const MachineFunction &MF) const {
+  if (MF.getTarget().Options.ExceptionModel == ExceptionHandling::SjLj)
+    return getNoPreservedMask();
+  return TargetRegisterInfo::getCustomEHPadPreservedMask(MF);
+}
+
 const uint32_t *X86RegisterInfo::getDarwinTLSCallPreservedMask() const {
   return CSR_64_TLS_Darwin_RegMask;
 }
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.h b/llvm/lib/Target/X86/X86RegisterInfo.h
index 5b6ac3c5da019..d9491cd225827 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.h
+++ b/llvm/lib/Target/X86/X86RegisterInfo.h
@@ -102,6 +102,8 @@ class X86RegisterInfo final : public X86GenRegisterInfo {
   const uint32_t *getCallPreservedMask(const MachineFunction &MF,
                                        CallingConv::ID) const override;
   const uint32_t *getNoPreservedMask() const override;
+  const uint32_t *
+  getCustomEHPadPreservedMask(const MachineFunction &MF) const override;
 
   // Calls involved in thread-local variable lookup save more registers than
   // normal calls, so they need a different mask to represent this.
diff --git a/llvm/test/CodeGen/X86/sjlj-unwind-clobber.ll b/llvm/test/CodeGen/X86/sjlj-unwind-clobber.ll
new file mode 100644
index 0000000000000..c1d718a9f7031
--- /dev/null
+++ b/llvm/test/CodeGen/X86/sjlj-unwind-clobber.ll
@@ -0,0 +1,76 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=x86_64 -exception-model=sjlj -o - %s | FileCheck %s
+
+declare dso_local i32 @__gxx_personality_sj0(...)
+declare dso_local preserve_allcc void @foo(ptr)
+
+define void @test() personality ptr @__gxx_personality_sj0 {
+; CHECK-LABEL: test:
+; CHECK:       # %bb.0: # %entry
+; CHECK-NEXT:    pushq %rbp
+; CHECK-NEXT:    movq %rsp, %rbp
+; CHECK-NEXT:    pushq %r15
+; CHECK-NEXT:    pushq %r14
+; CHECK-NEXT:    pushq %r13
+; CHECK-NEXT:    pushq %r12
+; CHECK-NEXT:    pushq %rbx
+; CHECK-NEXT:    subq $104, %rsp
+; CHECK-NEXT:    movq $__gxx_personality_sj0, -104(%rbp)
+; CHECK-NEXT:    movq $GCC_except_table0, -96(%rbp)
+; CHECK-NEXT:    movq %rbp, -88(%rbp)
+; CHECK-NEXT:    movq %rsp, -72(%rbp)
+; CHECK-NEXT:    movq $.LBB0_3, -80(%rbp)
+; CHECK-NEXT:    leaq -136(%rbp), %rdi
+; CHECK-NEXT:    callq _Unwind_SjLj_Register@PLT
+; CHECK-NEXT:    leaq -44(%rbp), %rcx
+; CHECK-NEXT:  .LBB0_1: # %while.cond
+; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
+; CHECK-NEXT:    movl $1, -128(%rbp)
+; CHECK-NEXT:  .Ltmp0:
+; CHECK-NEXT:    movq %rcx, %rdi
+; CHECK-NEXT:    callq foo
+; CHECK-NEXT:  .Ltmp1:
+; CHECK-NEXT:    jmp .LBB0_2
+; CHECK-NEXT:  .LBB0_3: # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:    leaq -44(%rbp), %rcx
+; CHECK-NEXT:    movl -128(%rbp), %eax
+; CHECK-NEXT:    cmpl $1, %eax
+; CHECK-NEXT:    jae .LBB0_5
+; CHECK-NEXT:  # %bb.4: # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:    leaq .LJTI0_0(%rip), %rdx
+; CHECK-NEXT:    jmpq *(%rdx,%rax,8)
+; CHECK-NEXT:  .LBB0_6: # %lpad
+; CHECK-NEXT:    # in Loop: Header=BB0_1 Depth=1
+; CHECK-NEXT:  .Ltmp2:
+; CHECK-NEXT:    movl -124(%rbp), %eax
+; CHECK-NEXT:    movl -120(%rbp), %eax
+; CHECK-NEXT:    jmp .LBB0_1
+; CHECK-NEXT:  .LBB0_2: # %while.end
+; CHECK-NEXT:    leaq -136(%rbp), %rdi
+; CHECK-NEXT:    callq _Unwind_SjLj_Unregister@PLT
+; CHECK-NEXT:    addq $104, %rsp
+; CHECK-NEXT:    popq %rbx
+; CHECK-NEXT:    popq %r12
+; CHECK-NEXT:    popq %r13
+; CHECK-NEXT:    popq %r14
+; CHECK-NEXT:    popq %r15
+; CHECK-NEXT:    popq %rbp
+; CHECK-NEXT:    retq
+; CHECK-NEXT:  .LBB0_5:
+; CHECK-NEXT:    ud2
+entry:
+  %ptr = alloca i32, align 4
+  br label %while.cond
+
+while.cond:
+  invoke preserve_allcc void @foo(ptr %ptr)
+          to label %while.end unwind label %lpad
+
+lpad:
+  %lp = landingpad { ptr, i32 }
+          catch ptr null
+  br label %while.cond
+
+while.end:
+  ret void
+}

@s-barannikov
Copy link
Contributor Author

s-barannikov commented Apr 13, 2025

The first commit contains a test. I can craft a mir test instead if necessary, but it is going to be much less readable.

The test is a stripped down version of this C++ source compiled to LLVM IR with -O1 -fsjlj-exceptions

[[clang::preserve_all]]
void foo(int *ptr);

void test() {
    int x;
    while (true) {
        try {
            foo(&x);
        } catch (...) {
            continue;
        }
        break;
    }
}

@@ -525,6 +525,13 @@ X86RegisterInfo::getNoPreservedMask() const {
return CSR_NoRegs_RegMask;
}

const uint32_t *
X86RegisterInfo::getCustomEHPadPreservedMask(const MachineFunction &MF) const {
if (MF.getTarget().Options.ExceptionModel == ExceptionHandling::SjLj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not familar with SjLj. Is the following the same rule on both Linux and Windows?

Copy link
Contributor Author

@s-barannikov s-barannikov Apr 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! I never thought about supporting SJLJ on Windows, but I think it should be the same there.
Unlike DWARF unwinder, SJLJ implementation does not actually unwind the stack (does not step through frames), it just does __builtin_longjmp to the dispatch block. The builtin only restores program counter and stack pointer(s), as that is all that is saved at the beginning of each function that can potentially throw an exception. (Saving all registers at the beginning of each function would be too expensive.)

Also, there is a related workaround that adds the same "no-preserved" mask to an arbitrary instruction in the dispatch block. (It should be removed and any remaining issues fixed in the generic code, but I'm hesitant to do this as test SJLJ test coverage is pretty poor.)


Apparently, the link doesn't work, look for

  // Add a register mask with no preserved registers.  This results in all
  // registers being marked as clobbered.

in X86ISelLowering.cpp

Copy link
Contributor

@phoebewang phoebewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give +1 here since I don't see any blocker issue. It's better if someone who is familiar with SjLj takes another look.

@s-barannikov
Copy link
Contributor Author

Thanks!

@s-barannikov
Copy link
Contributor Author

Maybe someone from @llvm/pr-subscribers-backend-arm can help with the review? IIUC SJLJ-style exception handling is the default on ARM-Darwin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants