-
Notifications
You must be signed in to change notification settings - Fork 314
Different Relocation Handling for Function Pointers between LLVM and GCC in ARM32 #1441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I have located the root cause of this issue. kernel's apply_relocate function processes relocations in the following sequence:
After switching to LLVM compiler, there's a significant change in how function pointer relocations are handled: LLVM's approach:
GCC's approach:
Current kpatch tool's processing (problematic):
The issue is that kpatch is not properly handling the compiler-specific differences: LLVM stores the offset in the section content (4) with sym.value = 0 |
I have adapted the kpatch code in the kpatch_replace_sections_syms function within create-diff-object.c. The solution is to skip symbol replacement when detecting function pointer relocations. This effectively resolves the issue. Here's the detailed patch:
The key change is adding a check to skip symbol replacement for relocations in .data sections, which handles the LLVM-specific function pointer relocation case while maintaining compatibility with GNU-style relocations. |
Hi @PuzzleZz thanks for the detailed report! Disclaimer: upstream kpatch doesn't currently support ARM32 and llvm is currently a best-effort. But the reported behavior is nonetheless interesting to learn. Do you know if llvm/ARM64 is similarly affected? Can you share which kpatch fork and kernel changes that you are running with? And finally, is the presence of the code (provided in the first comment) in a kpatch sufficient enough to hit this issue? Thanks. |
ARM64 Has Similar Relocation Changes with LLVM After switching to LLVM on ARM64, we observe similar relocation behavior changes:
While the kpatch functionality currently works because the section content is 0 in both cases (resulting in the same final address calculation), I believe we should maintain LLVM's relocation format to prevent potential future issues.
I've only adapted the kpatch tools for LLVM compilation. For reference, we're using: Finally.Here's how to reproduce the issue using a kernel module hot patch:
Patch file:
To reproduce: 1.The patch registers a hook function that triggers lkp_module_notifier when kernel module changes occur |
Without arm32 (or 64) support in this project, I can't really reproduce it and when I look at non-kpatch module builds on x86/arm64 using both gcc/llvm, I see the same section-based relocations for pointers to local functions, and direct symbol relocations for global functions. This was using gcc (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) and clang version 19.1.7 (Red Hat, Inc. 19.1.7-2.el10) on both arches. Perhaps other versions generate things differently? (Also, to clarify the report, can you identify which object file is dumped? I understand the report as the compilers generating two different kinds of relocations for the patched .o file and then kpatch-build's rela mangling screws up the section-based one in the output.o file.) Finally which fork of kpatch-build are you using that has this support ( Perhaps they have encountered this? @puranjaymohan , @liu-song-6 : have you seen this behavior in your kpatch for ARM64 experiences? |
Let me clarify this issue more thoroughly. Summary of the Problem: My proposed patch solution identifies these scenarios and skips forcing them into the GCC format. Here's a simple kernel module to demonstrate the GCC vs LLVM differences:
GCC output:
LLVM output:
As shown in these results, there's a clear difference in how GCC and LLVM handle function pointer relocations. For LLVM-compiled scenarios, it's best to adapt the kpatch_replace_sections_syms function to handle these cases by not performing the replacement. You can try to reproduce this issue using the tools maintained in this repository: |
Thanks for hanging with me @PuzzleZz :) here are some things that I think I've learned today:
Using your kpatch_hook.c example, I built the example module for Arm, Arm64, and x86_64 using both llvm and gcc. The only combination that generated a symbol based relocation
That said, if I change
I've looked at a few architecture specific relocation types: Arm, PowerPC, s390x, and x86_64 and it's interesting that only Arm defines the addend as (italics mine):
and sure enough, you highlight that in arch/arm/kernel/module.c :: apply_relocate()'s handling of case R_ARM_ABS32:
case R_ARM_TARGET1:
*(u32 *)loc += sym->st_value;
break; (Some more reading of the System V Application Binary Interface notes that "Entries of type Elf32_Rel and Elf64_Rel store an implicit addend in the location to be modified." So it's really just a SHT_REL vs. SHT_RELA thing across arches and only the Arm spec reminds us of that.) I could be wrong, but it seems like all other arches (including Arm64) are generating SHT_RELA for function pointers, so the section contents has never mattered. So even though Arm64 creates R_AARCH64_ABS64, it seems like the symbol value / addend / section content are still consistent. I don't see the same
From my readelf output pasted above, in all of my cases both the symbol value and section content are zero. I'm not sure why or how to force a different value in either. Perhaps your kernel configuration would help me reproduce that?
The proposed patch short circuits Instead of skipping them, would it make sense to fix the mangling, so that it accounts to for these type of relocations that depend on the section contents? |
@PuzzleZz FYI, I had detected a problem under ARM64 of kpatch. On kernel-6.6 aarch64, kpatch-build will have ld issue that '__patchable_function_entries' have both ordered and unordered issues, which will make ld process die. This problem I am still working on it. I dont know if you had met this problem before, or something related with this issue. As the email I had sent you before. @joe-lawrence |
@wardenjohn : I have not encountered that linker problem in my relatively small Arm experiences nor PowerPC 64 (for which RHEL-10 config will use __patchable_function_entries as well). I think you will have better luck over on the #1439 thread (are you testing with this branch?) than here, as #1441 is strictly about function pointer relocation types. |
@joe-lawrence Sorry I've been a bit busy lately and just saw your message. I suspect that the lack of offset in your results is because you haven't enabled forward CFI. We will enable this feature and insert 4 bytes of checksum information at the beginning of each function. So, I suspect this offset is introduced here. |
@joe-lawrence I think you can try gcc version: |
@joe-lawrence |
@PuzzleZz : no worries, investigating this is rather time intensive on my part as I have to setup cross compilers and pick through a completely new architecture (to me). I haven't made much progress from my last comment, other than enabling CONFIG_CFI_CLANG for an llvm Arm(32) build of a similar function-pointer test module:
which I think finally reproduces the original report, right (for pointers to locally scoped functions)? Anyway, I'm hesitant to change this singular function to handle what I believe is a SHT_REL vs. SHT_RELA issue when I think the rest of the code base assumes SHT_RELA as it's only been coded for 64-bit arches (which don't seem to use SHT_REL in most kernel objects). I'll keep poking at it, but in the meantime, someone like @jpoimboe would be more knowledgeable about the relocations and whether mangling / normalizing function pointer relocations really matter for "correlation and linking to vmlinux" as the code comments imply. |
This issue has been open for 30 days with no activity and no assignee. It will be closed in 7 days unless a comment is added. |
This issue was closed because it was inactive for 7 days after being marked stale. |
After switching to LLVM, we noticed different behavior in handling relocations, specifically for function pointer relocations. Here's a detailed example:
LLVM's relocation handling for lkp_module_notifier:
GCC's relocation handling for lkp_module_notifier:
However, our kpatch tool processes relocations following the GCC format:
LLVM uses section symbol (.text.lkp_module_notifier) for relocation
GCC uses direct function symbol (lkp_module_notifier)
Our kpatch tool expects GCC-style relocations
The text was updated successfully, but these errors were encountered: