
Native AOT ARM32 lanes failing after docker update #113609


Closed
MichalStrehovsky opened this issue Mar 17, 2025 · 5 comments · Fixed by #113637
Labels
area-NativeAOT-coreclr in-pr There is an active PR which will close this issue when it is merged

Comments

@MichalStrehovsky
Member

ARM32 outerloop is pretty broken after #113248: https://dev.azure.com/dnceng-public/public/_build?definitionId=265&_a=summar

I've submitted a revert at #113598 to confirm this is caused by the docker toolset update.

I pulled down a dump with runfo. The failure is a SIGBUS (bus error) with code BUS_ADRALN (invalid address alignment) at 0x441a933, with stacks like:

0:000> k
 # Child-SP RetAddr      Call Site
00 (Inline) --------     HardwareIntrinsics_General_r!VarInt::ReadUnsigned+0x12 [/__w/1/s/src/coreclr/nativeaot/Runtime/inc/varint.h @ 11] 
01 fff526e0 0237d51b     HardwareIntrinsics_General_r!UnixNativeCodeManager::EHEnumNext+0x82 [/__w/1/s/src/coreclr/nativeaot/Runtime/./../../vm/gcinfodecoder.cpp @ 1460] 
02 fff52700 0275db37     HardwareIntrinsics_General_r!RhpEHEnumNext+0x13
03 fff52708 0275d7e1     HardwareIntrinsics_General_r!S_P_CoreLib_System_Runtime_EH::FindFirstPassHandler+0x76 [/_/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 1024] 
04 fff527c8 0275d5d1     HardwareIntrinsics_General_r!S_P_CoreLib_System_Runtime_EH::DispatchEx+0x150 [/_/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 807] 
05 fff527e0 023c30a3     HardwareIntrinsics_General_r!S_P_CoreLib_System_Runtime_EH::RhThrowHwEx+0xf0 [/_/src/coreclr/nativeaot/Runtime.Base/src/System/Runtime/ExceptionHandling.cs @ 640] 
06 fff52988 fff52c44     HardwareIntrinsics_General_r!RhpThrowHwEx2+0x1 [/__w/1/s/src/coreclr/nativeaot/Runtime/arm/ExceptionHandling.S @ 72] 
07 fff52b20 00000000     0xfff52c44

It's not clear to me why the stack is broken, but an unaligned access sounds plausible: this data is definitely not 4-byte aligned. We do not align the EH info blob, and most of the blob consists of variable-length integers, which makes the initial blob alignment meaningless anyway.
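To illustrate why aligning the start of such a blob does not help, here is a minimal sketch using a generic LEB128-style encoding. This is an assumption for illustration only, not the runtime's actual native format, and `WriteVarUInt`, `ReadVarUInt`, and `ValueOffsets` are hypothetical names. It shows that once values take 1, 2, or 3 bytes each, later values start at arbitrary byte offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical LEB128-style encoder: 7 payload bits per byte, with the high
// bit set on every byte except the last. NOT the runtime's actual format.
void WriteVarUInt(std::vector<uint8_t>& out, uint32_t value)
{
    while (value >= 0x80)
    {
        out.push_back(static_cast<uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
}

// Decodes one value strictly byte by byte and advances the cursor past it.
// Byte loads have no alignment requirement.
uint32_t ReadVarUInt(const uint8_t*& cursor)
{
    uint32_t result = 0;
    for (int shift = 0; shift < 35; shift += 7)
    {
        uint8_t b = *cursor++;
        result |= static_cast<uint32_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            break;
    }
    return result;
}

// Encodes the values back to back and returns the byte offset at which each
// encoded value starts within the blob.
std::vector<size_t> ValueOffsets(const std::vector<uint32_t>& values)
{
    std::vector<uint8_t> blob;
    for (uint32_t v : values)
        WriteVarUInt(blob, v);

    std::vector<size_t> offsets;
    const uint8_t* cursor = blob.data();
    for (size_t i = 0; i < values.size(); ++i)
    {
        offsets.push_back(static_cast<size_t>(cursor - blob.data()));
        ReadVarUInt(cursor);
    }
    return offsets;
}
```

Under this assumed encoding, the values 5, 300, 70000, and 1 start at offsets 0, 1, 3, and 6, so any decoder that performs a 4-byte load at the cursor will regularly touch addresses that are not 4-byte aligned.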

It is not clear to me why this is a problem now, whether this is the only spot where we have a problem, or how it could be caused by a toolset update.

Cc @dotnet/ilc-contrib

Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@jkotas
Member

jkotas commented Mar 17, 2025

> It is not clear to me why this is a problem now

It is not unusual to get breaks like this on major compiler version updates that come with new optimizations and uncover latent bugs.

cc @richlander FYI: More clang 20 breaks.

jkotas added a commit to jkotas/runtime that referenced this issue Mar 18, 2025
The fix replaces the native implementation of the native format decoder with a clone of the managed implementation.

Fixes dotnet#113609
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Mar 18, 2025
@jkotas
Member

jkotas commented Mar 18, 2025

> new optimizations

clang 20 fuses two sequential int32 reads into a single ldrd r4,r5,[r0,#-4] instruction, which requires the address to be aligned. Earlier versions produced two individual ldr instructions, which do not require the address to be aligned, but that is not something one can depend on in portable code.
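For reference, a common portable way to read a 32-bit value from a possibly unaligned address is to go through `memcpy` rather than dereferencing a cast pointer. The sketch below is only an illustration of that general technique and is not the change made for this issue; `ReadUnaligned32` and `ReadPairUnaligned` are hypothetical helper names.

```cpp
#include <cstdint>
#include <cstring>

// Reads a 32-bit value from an address with no alignment guarantee.
// memcpy carries no alignment promise, so the compiler has to emit loads
// that are valid for any address on the target.
inline uint32_t ReadUnaligned32(const uint8_t* p)
{
    uint32_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}

struct Pair32
{
    uint32_t first;
    uint32_t second;
};

// Reads two adjacent 32-bit values. Because each read goes through memcpy,
// the compiler may only merge them into a wider load if that load is still
// valid for an unaligned address.
inline Pair32 ReadPairUnaligned(const uint8_t* p)
{
    return Pair32{ ReadUnaligned32(p), ReadUnaligned32(p + 4) };
}

// Non-portable pattern, shown for contrast only. The cast asserts 4-byte
// alignment, so the optimizer is free to pick alignment-sensitive
// instructions for these loads:
//   uint32_t a = *reinterpret_cast<const uint32_t*>(p);
//   uint32_t b = *reinterpret_cast<const uint32_t*>(p + 4);
```

With optimizations enabled, compilers typically lower the fixed-size `memcpy` to ordinary load instructions where the target allows it, so this pattern usually costs nothing on platforms that tolerate unaligned loads.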

@richlander
Member

I'm guessing that we (me) can already start the new PR (revert again) since the PR was merged today. Right?

@jkotas
Member

jkotas commented Mar 18, 2025

I have a pending PR with the fix for the known Arm32 NAOT issue: #113637

It makes sense to try again once that fix is merged.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 17, 2025
3 participants