Distributed transactions: AccessViolationException on arm64 #74170

Closed
BruceForstall opened this issue Aug 18, 2022 · 20 comments · Fixed by #74226 or #74747
Assignees
Labels
arch-arm64, area-System.Transactions, disabled-test (the test is disabled in source code against the issue), os-windows
Milestone

Comments

@BruceForstall
Contributor

BruceForstall commented Aug 18, 2022

Happens a lot -- see last 30 days:

  • 8/25 - 81x failures on net7.0-windows-Release-arm64-CoreCLR_release-Windows.10.Arm64.Open
    • First failure on 8/12 in PR 1939026
    • Impact: 6x failures per day - blocking CI

win-arm64 libraries random JitStress failure

D:\h\w\AFDD09E1\w\B09109AB\e>set COMPlus 
COMPlus_JitStress=36f
COMPlus_TieredCompilation=0

D:\h\w\AFDD09E1\w\B09109AB\e>call RunTests.cmd --runtime-path D:\h\w\AFDD09E1\p 
----- start Thu 08/18/2022  0:42:25.05 ===============  To repro directly: ===================================================== 
pushd D:\h\w\AFDD09E1\w\B09109AB\e\
"D:\h\w\AFDD09E1\p\dotnet.exe" exec --runtimeconfig System.Transactions.Local.Tests.runtimeconfig.json --depsfile System.Transactions.Local.Tests.deps.json xunit.console.dll System.Transactions.Local.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing 
popd
===========================================================================================================

D:\h\w\AFDD09E1\w\B09109AB\e>"D:\h\w\AFDD09E1\p\dotnet.exe" exec --runtimeconfig System.Transactions.Local.Tests.runtimeconfig.json --depsfile System.Transactions.Local.Tests.deps.json xunit.console.dll System.Transactions.Local.Tests.dll -xml testResults.xml -nologo -nocolor -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing  
  Discovering: System.Transactions.Local.Tests (method display = ClassAndMethod, method display options = None)
  Discovered:  System.Transactions.Local.Tests (found 131 of 142 test cases)
  Starting:    System.Transactions.Local.Tests (parallel test collections = on, max threads = 8)
Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Repeat 2 times:
--------------------------------
   at System.Transactions.DtcProxyShim.DtcInterfaces.IResourceManagerFactory2.CreateEx(System.Guid, System.String, System.Transactions.DtcProxyShim.DtcInterfaces.IResourceManagerSink, System.Guid, System.Object ByRef)
--------------------------------
   at System.Transactions.DtcProxyShim.DtcProxyShimFactory+<>c__DisplayClass10_0.<ConnectToProxy>b__2()
   at System.Transactions.DtcProxyShim.OletxHelper.Retry(System.Action)
   at System.Transactions.DtcProxyShim.DtcProxyShimFactory.ConnectToProxy(System.String, System.Guid, System.Object, Boolean ByRef, Byte[] ByRef, System.Transactions.DtcProxyShim.ResourceManagerShim ByRef)
   at System.Transactions.Oletx.DtcTransactionManager.Initialize()
   at System.Transactions.Oletx.DtcTransactionManager.get_ProxyShimFactory()
   at System.Transactions.Oletx.OletxTransactionManager.CreateTransaction(System.Transactions.TransactionOptions)
   at System.Transactions.TransactionStatePromoted.EnterState(System.Transactions.InternalTransaction)
   at System.Transactions.EnlistableStates.EnlistDurable(System.Transactions.InternalTransaction, System.Guid, System.Transactions.IEnlistmentNotification, System.Transactions.EnlistmentOptions, System.Transactions.Transaction)
   at System.Transactions.Transaction.EnlistDurable(System.Guid, System.Transactions.IEnlistmentNotification, System.Transactions.EnlistmentOptions)
   at System.Transactions.Tests.OleTxTests+OleTxFixture..ctor()
   at System.RuntimeMethodHandle.InvokeMethod(System.Object, Void**, System.Signature, Boolean)
   at System.Reflection.ConstructorInvoker.Invoke(System.Object, IntPtr*, System.Reflection.BindingFlags)
   at System.Reflection.RuntimeConstructorInfo.Invoke(System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo)
   at Xunit.Sdk.XunitTestClassRunner+<>c__DisplayClass10_0.<CreateClassFixture>b__3()
   at Xunit.Sdk.ExceptionAggregator.Run(System.Action)
   at Xunit.Sdk.XunitTestClassRunner+<CreateClassFixtureAsync>d__11.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.XunitTestClassRunner+<CreateClassFixtureAsync>d__11, xunit.execution.dotnet, Version=2.4.2.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<CreateClassFixtureAsync>d__11 ByRef)
   at Xunit.Sdk.XunitTestClassRunner.CreateClassFixtureAsync(System.Type)
   at Xunit.Sdk.XunitTestClassRunner+<AfterTestClassStartingAsync>d__13.MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.XunitTestClassRunner+<AfterTestClassStartingAsync>d__13, xunit.execution.dotnet, Version=2.4.2.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<AfterTestClassStartingAsync>d__13 ByRef)
   at Xunit.Sdk.XunitTestClassRunner.AfterTestClassStartingAsync()
   at Xunit.Sdk.TestClassRunner`1+<RunAsync>d__37[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestClassRunner`1+<RunAsync>d__37[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.2.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__37<System.__Canon> ByRef)
   at Xunit.Sdk.TestClassRunner`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].RunAsync()
   at Xunit.Sdk.TestCollectionRunner`1+<RunTestClassesAsync>d__28[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestCollectionRunner`1+<RunTestClassesAsync>d__28[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.2.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunTestClassesAsync>d__28<System.__Canon> ByRef)
   at Xunit.Sdk.TestCollectionRunner`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].RunTestClassesAsync()
   at Xunit.Sdk.TestCollectionRunner`1+<RunAsync>d__27[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Xunit.Sdk.TestCollectionRunner`1+<RunAsync>d__27[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], xunit.execution.dotnet, Version=2.4.2.0, Culture=neutral, PublicKeyToken=8d05b1bb7a6fdb6c]](<RunAsync>d__27<System.__Canon> ByRef)
   at Xunit.Sdk.TestCollectionRunner`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].RunAsync()
   at System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)
   at System.Threading.Tasks.Task.ExecuteEntry()
   at System.Threading.Tasks.SynchronizationContextTaskScheduler+<>c.<.cctor>b__8_0(System.Object)
   at Xunit.Sdk.MaxConcurrencySyncContext.RunOnSyncContext(System.Threading.SendOrPostCallback, System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at Xunit.Sdk.MaxConcurrencySyncContext.WorkerThreadProc()
   at Xunit.Sdk.XunitWorkerThread+<>c.<QueueUserWorkItem>b__5_0(System.Object)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef, System.Threading.Thread)

@dotnet/jit-contrib

@BruceForstall BruceForstall added the arch-arm64, os-windows, JitStress (CLR JIT issues involving JIT internal stress modes), and area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) labels Aug 18, 2022
@BruceForstall BruceForstall added this to the 7.0.0 milestone Aug 18, 2022
@ghost

ghost commented Aug 18, 2022

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@kunalspathak
Member

Could this be related to recent changes in this area? I am able to repro this even without the jitstress flags.

This was recently ported to Windows in #72051 and seems related to that.

Here is the call stack on windows/arm64

00 MSDTCPRX!CIResourceManagerFactory::CreateEx
01 xunit_console!ILStubClass.IL_STUB_CLRtoCOM(System.Guid, System.String, System.Transactions.DtcProxyShim.DtcInterfaces.IResourceManagerSink, System.Guid, System.Object ByRef)
02 System_Transactions_Local!System.Transactions.DtcProxyShim.DtcProxyShimFactory+<>c__DisplayClass10_0.<ConnectToProxy>b__2()
03 System_Transactions_Local!System.Transactions.DtcProxyShim.OletxHelper.Retry(System.Action)
04 System_Transactions_Local!System.Transactions.DtcProxyShim.DtcProxyShimFactory.ConnectToProxy(System.String, System.Guid, System.Object, Boolean ByRef, Byte[] ByRef, System.Transactions.DtcProxyShim.ResourceManagerShim ByRef)
05 System_Transactions_Local!System.Transactions.Oletx.DtcTransactionManager.Initialize()
06 System_Transactions_Local!System.Transactions.Oletx.DtcTransactionManager.get_ProxyShimFactory()
07 System_Transactions_Local!System.Transactions.Oletx.OletxTransactionManager.CreateTransaction(System.Transactions.TransactionOptions)
08 System_Transactions_Local!System.Transactions.TransactionStatePromoted.EnterState(System.Transactions.InternalTransaction)
09 System_Transactions_Local!System.Transactions.EnlistableStates.EnlistDurable(System.Transactions.InternalTransaction, System.Guid, System.Transactions.IEnlistmentNotification, System.Transactions.EnlistmentOptions, System.Transactions.Transaction)
0a System_Transactions_Local!System.Transactions.Transaction.EnlistDurable(System.Guid, System.Transactions.IEnlistmentNotification, System.Transactions.EnlistmentOptions)
0b System_Transactions_Local_Tests!System.Transactions.Tests.OleTxTests+OleTxFixture..ctor()
0c coreclr!CallDescrWorkerInternal

[Two images from the debugging session; not preserved]

@kunalspathak kunalspathak added the area-System.Transactions label and removed the JitStress (CLR JIT issues involving JIT internal stress modes) and area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) labels Aug 18, 2022
@roji
Member

roji commented Aug 18, 2022

Will look at this ASAP

@roji roji self-assigned this Aug 18, 2022
@roji
Member

roji commented Aug 18, 2022

/cc @AaronRobinsonMSFT

@kunalspathak
Member

I would be curious to know why we don't see these failures in other CI pipelines.

@AaronRobinsonMSFT
Member

@roji I may have missed the by-ref for the GUIDs in this case. In fact, now that I think about it, that is the issue. I'd update the interfaces that take a `GUID*` or a `REFGUID` to be defined as `in Guid` or `ref Guid`, depending on the desired semantics.
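
To illustrate the suggestion above, here is a minimal sketch of a COM interface declaration that passes GUIDs by reference. `IExampleFactory`, its placeholder interface ID, the parameter names, and the `InteropSketch` helper are hypothetical and exist only for illustration; the parameter list mirrors the CreateEx signature visible in the stack trace, but this is not the actual System.Transactions source.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical interface for illustration only. The GUID below is a
// placeholder, not the interface ID of any real MSDTC interface.
[ComImport]
[Guid("11111111-1111-1111-1111-111111111111")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IExampleFactory
{
    // A native GUID*, REFGUID, or REFIID parameter is a pointer, so the
    // managed side declares it as `in Guid` (or `ref Guid`) rather than
    // as a by-value Guid.
    void CreateEx(
        in Guid resourceManagerId,
        [MarshalAs(UnmanagedType.LPStr)] string name,
        [MarshalAs(UnmanagedType.Interface)] object sink,
        in Guid requestedInterfaceId,
        [MarshalAs(UnmanagedType.Interface)] out object resourceManager);
}

public static class InteropSketch
{
    // Returns true when the named parameter of the named method is
    // declared by reference (in, ref, or out).
    public static bool IsPassedByRef(Type interfaceType, string methodName, string parameterName)
    {
        MethodInfo method = interfaceType.GetMethod(methodName)
            ?? throw new ArgumentException("No such method", nameof(methodName));

        foreach (ParameterInfo parameter in method.GetParameters())
        {
            if (parameter.Name == parameterName)
            {
                return parameter.ParameterType.IsByRef;
            }
        }

        throw new ArgumentException("No such parameter", nameof(parameterName));
    }
}
```

A reflection check such as `InteropSketch.IsPassedByRef(typeof(IExampleFactory), "CreateEx", "resourceManagerId")` can serve as a cheap audit of the managed declarations. One plausible reason a by-value `Guid` could go unnoticed on x64 is that the Windows x64 calling convention already passes a 16-byte struct through a pointer to a copy, whereas arm64 passes it in registers; whether that explains this particular crash is not established in this thread.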

@roji
Member

roji commented Aug 18, 2022

@AaronRobinsonMSFT thanks, I'll prepare a PR for that tomorrow. Though the debugging session above seems to show that ppvResMgr is the source of the crash, which seems very odd (it's just an out parameter on the .NET side). And yeah, like @kunalspathak I'm wondering why this hasn't shown up before in CI...

@ghost ghost added the in-pr label (there is an active PR which will close this issue when it is merged) Aug 19, 2022
@steveisok steveisok added the blocking-clean-ci label (blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms') Aug 24, 2022
@karelz
Member

karelz commented Aug 25, 2022

@roji this is heavily impacting CI -- 6x failures per day. Can you please merge the fix PR #74226, or at least disable the tests ASAP? (incl. 7.0 branch) Thanks!

@roji roji changed the title from "Test failure: System.Transactions.Local.Tests" to "Test failure: System.Transactions.Local.Tests on arm64" Aug 25, 2022
@roji
Member

roji commented Aug 25, 2022

I can merge #74226, but @kunalspathak indicated above that the failure still occurs. Looking at the error again, this occurs here, in one of the first interop calls made; ConnectToProxy is the first thing that ever does interop in System.Transactions. The stack trace in #74170 seems to indicate that the last parameter, ppvResMgr, points to an invalid address; since this is a plain .NET out variable, it seems this could only happen if there's some issue with the signature of the CreateEx method (signature in this PR, docs), or something more low-level. @AaronRobinsonMSFT note that there's interface inheritance going on here (IResourceManagerFactory.Create and IResourceManagerFactory2.CreateEx), in case that's related.

With this happening only on arm64, and even there not deterministically, I'm out of my depth here... I've pushed a change here to disable the tests on ARM to unblock the build, and will merge. Hopefully we can resolve the issue soon and reenable.
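
On the interface inheritance point raised above: with built-in .NET COM interop, a managed interface that derives from another COM interface has to redeclare the base members so that the vtable slots line up. The sketch below shows that pattern with hypothetical interfaces (`IExampleBase`, `IExampleDerived`), placeholder GUIDs, and simplified `int` parameters; it does not claim to show how the System.Transactions declarations are written, nor that inheritance was involved in this crash.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical base interface; the GUID is a placeholder.
[ComImport]
[Guid("22222222-2222-2222-2222-222222222222")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IExampleBase
{
    void Create(int value);
}

// Hypothetical derived interface; the GUID is a placeholder.
[ComImport]
[Guid("33333333-3333-3333-3333-333333333333")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IExampleDerived : IExampleBase
{
    // Redeclared with `new` so that Create occupies the first slot after
    // the IUnknown methods, matching the native vtable layout.
    new void Create(int value);

    // The member added by the derived interface follows the base members.
    void CreateEx(int value, int flags);
}

public static class VtableSketch
{
    // Lists the methods declared directly on the interface, which are the
    // ones the runtime maps to vtable slots for that interface.
    public static string[] DeclaredMethodNames(Type interfaceType) =>
        interfaceType
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Select(method => method.Name)
            .ToArray();
}
```

If the redeclaration were missing, `CreateEx` would be treated as the first slot after IUnknown, and a call to it would land on the native `Create` entry instead.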

github-actions bot pushed a commit that referenced this issue Aug 25, 2022
And disable tests on ARM for now.

Works around #74170
github-actions bot pushed a commit that referenced this issue Aug 29, 2022
NonMsdtcPromoterTests.PSPENonMsdtcGetPromoterTypeMSDTC was triggering
an MSDTC distributed transaction on Windows, but without the proper
checks/resiliency. Moved to OleTxTests.

Fixes #74170
roji added a commit that referenced this issue Aug 29, 2022
NonMsdtcPromoterTests.PSPENonMsdtcGetPromoterTypeMSDTC was triggering
an MSDTC distributed transaction on Windows, but without the proper
checks/resiliency. Moved to OleTxTests.

Fixes #74170
@roji roji reopened this Aug 29, 2022
carlossanlop pushed a commit that referenced this issue Aug 30, 2022
NonMsdtcPromoterTests.PSPENonMsdtcGetPromoterTypeMSDTC was triggering
an MSDTC distributed transaction on Windows, but without the proper
checks/resiliency. Moved to OleTxTests.

Fixes #74170

Co-authored-by: Shay Rojansky <[email protected]>
@roji
Member

roji commented Sep 6, 2022

@kunalspathak @AaronRobinsonMSFT how can we make progress on this arm64-specific issue? @kunalspathak neither @AaronRobinsonMSFT nor I have access to an arm64 machine - are you able to help us iterate on this as per #74170 (comment)? I'm worried not so much about arm64 distributed transaction support, but more that there's some deeper bug in interop there, since the interop signatures seem to be fine - so we may want to prioritize this as a general arm64 bug. Is there someone we should loop in on this?

If we think we can't manage to investigate this in time, I'll go ahead and add a runtime check against arm64, so that users get a clean PlatformNotSupportedException right away rather than a non-deterministic AccessViolationException.
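
A runtime check of the kind proposed above could look roughly like the following. This is a minimal sketch, not the code that was merged: `DistributedTransactionGuard`, its method names, and the exception message are hypothetical, while `RuntimeInformation.ProcessArchitecture` and `PlatformNotSupportedException` are standard .NET APIs.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only.
public static class DistributedTransactionGuard
{
    // Pure predicate so the decision can be checked for any architecture.
    public static bool IsSupported(Architecture architecture) =>
        architecture != Architecture.Arm64;

    // Fails fast with a clear exception instead of risking a
    // non-deterministic AccessViolationException later in native code.
    public static void ThrowIfUnsupported()
    {
        if (!IsSupported(RuntimeInformation.ProcessArchitecture))
        {
            throw new PlatformNotSupportedException(
                "Distributed transactions are not supported on arm64.");
        }
    }
}
```

Calling `DistributedTransactionGuard.ThrowIfUnsupported()` before the first interop call keeps the failure deterministic and catchable by the caller.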

@danmoseley
Member

I'm marking blocking-release so this stays on the radar.

jkoritzinsky added a commit to jkoritzinsky/runtime that referenced this issue Sep 7, 2022
According to [the docs](https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms681318(v=vs.85)) and the Windows SDK headers, the Guid parameters here are all passed by-ref.

Update the definition of the interface to pass the `Guid` parameters with `in` to match the native signature.

Fixes dotnet#74170
@roji roji changed the title from "Test failure: System.Transactions.Local.Tests on arm64" to "Distributed transactions: AccessViolationException on arm64" Sep 8, 2022
roji added a commit to roji/runtime that referenced this issue Sep 9, 2022
And fix GUID interop in distributed transactions

See dotnet#74170
@roji
Member

roji commented Sep 9, 2022

See update in #74570 (comment) - I propose to block ARM for rc2 and am updating PRs #74226 and #74570 to do so. We can continue investigating in parallel, of course.

roji added a commit that referenced this issue Sep 9, 2022
And fix GUID interop in distributed transactions

See #74170

(cherry picked from commit fdfef13)
roji added a commit that referenced this issue Sep 11, 2022
And fix GUID interop in distributed transactions

See #74170
@ghost ghost removed the in-pr label (there is an active PR which will close this issue when it is merged) Sep 11, 2022
@karelz
Member

karelz commented Sep 12, 2022

@roji should we re-enable the tests and remove links to this issue? https://github.com/dotnet/runtime/search?q=74170
Or should the linked issue be something that is opened and tracking arm64 support, or will it never be supported?

@ghost ghost added the in-pr label (there is an active PR which will close this issue when it is merged) Sep 12, 2022
@roji roji reopened this Sep 12, 2022
@roji
Member

roji commented Sep 12, 2022

@karelz no, unfortunately the issue still persists... My proposal was to block distributed transactions on arm64 because of this, but we may want to allow it instead (see #74570 (comment)). In any case, the intention is to get to the bottom of this post-rc2, possibly post-7.0.

carlossanlop pushed a commit that referenced this issue Sep 13, 2022
And fix GUID interop in distributed transactions

See #74170

(cherry picked from commit fdfef13)

Co-authored-by: Shay Rojansky <[email protected]>
@ghost ghost removed the in-pr label (there is an active PR which will close this issue when it is merged) Sep 13, 2022
@roji
Member

roji commented Sep 22, 2022

Due to a previous mis-communication, it was believed that we're still seeing AccessViolationException even after some interop fixes (see #74226). However, we're no longer able to reproduce the AccessViolationException. arm64 support has been re-enabled in #75703, including all testing.

Closing this, will reopen if we see AccessViolationException again.

@roji roji closed this as not planned (won't fix, can't repro, duplicate, stale) Sep 22, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Oct 22, 2022
7 participants