Agent failing because of missing Microsoft.WindowsDesktop.App folder #450

Open
kevingosse opened this issue May 24, 2022 · 3 comments

Comments

@kevingosse
Contributor

For months, our Crank agent has been plagued by an issue where all jobs suddenly start failing with the message:

dotnet-install could not install a component: 

After restarting the crank agent, the error mysteriously disappears until the next time.

I finally took some time to dig into it, and I found the root exception:

0:000> !pe 12ae1416148
Exception object: 0000012ae1416148
Exception type:   System.IO.DirectoryNotFoundException
Message:          Could not find a part of the path 'C:\Windows\TEMP\benchmarks-agent\benchmarks-server-4776\bnwqqose.b5j\shared\Microsoft.WindowsDesktop.App'.
InnerException:   <none>
StackTrace (generated):
    SP               IP               Function
    0000000E6D8FD450 00007FF7F5FCC6EC System_Private_CoreLib!System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib]].CreateDirectoryHandle(System.String, Boolean)+0xbc
    0000000E6D8FD4B0 00007FF7F5FCC534 System_Private_CoreLib!System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib]].Init()+0x24
    0000000E6D8FD510 00007FF7F608BF1D System_Private_CoreLib!System.IO.Enumeration.FileSystemEnumerableFactory.UserDirectories(System.String, System.String, System.IO.EnumerationOptions)+0x12d
    0000000E6D8FD560 00007FF7F605770E System_Private_CoreLib!System.IO.Directory.InternalEnumeratePaths(System.String, System.String, System.IO.SearchTarget, System.IO.EnumerationOptions)+0x8e
    0000000E6D8FD5A0 00007FF7F608BD47 System_Private_CoreLib!System.IO.Directory.GetDirectories(System.String)+0x47
    0000000E6D8FD5E0 00007FF7F5AB62DD crank_agent!Microsoft.Crank.Agent.Startup.SeekCompatibleDesktopRuntime(System.String, System.String, System.String)+0x5d
    0000000E6D8FD650 00007FF7F5A68BE1 crank_agent!Microsoft.Crank.Agent.Startup+<CloneRestoreAndBuild>d__78.MoveNext()+0x2061

My understanding is that Crank tries to install the .NET runtime (see the check if (!beforeDesktop.Contains(targetFramework))) and that the install sometimes fails (I assume because of a transient network error). When that happens, the version is added to an ignore list, then SeekCompatibleDesktopRuntime is used as a fallback:

                        // Record that we don't need to try to download this version next time
                        _ignoredDesktopRuntimes.Add(desktopVersion);

                        // if the specified SDK can't be installed

                        // Seeking already installed Desktop runtimes
                        // c.f. https://github.com/dotnet/sdk/issues/4237

                        desktopVersion = SeekCompatibleDesktopRuntime(dotnetHome, targetFramework, desktopVersion);

And SeekCompatibleDesktopRuntime fails because of the aforementioned error (Could not find a part of the path 'C:\Windows\TEMP\benchmarks-agent\benchmarks-server-4776\bnwqqose.b5j\shared\Microsoft.WindowsDesktop.App'.).

Past this point, all jobs will fail because the target version has been added to _ignoredDesktopRuntimes so the agent will always use the bogus fallback.

I'm not sure what the best way to fix it is, but I have a few recommendations:

  • The job error should mention the exception, for easier debugging:
    job.Error = $"dotnet-install could not install a component: {dotnetInstallStep}";
  • The version of the framework shouldn't be added to _ignoredDesktopRuntimes if the cause of the failure is a network error (assuming that can be detected)
  • SeekCompatibleDesktopRuntime should check if the Microsoft.WindowsDesktop.App directory exists
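
To make the last two recommendations concrete, here is a minimal sketch. It is not the agent's actual code: the class DesktopRuntimeFallback and the methods ListInstalledDesktopRuntimes and RecordInstallFailure are hypothetical names, and using an exception type to signal a network failure is an assumption (the agent may only see an exit code from dotnet-install).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;

// Hypothetical helper, for illustration only; not part of Microsoft.Crank.Agent.
public static class DesktopRuntimeFallback
{
    // Stand-in for the agent's _ignoredDesktopRuntimes list.
    private static readonly HashSet<string> IgnoredDesktopRuntimes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Returns the version folder names found under
    // <dotnetHome>\shared\Microsoft.WindowsDesktop.App, or an empty array
    // when that folder does not exist.
    public static string[] ListInstalledDesktopRuntimes(string dotnetHome)
    {
        var desktopRoot = Path.Combine(dotnetHome, "shared", "Microsoft.WindowsDesktop.App");

        // Directory.GetDirectories throws DirectoryNotFoundException on a
        // missing path (the exception in the dump above), so check first.
        if (!Directory.Exists(desktopRoot))
        {
            return Array.Empty<string>();
        }

        return Directory.GetDirectories(desktopRoot)
            .Select(path => new DirectoryInfo(path).Name)
            .ToArray();
    }

    // Adds the version to the ignore list only when the failure does not
    // look transient. Returns true when the version was newly recorded.
    // Assumption: a network problem surfaces as one of these exception types.
    public static bool RecordInstallFailure(string desktopVersion, Exception failure)
    {
        var looksTransient = failure is HttpRequestException || failure is TimeoutException;
        if (looksTransient)
        {
            return false; // leave the version eligible for a retry on the next job
        }

        return IgnoredDesktopRuntimes.Add(desktopVersion);
    }
}
```

With a guard like this, a missing Microsoft.WindowsDesktop.App folder would give the caller an empty list to handle instead of an unhandled exception, and a single failed download would not permanently force the fallback path.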

But I'm still failing to understand why this directory doesn't exist to begin with. Maybe it has something to do with dotnet/sdk#4237?

@sebastienros
Member

This Windows Desktop part of the sdk is a pain. But I should be able to make it more resilient and not break subsequent runs if it fails. Also need to check if it's already doing some retries.

Are you using floating versions of the SDKs (7.0 previews)?
Do you know that there is an argument on the command line to re-use a folder containing pre-installed runtimes? At least if you use the same versions it won't try to download it again when the agent is restarted.

@kevingosse
Contributor Author

> Are you using floating versions of the SDKs (7.0 previews)?

We're targeting the latest 5.0: <TargetFramework>net5.0</TargetFramework>

> Do you know that there is an argument on the command line to re-use a folder containing pre-installed runtimes? At least if you use the same versions it won't try to download it again when the agent is restarted.

I didn't know that. It wouldn't fix the underlying issue, but it would be a massive improvement. I assume you're referring to --dotnethome?

@sebastienros
Member

Yes, dotnethome, and if you are using net5.0 then it will prevent most of the downloads on startup.
NB: time to upgrade, but you know ...
