Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve SystemTests execution time #253

Merged
merged 11 commits into from
Oct 14, 2024
Merged

Improve SystemTests execution time #253

merged 11 commits into from
Oct 14, 2024

Conversation

obligaron
Copy link
Contributor

🤔 What's changed?

This PR improves the execution time of Reqnroll.SystemTests.

This is mainly done by two changes.

1. Use global nuget cache by default

Currently the execution of SystemTests uses a custom local nuget cache.
This results in ~600MB being copied on each SystemTests run (once for all tests).
This is necessary because we create a new nuget package with the same version after each build.
But the global nuget cache doesn't allow to remove a specific nuget package.

Based on https://www.nyckel.com/blog/nuget-packages a custom build target has been introduced to remove the Reqnroll nuget packages from the nuget cache. This allows us to still use the default nuget cache and avoid copying ~600MB.

This results in a speedup from ~1.4 minutes to ~1.2 minutes on my machine.

The old behavior can be enabled by a setting (PerRunNuGetPackages config switch).

2. Allow method-level parallelization

This change allows multiple tests in a class to run in parallel.

This results in a speedup from ~1.2 minutes to ~51 seconds on my machine.

⚡️ What's your motivation?

When contributing

🏷️ What kind of change is this?

  • 🏦 Refactoring/debt/DX (improvement to code design, tooling, etc. without changing behaviour)

♻️ Anything particular you want feedback on?

I'm not sure if this change should be included in the CHANGELOG.md since it only affects Reqnroll contributors.

Also, during testing, I had some sporadic occurrence of #249,

📋 Checklist:

  • Users should know about my change
    • I have added an entry to the "[vNext]" section of the CHANGELOG, linking to this pull request & included my GitHub handle to the release contributors list.

@obligaron obligaron changed the title Inprove SystemTests execution time Improve SystemTests execution time Sep 8, 2024
@gasparnagy
Copy link
Contributor

Oh... Thx for taking care of this...

Based on https://www.nyckel.com/blog/nuget-packages a custom build target has been introduced to remove the Reqnroll nuget packages from the nuget cache. This allows us to still use the default nuget cache and avoid copying ~600MB.

I have also tried that earlier, but dropped because if you have other Visual Studio instances open that use Reqnroll (e.g. exploratory test projects) then the folder deletion is sometimes not possible ("access denied"), because Visual Studio somehow puts a lock on the dll-s from the package. This is pretty random, but happens time to time. If we would simply swallow this error, it would cause some unpredictability but if we let it fail, it will cause build failures (even if you don't run the tests). So my decision was to undo this. 😢 (The speed difference is not that significant.)

(The method level parallelization is fine.)

Maybe what we could do is a mixed approach: we keep a persistent duplicated nuget cache only for the system tests and we only delete our packages from that one. That cache is obviously not used by other VS instances so there will be no lock. What do you think?

@obligaron
Copy link
Contributor Author

When does the lock happen on your machine?
I also sometimes have these locks when I open a project that is generated by the SystemTests (to debug something). But when I close these Visual Studio instances, the lock is gone. Are these the same locks you encountered?

I ask this way because the lock only happens when I use the SystemTests, so a separate NuGet cache folder for all SystemTests would probably not help.

If these are the same locks, I currently see three options

  1. Tolerate the behavior (merge this PR) and advise a workflow such as open a testproject (.sln) and other finish with the debug sessions close the Visual Studio instance.
  2. Support different Reqnroll NuGet packages with the same version. This would probably require a separate nuget cache folder for each SystemTests run. We could optimize this by copying everything we expect from Reqnroll from the global nuget cache folder or from an old SystemTests run. At least then we wouldn't need a stable network connection and could run the SystemTests offline (after the first run).
  3. Try to create local NuGet packages with a timestamp or something like that (e.g. instead of -local we would have -local.20240921115131). This way we logically have different NuGet packages after each run and can use the global nuget cache folder. We would probably have to try to delete the local NuGet cache from the local ReqnRoll NuGet packages to free up disk space. But if that fails, we can ignore it.

@gasparnagy
Copy link
Contributor

@obligaron No. In my case I see the error if I have a normal Reqnroll sample project open in Visual Studio, not the System Tests one. In my case it particularly locks the Reqnroll.MsTest, Reqnroll.NUnit or the Reqnroll.xUnit packages (I mean the dll-s inside).

If it would only happen for debugging the projects generated by system tests, I think that would be acceptable (I usually open these projects in VSCode anyway), but it affects all others. This is why I thought that just having one duplicate for all system tests would be beneficial.

@obligaron
Copy link
Contributor Author

Interesting.

On my machine it also happens that Visual Studio locks the Reqnroll.MsTest folder.
For example I use a old Reqnroll 1.0.0 Version in Visual Studio.
I can't rename or delete \.nuget\packages\reqnroll.mstest\1.0.0.
But I can rename or delete \.nuget\packages\reqnroll.mstest\2.1.1-local.

Perhaps previous the deletion wasn't version specific and it tried to delete \.nuget\packages\reqnroll.mstest. This would fail. 🤔

Can reproduce it with this PR? If yes, I can look into using a persistent duplicated nuget package.

@gasparnagy
Copy link
Contributor

@obligaron I will try to reproduce it and give feedback.

@gasparnagy
Copy link
Contributor

I can reproduce it with your version... 😥

5>------ Rebuild All started: Project: Reqnroll.Tools.MsBuild.Generation, Configuration: Debug Any CPU ------
5>Reqnroll.Tools.MsBuild.Generation -> W:\Reqnroll\Reqnroll\Reqnroll.Tools.MsBuild.Generation\bin\Debug\net462\Reqnroll.Tools.MsBuild.Generation.dll
5>Reqnroll.Tools.MsBuild.Generation -> W:\Reqnroll\Reqnroll\Reqnroll.Tools.MsBuild.Generation\bin\Debug\netstandard2.0\Reqnroll.Tools.MsBuild.Generation.dll
5>Successfully created package 'W:\Reqnroll\Reqnroll\Reqnroll.Tools.MsBuild.Generation\bin\Debug\Reqnroll.Tools.MsBuild.Generation.2.1.1-local.nupkg'.
5>W:\Reqnroll\Reqnroll\Directory.Build.targets(38,5): error MSB3231: Unable to remove directory "C:\Users\gaspar\.nuget\packages\/reqnroll.tools.msbuild.generation/2.1.1-local". Access to the path 'Reqnroll.dll' is denied.
5>Done building project "Reqnroll.Tools.MsBuild.Generation.csproj" -- FAILED.
========== Rebuild All: 4 succeeded, 1 failed, 0 skipped ==========
========== Rebuild completed at 17:08 and took 08.523 seconds ==========

What I did:

  • Open the CleanReqnrollProject.sln from the repo https://github.com/reqnroll/Reqnroll.ExploratoryTestProjects
  • This repo uses the 2.1.1-local version of the packages (as it is used to exploratory test the current local version)
  • Opened feature files, run & debug tests a few times, close & reopen solution
  • Rebuild Reqnroll Reqnroll.Tools.MsBuild.Generation project
  • These steps will cause the access denied error approx 1/3 of the cases (I usually have to do it 2-3 times to get the error)

Of course this is a special situation, because normally no one is using the -local version of the packages, but it is painful still.

To be honest, the current status is also not ideal, because currently I have to manually delete the packages from the packages folder if I want to re-test the stuff with my exploratory tests (and sometimes close VS because of the locks). But in many cases I only modify the Reqnroll runtime, so I only need to delete the Reqnroll package, so I am not affected with the locked Reqnroll.Tools.MsBuild.Generation package...

Let me test the System Test execution as well -> I put it to another comment.

@gasparnagy
Copy link
Contributor

With "normal" runs, the System Tests execution is fine.

I use NCrunch that runs the tests isolated in the background. Normally when I build, it realizes that the build output has changed and restarts those tests. With your change this process is different, because I get a build error, so I cannot build the project. I have to first stop the test execution, build and restart the execution. But I can configure NCrunch to set the REQNROLL_TEST_PerRunNuGetPackages environment variable to true for the execution and then it works as before (with the penalty of the cloned packages folder, of course).

@gasparnagy
Copy link
Contributor

Thinking about this a bit more... Maybe all-in-all your optimization has more advantages than disadvantages.

Maybe we could make some optimizations:

  • One of the problem is that when it tries to delete the packages and fails, the package remains in a half-deleted state and it causes very strange errors if you don't realize what is wrong. Could we maybe do the deletion in a way, that we first just rename the folder (e.g. 2.1.1-local-remove-XYZ where XYZ is just some random text) and only delete if the rename succeeds. Rename is (hopefully) more "atomic". (Or if we want to skip the XYZ part, we would need to first delete 2.1.1-local-remove if exists, then rename the current to 2.1.1-local-remove and then delete the 2.1.1-local-remove.)
  • Could we maybe introduce another environment variable 😎 and if that is set, the target that suppose to rename/delete the folder would just cause a warning and not an error? Because then I (probably the only person who uses the exploratory test projects that extensively) could set that variable locally and I would not be affected.

What do you think?

@gasparnagy
Copy link
Contributor

Another note (independent of all the others): Two tests from the Reqnroll.TestProjectGenerator.Tests.ProjectTests currently fails (CreateEmptyCSharpCore3_1ProjectInNewFormat and CreateEmptyCSharpNet50ProjectInNewFormat) with '--no-update-check' is not a valid option. As we don't support these frameworks anyway, feel free to delete these two tests.

@obligaron
Copy link
Contributor Author

Of course this is a special situation, because normally no one is using the -local version of the packages, but it is painful still.

Sorry for the misunderstanding. I thought you meant that other versions of reqnroll can be accessed (not the newly build version).

To be honest, the current status is also not ideal, because currently I have to manually delete the packages from the packages folder if I want to re-test the stuff with my exploratory tests (and sometimes close VS because of the locks).

With the new behavior in the good case the manual deletion could be avoided. This would also ensure that the nuget cache and the nuget packages stay in sync.

One of the problem is that when it tries to delete the packages and fails, the package remains in a half-deleted state and it causes very strange errors if you don't realize what is wrong. Could we maybe do the deletion in a way, that we first just rename the folder (e.g. 2.1.1-local-remove-XYZ where XYZ is just some random text) and only delete if the rename succeeds. Rename is (hopefully) more "atomic". (Or if we want to skip the XYZ part, we would need to first delete 2.1.1-local-remove if exists, then rename the current to 2.1.1-local-remove and then delete the 2.1.1-local-remove.)

Sadly at least windows don't allow to rename the folder when a file is locked. 😢

Could we maybe introduce another environment variable 😎 and if that is set, the target that suppose to rename/delete the folder would just cause a warning and not an error? Because then I (probably the only person who uses the exploratory test projects that extensively) could set that variable locally and I would not be affected.

I introduced REQNROLL_ENSURE_NUGET_CACHE_CLEARED. If the variable is set to false MsBuild will only print a warning. 🙂

Another note (independent of all the others): Two tests from the Reqnroll.TestProjectGenerator.Tests.ProjectTests currently fails (CreateEmptyCSharpCore3_1ProjectInNewFormat and CreateEmptyCSharpNet50ProjectInNewFormat) with '--no-update-check' is not a valid option. As we don't support these frameworks anyway, feel free to delete these two tests.

Thanks. I removed them. 👍

@obligaron
Copy link
Contributor Author

Btw: If we revive #64 (which was reversed because of #152), we could avoid the locks in visual studio.
Maybe we could make #64 conditional for windows only?

@gasparnagy
Copy link
Contributor

Thx for the additions.

On reviving #64... this might be a good idea. Let's consider this separately.

Sadly at least windows don't allow to rename the folder when a file is locked. 😢

Yes, that is the point. Maybe I was not clear enough. When it is locked you cannot rename, so it is fully untouched. But with the current behavior, it is deleting half of the files and it stops at deleting the .dll, so what "remains" is a half-existing package in the cache - dotnet restore will not "re-download" it, but it fails because some files are missing.

@obligaron
Copy link
Contributor Author

If we want to ensure that the deletion works, we would probably need something like you suggested ((check old renamed folder exists and delete if present) -> rename -> delete).
But this would require a more complex solution (running custom c# code in msbuild), I'm not sure it's worth the investment because every time the nuget packages are deleted but the cache is not cleared we also have some incorrect/invalid state.
If we are going to go this route would I look at something like UsingTask with RoslynCodeTaskFactory or is there another approach?

@gasparnagy
Copy link
Contributor

gasparnagy commented Oct 4, 2024

@obligaron Why do you think it would need a more complex solution?

What if we would just do a

  • <RemoveDir Directories="<ourdir>-remove" Condition="Exists('<ourdir>-remove')" />
  • <Move SourceFiles="<ourdir>" DestinationFiles="<ourdir>-remove" ContinueOnError="'$(REQNROLL_ENSURE_NUGET_CACHE_CLEARED)' == 'false'" /> (no ContinueOnError, unless env variable is set!)
  • <RemoveDir Directories="<ourdir>-remove" ContinueOnError="true" />

Or do I miss something?

@obligaron
Copy link
Contributor Author

I missed that I could check that the rename directory exists in Condition. 😉

So I tried to implement this, but I failed because the Move Task doesn't support moving/renaming directories.
Also MsBuild Property functions doesn't support Directory.Move. 😢

So I tired to implement it with UsingTask and RoslynCodeTaskFactory. This looks to work (at least on my machine 😁). I added it as last commit.

@gasparnagy
Copy link
Contributor

@obligaron oh cool. i give it a try (probably only on Friday) and give feedback.

* main:
  Bump version
  VS Code integration  proper doc title (#280)
  Fix: Reqnroll generates invalid code for rule backgrounds in Visual Basic (#284)
  Update nunit.md (#276)
  SolutionTests: Check if SDK version is installed and if not ignore the test (#266)
  Fix for #271 This PR modifies the Feature AST visitor to appropriately handle Rule Background steps. (#272)
  Bugfix/fix reqnroll.verify parallelization (#255)
@gasparnagy
Copy link
Contributor

@obligaron I played with this and found funny things... It seems that in some situations Windows is able to rename the folder even with a lock on it, but then could not delete the renamed files... So I needed to add a bit more try/catch to the solution to somehow handle these. But altogether it works.

I also played with re-introducing the TaskFactory="TaskHostFactory" and it seems that that would indeed solve my exploratory testing issue, but I keep that as a separate PR as that could affect end users as well.

@gasparnagy gasparnagy merged commit 94b77d8 into main Oct 14, 2024
4 checks passed
@gasparnagy gasparnagy deleted the systemtestsspeed branch October 14, 2024 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants