Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: Exclude Semmle.* dlls when using the executing runtime. #15993

Merged
merged 10 commits into from
Apr 10, 2024

Conversation

michaelnebel
Copy link
Contributor

@michaelnebel michaelnebel commented Mar 20, 2024

In this PR we

  • Introduce a new container class AssemblyPath for paths to assemblies (or folders containing assemblies) and make it possible to add a filter function for relevant filenames on the path.
  • Introduce logic that exclude the Semmle.* dlls from being included in the dependencies.
  • Add an integration tests that uses the execution runtime.

@github-actions github-actions bot added the C# label Mar 20, 2024
@michaelnebel michaelnebel force-pushed the csharp/assemblycachefiltering branch from 8fa54a0 to 8fb924f Compare April 5, 2024 08:01
@michaelnebel michaelnebel force-pushed the csharp/assemblycachefiltering branch 5 times, most recently from afdb6eb to 5a70f70 Compare April 8, 2024 14:18
@michaelnebel michaelnebel added the no-change-note-required This PR does not need a change note label Apr 8, 2024
@michaelnebel michaelnebel marked this pull request as ready for review April 9, 2024 07:40
@michaelnebel michaelnebel requested a review from a team as a code owner April 9, 2024 07:40
Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some minor comments.

Additionally, we could consider extending the logic in AssemblyPath, which currently covers two cases:

  • file path
  • all files from a directory (recursively)

A third case would be

  • all files from a directory (only top level)

We're already using this latter case in AddFrameworkDlls.

}
else
{
logger.LogInfo("AssemblyCache: Path not found: " + p);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-paste issue: AssemblyCache vs AssemblyPath

@@ -71,7 +71,7 @@ public DependencyManager(string srcDir, ILogger logger)
this.generatedSources = new();
var allProjects = allNonBinaryFiles.SelectFileNamesByExtension(".csproj").ToList();
var allSolutions = allNonBinaryFiles.SelectFileNamesByExtension(".sln").ToList();
var dllPaths = allFiles.SelectFileNamesByExtension(".dll").ToHashSet();
var dllPaths = allFiles.SelectFileNamesByExtension(".dll").Select<string, AssemblyPath>(x => x).ToHashSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an opinion: I find .Select<string, AssemblyPath>(x => x) difficult to read, I think .Select(x => new AssemblyPath(x)) is easier to understand.

{
logger.LogInfo("AssemblyCache: Path not found: " + path);
}
path.Process(dllsToIndex, logger);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Process call could be changed to simply return a list of DLL paths, and then the return value could be used: dllsToIndex.AddRange(path.GetFilePaths(logger)). This way the AssemblyPath is more self contained as we're not pushing "state" into it.

/// Used to represent a path to an assembly or a directory containing assemblies
/// and a selector function to determine which files to include, when indexing the assemblies.
/// </summary>
internal sealed class AssemblyPath(string p, Func<string, bool> includeFileName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe AssemblyLookupLocation is a somewhat better name for this, as it's not necessarily an assembly path that we're storing here.

@michaelnebel
Copy link
Contributor Author

I will extend the scope a bit more on this PR.

@michaelnebel michaelnebel force-pushed the csharp/assemblycachefiltering branch from 5a70f70 to ff498f6 Compare April 10, 2024 11:21
@michaelnebel
Copy link
Contributor Author

michaelnebel commented Apr 10, 2024

@tamasvajk : I Discovered a difference in the file enumeration (which can be contributed to your tireless effort of making proper integration test coverage - otherwise it would definitely have been missed): It turns out that when using enumeration options to enumerate files with the EnumerateFiles, then hidden files and folders are excluded by default (contrary to EnumerateFiles(pattern, SearchOption.AllDirectories)). This means that folders like .NETFramework are excluded as everything starting with . is perceived as hidden. This has been fixed at least for the dll indexing. This means that the dependency manager in general might also exclude hidden files and folders. Should we consider to change that?

@michaelnebel
Copy link
Contributor Author

With this change we might be able to collapse dllLocations and frameworkLocations but it will require some work as the asset detection needs to properly identify framework dependencies.

@michaelnebel michaelnebel changed the title C#: Introduce AssemblyPath and re-factor AssemblyCache to use this in… C#: Exclude Semmle.* dlls when using the executing runtime. Apr 10, 2024
@michaelnebel michaelnebel requested a review from tamasvajk April 10, 2024 11:46
@tamasvajk
Copy link
Contributor

This means that the dependency manager in general might also exclude hidden files and folders. Should we consider to change that?

Hmm, interesting finding. For the time being, I wouldn't change this. My fear is that it might have significant impact on the file operations that we do. For example, reading all the files in the .git folder might be slow.

Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I've added some minor comments.

/// Used to represent a path to an assembly or a directory containing assemblies
/// and a selector function to determine which files to include, when indexing the assemblies.
/// </summary>
internal sealed class AssemblyLookupLocation(string p, Func<string, bool> includeFileName, bool indexSubdirectories = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe use path instead of p.

/// and adds them to the a list of assembly names to index.
/// Indexing is performed at a later stage. This only collects the names.
/// </summary>
/// <param name="dir">The directory to index.</param>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameters have diverged from the comment.

@@ -0,0 +1,5 @@
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is probably not needed for this test.

Comment on lines +50 to +53
catch (Exception e)
{
logger.LogError($"AssemblyLookupLocation: Error while searching for DLLs in '{path}': {e.Message}");
}

Check notice

Code scanning / CodeQL

Generic catch clause Note

Generic catch clause.
Copy link
Contributor

@tamasvajk tamasvajk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@michaelnebel michaelnebel merged commit b79d738 into github:main Apr 10, 2024
16 checks passed
@michaelnebel michaelnebel deleted the csharp/assemblycachefiltering branch April 10, 2024 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C# no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants