-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(Analyzer): Add a flag to skip excluded projects / scopes in dependency graph #6405
feat(Analyzer): Add a flag to skip excluded projects / scopes in dependency graph #6405
Conversation
I'd like to just point out #5968. It would be great if this work could fully address that issue. |
@fviernau, @sschuberth: This is currently a bit experimental. What I am after is indeed not a full solution for #5968, but only a first step: We have currently problems with an Android project that frequently crashes our ORT pipeline due to resource exhaustion. In the cases the build runs through, the Analyzer step takes more than 2 hours. The time seems to be consumed mainly when constructing the dependency graph. I did some tests in which I filtered out test scopes when adding the dependencies to the graph builder - this made the build stable and reduced the time of the Analyzer step to 8 minutes! So, I think, this intermediate step to skip excluded scopes when building the dependency graph is already beneficial - and the implementation should not be too complex. WDYT? |
Feels a bit like cheating 😉 More seriously, I'd really like to understand what exactly is taking so long to construct the dependency graph. Also, I wouldn't expect test scopes to have much more dependencies than "regular" scopes, so why does excluding them make such a big difference? |
1c9ff9a
to
c023700
Compare
Fair points. What we have noticed is that, starting from a certain size, adding further dependencies to the graph becomes rather slow. This is probably due to the fact that the subtrees that need to be compared become larger and larger. For Android projects with their numerous synthetic scopes this seems to be a real issue; so reducing the overall size of the graph by excluding unneeded scopes can actually have a significant effect. Regarding "cheating": I think, handling scope excludes when constructing the dependency graph needs to be done anyway when implementing full support for exclusions as requested by #5968. So, these changes are not a side-track, but already a partial solution to this goal. I have pushed some more commits, so hopefully it becomes a bit clearer now what I try to achieve. |
a5279a2
to
1d4db04
Compare
c4eeb22
to
bb2c739
Compare
I have extended the PR to cover the functionality requested by #5968. Still testing whether there is something missing. |
af9c053
to
c675c27
Compare
c675c27
to
8ad39d4
Compare
analyzer/src/funTest/assets/projects/external/git-repo-expected-output.yml
Outdated
Show resolved
Hide resolved
analyzer/src/funTest/assets/projects/synthetic/gradle-expected-output-scopes-excludes.yml
Show resolved
Hide resolved
8ad39d4
to
5c1520e
Compare
*/ | ||
fun findManagedFiles( | ||
directory: File, | ||
packageManagers: Collection<PackageManagerFactory> = ALL.values | ||
packageManagers: Collection<PackageManagerFactory> = ALL.values, | ||
excludes: Excludes = EMPTY_EXCLUDES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PackageManager
has an instance of RepositoryConfiguration
containing all excludes.
So, in case --skip-excluded
is passed, the excludes are passed redundantly. Would it make sense to pass a Boolean
/ flag instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
findManagedFiles()
is a function of the companion object. So the RepositoryConfiguration
is not available here.
cb0f0c7
to
a49e6d8
Compare
It is not clear to me why the commit-lint check fails. Could anybody help me understand what is wrong? |
For some reason it seems to believe |
The comma was not sufficient, obviously it has problems with the '#' character. I now used the reference style. |
f12d66f
to
89b28b7
Compare
analyzer/src/funTest/assets/projects/external/git-repo-expected-output.yml
Outdated
Show resolved
Hide resolved
This flag determines whether the excludes defined in the repository configuration should lead to paths being skipped completely during analysis. This means, if set to true, only definition files in non-excluded paths are processed, and dependencies in non-excluded scopes are added to the dependency graph. Signed-off-by: Oliver Heger <[email protected]>
This is analogous to other model classes. The constant can be used where an `Excludes` object is required, but no elements need to be excluded. Signed-off-by: Oliver Heger <[email protected]>
This property contains the excludes to be applied during analysis, which depend on both the analyzer configuration and the configuration of the current repository. Having this property available centrally simplifies the implementation of concrete PackageManagers. Signed-off-by: Oliver Heger <[email protected]>
Before adding a dependency to the dependency graph builder, make sure that the scope is not excluded if the 'skipExcluded' flag is set. Signed-off-by: Oliver Heger <[email protected]>
Before adding a dependency to the dependency graph builder, make sure that the scope is not excluded if the 'skipExcluded' flag is set. Signed-off-by: Oliver Heger <[email protected]>
Before adding a dependency to the dependency graph builder, make sure that the scope is not excluded if the 'skipExcluded' flag is set. Signed-off-by: Oliver Heger <[email protected]>
Before adding a dependency to the dependency graph builder, make sure that the scope is not excluded if the 'skipExcluded' flag is set. Signed-off-by: Oliver Heger <[email protected]>
The convert() function now accepts an Excludes object. The result of the conversion contains only dependencies not defined by an excluded scope. This functionality is going to be used to implement scope exclusions centrally for package managers that do not support the dependency graph format natively, see [1]. [1] oss-review-toolkit#3825 Signed-off-by: Oliver Heger <[email protected]>
Pass the configured `Excludes` to the dependency graph converter when constructing the analyzer result. That way, scope excludes are effective for all package manager implementations that do not support the dependency graph format natively. Signed-off-by: Oliver Heger <[email protected]>
Signed-off-by: Oliver Heger <[email protected]>
This allows skipping projects in specific paths already before starting the analysis. That way problematic / large projects that are irrelevant compliance-wise can be excluded leading to reduced resource usage and analysis time. Signed-off-by: Oliver Heger <[email protected]>
If the analyzer is configured to skip excludes, it passes the Excludes defined in the repository configuration to PackageManager.findManagedFiles(). That way the analyzer result will contain only projects that are not matched by a path exclude. Resolves oss-review-toolkit#5968. Signed-off-by: Oliver Heger <[email protected]>
89b28b7
to
414be04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I'd prefer to have at least one other approval from @oss-review-toolkit/kotlin-devs.
@@ -338,55 +338,55 @@ class Yarn2( | |||
mapping += it.value | |||
} | |||
} | |||
graphBuilder.addPackages(allPackages.values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've removed this, but you aren't changing anything below aside from applying the filter.
Was this never required, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. This was never required, since the calls to addDependency()
automatically add the packages to the graph builder. This would only be needed if packages were constructed in an alternative way, which is not the case here.
/** | ||
* A flag to control whether excluded scopes and paths should be skipped during the analysis. | ||
*/ | ||
val skipExcluded: Boolean = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document this new option. I guess the best place to do this would be this file:
https://github.com/oss-review-toolkit/ort/blob/main/docs/config-file-ort-yml.md#excludes
In there, you'd also need to adjust this sentence, as this changes with this PR:
Note that the excluded parts are analyzed and scanned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this is on my agenda, but I want to get this merged first to be sure that there are no more changes in naming or otherwise. I will then open a follow-up PR.
This flag indicates that dependencies in excluded scopes should not be added to the dependency graph. For large projects, this could reduce the size of the dependency graph and the analysis time significantly.