Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only create distinct changesets #2019

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

blast-hardcheese
Copy link
Contributor

Attempting to resolve #1995

Problem

Currently, when multiple artifacts share the same version number pinned by the same val binding, we end up seeing multiple PRs with the exact same changeset

Examples

Strategy in this PR

Attempting to deduplicate before we got to applyUpdate wasn't a viable strategy, due to the various different heuristics that could result in different commits.

Instead, track a Ref[F, List[Branch]] of every branch that we have walked in this run, then immediately after applyUpdate but before pushCommits, ensure that there are no branches that we have touched so far in this run that have the same changeset (as determined by the line count of git diff other-branch).

This will result in a little more IO churn, diffing between every other open and curated branch, but it should have a positive impact on users.

If you've got advice on how to write a more effective test for this new functionality, I'm open to it -- it seemed as though NurtureAlgTest would be the right place to put this, but there's not a lot of structure there to base my attempt on.

Thank you

As always, a tremendous thank you for this fantastic tool.

@blast-hardcheese blast-hardcheese force-pushed the only-create-distinct-changesets branch from 7cf89ac to 39d3211 Compare March 20, 2021 07:23
@codecov
Copy link

codecov bot commented Mar 20, 2021

Codecov Report

Merging #2019 (b1c949d) into master (f68c614) will increase coverage by 0.15%.
The diff coverage is 43.24%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2019      +/-   ##
==========================================
+ Coverage   78.58%   78.74%   +0.15%     
==========================================
  Files         131      132       +1     
  Lines        2251     2272      +21     
  Branches       54       47       -7     
==========================================
+ Hits         1769     1789      +20     
- Misses        482      483       +1     
Impacted Files Coverage Δ
...ala/org/scalasteward/core/nurture/NurtureAlg.scala 4.90% <0.00%> (+0.18%) ⬆️
...scala/org/scalasteward/core/nurture/ApplyAlg.scala 52.63% <52.63%> (ø)
...la/org/scalasteward/core/application/Context.scala 86.66% <100.00%> (+0.22%) ⬆️
...n/scala/org/scalasteward/core/git/FileGitAlg.scala 98.14% <100.00%> (+0.03%) ⬆️
...in/scala/org/scalasteward/core/git/GenGitAlg.scala 92.30% <100.00%> (+0.30%) ⬆️
.../main/scala/org/scalasteward/core/io/FileAlg.scala 100.00% <100.00%> (+3.12%) ⬆️
.../scala/org/scalasteward/core/io/WorkspaceAlg.scala 50.00% <0.00%> (+20.00%) ⬆️
.../scala/org/scalasteward/core/data/UpdateData.scala 100.00% <0.00%> (+50.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f68c614...b1c949d. Read the comment docs.

@blast-hardcheese
Copy link
Contributor Author

blast-hardcheese commented Mar 20, 2021

Codacy failed because I used .get on a Ref[F, A]? Baffling

@blast-hardcheese
Copy link
Contributor Author

😅 Gonna think about this one more. If there is something obvious about the project structure I'm missing, I'm all ears, but I'll need to revisit in a few days.

@blast-hardcheese blast-hardcheese force-pushed the only-create-distinct-changesets branch 3 times, most recently from 199fe0a to d67b544 Compare March 23, 2021 07:25
Devon Stewart added 7 commits March 29, 2021 10:27
I don't like this strategy, but I'm interested to see what codecov (and others) think about this approach.
I've basically just reimplemented the business logic in the test, but short of mocking out the whole git repo
in order to extract state, I'm not sure how best to represent what I'm trying to test.
@blast-hardcheese blast-hardcheese force-pushed the only-create-distinct-changesets branch 3 times, most recently from ef94d73 to e8067f5 Compare March 29, 2021 17:58
@blast-hardcheese blast-hardcheese force-pushed the only-create-distinct-changesets branch from e8067f5 to b1c949d Compare March 29, 2021 18:27
@blast-hardcheese
Copy link
Contributor Author

OK, I think codecov/patch is incorrect.

I think the blast radius required to appropriately test this feature ended up being significantly more than initially anticipated, so I humbly submit this for review.

The tests needed to be written against real git running on the real filesystem, which is why I had to largely forego MockContext, but I admit I could have misunderstood something along the way and am happy to make any changes to help increase the overall quality of code in this project while also achieving the new feature I'm attempting to add.

Thanks!

@fthomas
Copy link
Member

fthomas commented Apr 10, 2021

It seems to me that this will only delay subsequent PRs with the same changes to the next run. If Scala Steward creates a PR for an update and that PR is not merged before the next run, Scala Steward will not pass that update to NurtureAlg since there is nothing to do for this update. But the update which has been prevented in the first run is still passed to NurtureAlg so that the previously prevented PR can now be created.

I'm also concerned that there is no principle which PR gets created and which PR is prevented. Currently the first update wins over the others and updates are AFAIK sorted by groupId and artifactId. What if a later update would have resulted in a "more correct" PR?

@blast-hardcheese
Copy link
Contributor Author

What if a later update would have resulted in a "more correct" PR?

Only exactly equivalent changesets are deduplicated, using git itself as the final arbiter instead of trying to guess without modifying the filesystem. We accumulate commits, but only gate the push using this approach.

To your first point, I didn't realize updates were filtered before hitting NurtureAlg. Short of more back channels with distinct data flows, I don't know how to proceed here.

There may be an avenue to pursue, just tracking active branches in the store. This would work in the case you describe, being the initial set of branches to compare against, as well as working in the case of rebasing off master.

What do you think about this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Deduplicate identical changesets
2 participants