Skip inactive repos #3559

fthomas · 2025-01-21T08:28:00Z

This implements the idea proposed in #3554 to stop creating wasted PRs by skipping repos whose last commit is older than a configured threshold. The PR contains these changes:

a new GitAlg.getCommitDate which returns the commit date as Timestamp of a given SHA-1
a new RepoCache.commitDate: Timestamp field that contains the commit date that corresponds to RepoCache.sha1
a new RepoConfig.lastCommitMaxAge: Option[FiniteDuration] field that contains the threshold for deciding if a repo should be skipped. A repo is skipped, if the last commit is greater than this FiniteDuration.
a default lastCommitMaxAge is added to the default Scala Steward config. Other Scala Steward instances and every repo can override this with a greater value. It is not possible to unset it or override it with a smaller value.
RepoCache.checkCache is modified such that after retrieving a fresh cache, the commit date is compared to lastCommitMaxAge and an error is raised eventually

Notes:

The addition of a non optional field to RepoCache means that reading an old cache will log an error like this:

2025-01-20 21:16:44,985 ERROR Failed to parse or decode JSON from workspace/store/repo_cache/v1/github/scala-steward-org/test-repo-1/repo_cache.json
io.circe.DecodingFailure$DecodingFailureImpl: DecodingFailure at .commitDate: Missing required field

Fortunately this is only a log message so that Scala Steward will recompute the RepoCache and proceed as normal then. Adding this field is therefore effectively invalidating all RepoCaches out there.

A skipped repo looks like this in the log:

2025-01-21 08:35:25,033 INFO  ──────────── Steward scala-steward-org/test-repo-1:scala-cli-test ────────────
2025-01-21 08:35:25,033 INFO  Check cache of scala-steward-org/test-repo-1:scala-cli-test
2025-01-21 08:35:25,738 ERROR Steward scala-steward-org/test-repo-1:scala-cli-test failed
org.scalasteward.core.repocache.RepoCacheAlg$$anon$1: Skipping because last commit is older than 2d
2025-01-21 08:35:25,738 INFO  ──────────── Total time: Steward scala-steward-org/test-repo-1:scala-cli-test: 705ms ────────────

lastCommitMaxAge was set to 2 days in this case.

Closes: #3554

codecov · 2025-01-21T08:31:32Z

Codecov Report

Attention: Patch coverage is 88.46154% with 3 lines in your changes missing coverage. Please review.

Project coverage is 89.84%. Comparing base (f5a5a09) to head (2cc4057).
Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
...org/scalasteward/core/repocache/RepoCacheAlg.scala	72.72%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3559      +/-   ##
==========================================
+ Coverage   89.81%   89.84%   +0.02%     
==========================================
  Files         171      171              
  Lines        4988     5010      +22     
  Branches      495      490       -5     
==========================================
+ Hits         4480     4501      +21     
- Misses        508      509       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mzuehlke

Only a small change to add such a valuable feature 👍
Not sure where, but would be good to add a note about this feature.
Maybe a short above the value in default.scala-steward.conf ?

And 18 month of inactivity ? I think I would have choosen at most 12 month. But I have no clue if this would be too short.

mzuehlke · 2025-01-22T21:43:52Z

Just checked, the public instance check uses 24 month: https://github.com/VirtusLab/scala-steward-repos/blob/main/scripts/test-repos.scala#L32

exoego

This is awesome.

I feel 540 days is very tolerant but fine.
It is common not to touch repo in maintenance-mode for months in work😌

Dependabot: 90 days
https://docs.github.com/en/code-security/dependabot/working-with-dependabot/managing-pull-requests-for-dependency-updates

alejandrohdezma

Nice! I'll make sure to add this option to the GH action once this is out

fthomas · 2025-01-23T06:58:24Z

Not sure where, but would be good to add a note about this feature.
Maybe a short above the value in default.scala-steward.conf ?

Will do. I'll also add it to https://github.com/scala-steward-org/scala-steward/blob/main/docs/repo-specific-configuration.md.

Just checked, the public instance check uses 24 month: https://github.com/VirtusLab/scala-steward-repos/blob/main/scripts/test-repos.scala#L32

Thanks for the pointer, I didn't know about this script! I'll take a look. Also pinging @tgodzik since he wrote that.

And 18 month of inactivity ? I think I would have choosen at most 12 month.

Dependabot: 90 days

I'm totally fine with decreasing the default.

Btw, calling a repo inactive instead of abandoned sounds also nicer and less dramatic. I'll update the code accordingly. It is probably also better to rename lastCommitMaxAge to something like inactivityThreshold which better communicates the intent of that config and does not leak the implementation.

tgodzik · 2025-01-23T08:12:39Z

Thanks for the pointer, I didn't know about this script! I'll take a look. Also pinging @tgodzik since he wrote that.

This is my manual approach to do the same, but I feel like yours is a much better. I just didn't have the time to delve into steward details before 😅 I did remove most things older that 2 years and added watch to all of them to be notified if someone started to work on it again.

Just checked, the public instance check uses 24 month: https://github.com/VirtusLab/scala-steward-repos/blob/main/scripts/test-repos.scala#L32

It was my best guess since I actually saw some repos that were for projects that were for sure used, just didn't need a lot of work recently, so nobody bothered. But I think if we do it automatically then it totally fine to make it shorter. I would probably try to make it so that we always get the last scala version update. Some smaller projects only need that and update 2.13.x rarely as the releases are less frequent now.

I wonder if we could also check if a repo is archived? Sometime it might be younger than a threshold but still no longer used.

And also would we be able to post an issue maybe if the steward failed? That would notify the maintainers that something is wrong. Though that might be a bigger change for another PR

fthomas · 2025-01-23T20:13:51Z

I would probably try to make it so that we always get the last scala version update. Some smaller projects only need that and update 2.13.x rarely as the releases are less frequent now.

Makes sense. A threshold between 180 and 360 days should be sufficient for repos that only receive Scala 2.13 updates. Also, as instance operator you can always increase the default by running with a instance-wide repo config.

I wonder if we could also check if a repo is archived?

This exists since #2507 and it should be easy to grep for archived repos in the logs. Here is an example:

2025-01-23 20:18:06,924 INFO  ──────────── Steward fthomas/typelevel-webapp ────────────
2025-01-23 20:18:06,925 INFO  Check cache of fthomas/typelevel-webapp
2025-01-23 20:18:10,184 ERROR Steward fthomas/typelevel-webapp failed
org.scalasteward.core.forge.github.GitHubException$RepositoryArchived: fthomas/typelevel-webapp
2025-01-23 20:18:10,186 INFO  ──────────── Total time: Steward fthomas/typelevel-webapp: 3s 261ms ────────────

For inactive repos, I should also add a dedicated exception to make grepping for them easier.

And also would we be able to post an issue maybe if the steward failed? That would notify the maintainers that something is wrong. Though that might be a bigger change for another PR

We have an issue for that: #2030. And for sure this is not as easy as this change. :-)

fthomas added the enhancement New feature or request label Jan 21, 2025

fthomas added this to the 0.33.0 milestone Jan 21, 2025

fthomas force-pushed the topic/skip-abandoned-repos branch 3 times, most recently from 2ea5470 to 969d2e7 Compare January 22, 2025 20:07

fthomas marked this pull request as ready for review January 22, 2025 20:15

Skip abandoned repos

fb06636

fthomas force-pushed the topic/skip-abandoned-repos branch from 969d2e7 to fb06636 Compare January 22, 2025 20:17

mzuehlke approved these changes Jan 22, 2025

View reviewed changes

exoego approved these changes Jan 22, 2025

View reviewed changes

alejandrohdezma approved these changes Jan 23, 2025

View reviewed changes

abandoned -> inactive

4ff5d19

fthomas changed the title ~~Skip abandoned repos~~ Skip inactive repos Jan 24, 2025

fthomas added 2 commits January 24, 2025 15:15

reduce threshold, use dedicated exception, rename lastCommitMaxAge

c7bdd22

Add comment to the default value

2cc4057

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Skip inactive repos #3559

Skip inactive repos #3559

fthomas commented Jan 21, 2025 •

edited

Loading

codecov bot commented Jan 21, 2025 •

edited

Loading

mzuehlke left a comment

mzuehlke commented Jan 22, 2025

exoego left a comment

alejandrohdezma left a comment

fthomas commented Jan 23, 2025

tgodzik commented Jan 23, 2025

fthomas commented Jan 23, 2025

Skip inactive repos #3559

Are you sure you want to change the base?

Skip inactive repos #3559

Conversation

fthomas commented Jan 21, 2025 • edited Loading

codecov bot commented Jan 21, 2025 • edited Loading

Codecov Report

mzuehlke left a comment

Choose a reason for hiding this comment

mzuehlke commented Jan 22, 2025

exoego left a comment

Choose a reason for hiding this comment

alejandrohdezma left a comment

Choose a reason for hiding this comment

fthomas commented Jan 23, 2025

tgodzik commented Jan 23, 2025

fthomas commented Jan 23, 2025

fthomas commented Jan 21, 2025 •

edited

Loading

codecov bot commented Jan 21, 2025 •

edited

Loading