Skip to content

Avoid redundant glob checking #10520

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

parsonsmatt
Copy link
Collaborator

@parsonsmatt parsonsmatt commented Nov 4, 2024

This PR modifies the glob logic to avoid doing redundant work. Results of operations are now passed forward into further checks.

Locally, this improves the time for a no-op cabal repl from 50s to 32s.

Combined with my other PRs on this, I think this whole operation on our large extra-source-files will go from 35s to 65ms 😄


Template Α: This PR modifies behaviour or interface

Include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • Manual QA notes have been included.
  • Tests have been added. (Ask for help if you don’t know how to write them! Ask for an exemption if tests are too complex for too little coverage!)

Template B: This PR does not modify behaviour or interface

E.g. the PR only touches documentation or tests, does refactorings, etc.

Include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Is this a PR that fixes CI? If so, it will need to be backported to older cabal release branches (ask maintainers for directions).

Comment on lines +515 to +516
extraSrcFilesGlobResults <- mapM (checkGlobFile "." "extra-source-files") extraSrcGlobs
extraDocFilesGlobResults <- mapM (checkGlobFile "." "extra-doc-files") docGlobs
extraFilesGlobResults <- mapM (checkGlobFile "." "extra-files") extraGlobs
extraDataFilesGlobResults <- mapM (checkGlobFile rawDataDir "data-files") dataGlobs
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of re-parsing each FilePath here, we reuse the globs parsed above in checkGlob.

Comment on lines +520 to +523
-- § Missing documentation.
checkMissingDocs
extraDataFilesGlobResults
extraSrcFilesGlobResults
extraDocFilesGlobResults
extraFilesGlobResults
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And instead of re-parsing and then re-running the glob find, we just use the results generated above.

Comment on lines +846 to +859
=> FilePath -- Folder to check.
-> CabalField -- .cabal field we are checking.
-> CheckM m ()
checkGlobFile cv ddir title fp = do
-> Glob -- Glob pattern.
-> CheckM m [GlobResult FilePath]
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these args are all just String 🤪 and were incorrectly documented before.

Nothing ->
pure []
Just po -> do
rs <- liftCM $ runDirFileGlobM po dir parsedGlob
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

liftCM was not previously exported, but it is necessary to actually use this if the liftint interface doesn't suffice - and since that interface has a forced return of (), it can't work here.

@@ -37,6 +37,7 @@ module Distribution.PackageDescription.Check.Monad
, checkP
, checkPkg
, liftInt
, liftCM
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want to expose this I can probably replace with a function like liftInt but that allows a secondary return, like,

liftIntWith 
    :: (CheckInterface m -> Maybe (i m))
    -> (i m -> m ([PackageCheck], r)
    -> CheckM m (Maybe r)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing it is fine.

res <- realGlob esgs
red <- realGlob edgs
ref <- realGlob efgs
concatMap globMatches t
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can simplify this to globMatches t and rely on callers to concat the args, too 🤷🏻

@parsonsmatt parsonsmatt marked this pull request as ready for review November 4, 2024 18:32
@geekosaur
Copy link
Collaborator

BTW the API change label is neither a complaint nor a barrier to getting it into 3.14; it does mean backporting to 3.12 isn't possible.

Copy link
Collaborator

@alt-romes alt-romes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good @parsonsmatt. Thanks for the patch.

@@ -37,6 +37,7 @@ module Distribution.PackageDescription.Check.Monad
, checkP
, checkPkg
, liftInt
, liftCM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exposing it is fine.

@Mikolaj
Copy link
Member

Mikolaj commented Mar 13, 2025

So, shall we merge it? @parsonsmatt: would you like to set a merge label as per CONTRIBUTING.md and rebase/fix CI?

@parsonsmatt
Copy link
Collaborator Author

./validate.sh passes locally for me, but is failing in CI. I'm unsure of the best way to proceed. Looks like the issue is that some of the warnings were being generated by these steps but are no longer being generated. I'm unsure why this is happening.

@ffaf1
Copy link
Collaborator

ffaf1 commented Mar 31, 2025

-These warnings may cause trouble when distributing the package:
-Warning: [no-glob-match] In 'extra-source-files': the pattern '/home/user/file' does not match any files.

This is GlobNoMatch, but checkGlobResult seems fine, so it has to be the way/where it is invoked.

@Mikolaj
Copy link
Member

Mikolaj commented Apr 10, 2025

Let me take the liberty to rebase --- maybe the problems were transient?

@Mikolaj
Copy link
Member

Mikolaj commented Apr 10, 2025

@mergify rebase

Copy link
Contributor

mergify bot commented Apr 10, 2025

rebase

✅ Branch has been successfully rebased

@Mikolaj Mikolaj force-pushed the mattp/reuse-glob-information branch from 350003d to 3d309d5 Compare April 10, 2025 18:03
@Mikolaj
Copy link
Member

Mikolaj commented Apr 10, 2025

Nope, doesn't seem so. :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants