Large projects - cache files not changed #1848

Open
inoa-jboliveira opened this issue Dec 5, 2024 · 2 comments


@inoa-jboliveira

Hi, I am experimenting with pytype to help annotate a project of about 2 MLOC of Python. I ran pytype somedir, and it takes about two hours to process ~750 files, according to the counter.

That would not be a problem for me, but if I run it again, it takes the same amount of time. I would expect it to rely on a cache and not reprocess files that have not changed, so it is possible I am doing something wrong.

By the way, the actual command I use is

uv run --with pytype -- pytype

That is similar to uvx or pipx, but it runs in the same virtualenv as the project I am in, without fully adding the dependency, as I am still experimenting with the project. I am not sure it makes any difference, but it should not.

@h-joo
Contributor

h-joo commented Dec 9, 2024

Hi, I'm not sure I understood your use case, but pytype does not have a built-in caching mechanism for checking files.
You'd need to rely on your build system to do it for you. For instance, we have an internal Bazel rule that does this for us (I tried to look for an equivalent, but it seems to be nonexistent in the open source world): it checks the timestamp of a given file to see whether the file needs to be type checked again.
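
As a rough illustration of the timestamp idea (this is not the internal Bazel rule, which is not public), a wrapper script could record each file's modification time after a run and compare against it on the next run. The function names and the JSON manifest file below are hypothetical, and the sketch ignores dependencies between files:

```python
import json
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map each .py file under root to its modification time in nanoseconds."""
    return {str(p): p.stat().st_mtime_ns for p in sorted(root.rglob("*.py"))}


def files_to_recheck(root: Path, manifest: Path) -> list[str]:
    """Return files that are new or whose mtime differs from the last recorded run."""
    # Hypothetical manifest: a JSON file written by record_run() below.
    previous = json.loads(manifest.read_text()) if manifest.exists() else {}
    return [path for path, mtime in snapshot(root).items() if previous.get(path) != mtime]


def record_run(root: Path, manifest: Path) -> None:
    """Store the current timestamps so the next run can skip unchanged files."""
    manifest.write_text(json.dumps(snapshot(root)))
```

The list returned by `files_to_recheck` could then be passed to the type checker, with `record_run` called once the check succeeds.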

@inoa-jboliveira
Author

inoa-jboliveira commented Dec 9, 2024

Hi @h-joo, thank you for your reply.
The issue I see is that I am doing

pytype some-dir

That directory contains about 10 files, but they import another directory with several hundred files. pytype tries to reanalyze all of the files over and over, not just the ones in the directory I selected.

I would expect it to look into the pyi folder and realize the data is already there. All it would require is verifying that the timestamp of the pyi file is newer than that of the source file. Or maybe comparing a hash.

It seems so straightforward that either:

  1. I am doing something wrong
  2. There is more to it than the pyi files
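
To make the timestamp check described above concrete, here is a minimal sketch. It assumes a stub directory that mirrors the source tree with `.pyi` files; `stub_is_fresh`, `stale_sources`, and that layout are hypothetical and are not part of pytype's API. It also only compares each file with its own stub, so it would miss a file whose imports changed, which may be part of "more to it than the pyi files":

```python
from pathlib import Path


def stub_is_fresh(source: Path, stub: Path) -> bool:
    """True if the stub exists and is strictly newer than its source file."""
    if not stub.exists():
        return False
    return stub.stat().st_mtime_ns > source.stat().st_mtime_ns


def stale_sources(src_root: Path, stub_root: Path) -> list[Path]:
    """List source files whose stub is missing or older than the source."""
    stale = []
    for source in sorted(src_root.rglob("*.py")):
        # Assumed layout: stub_root mirrors src_root, with .pyi instead of .py.
        stub = (stub_root / source.relative_to(src_root)).with_suffix(".pyi")
        if not stub_is_fresh(source, stub):
            stale.append(source)
    return stale
```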
