Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minimize module-level imports for __init__.py files #16985

Open
miketheman opened this issue Oct 28, 2024 · 1 comment
Open

Minimize module-level imports for __init__.py files #16985

miketheman opened this issue Oct 28, 2024 · 1 comment
Labels
developer experience Anything that improves the experience for Warehouse devs needs discussion a product management/policy issue maintainers and users should discuss testing Test infrastructure and individual tests

Comments

@miketheman
Copy link
Member

Over time, modules were converted to packages by renaming a file like foo.py to foo/__init__.py thus preserving import foo kind of statements.

While generally fine, this has some odd implications on partial imports during traversals (I'll admit, I'm a little hazy on the specifics of how that happens).

During pytest runs, this often manifests as a circular import error from a partially imported module, leading to solutions like #13737

An audit of "__init__.py files with more than 11 LOC (the size of our license header)" has 36 hits:

$ find . -name "__init__.py" -exec sh -c 'if [ $(wc -l < "{}") -gt 11 ]; then echo "{}"; fi' \;
./tests/common/db/__init__.py
./warehouse/packaging/__init__.py
./warehouse/organizations/__init__.py
./warehouse/metrics/__init__.py
./warehouse/attestations/__init__.py
./warehouse/cache/origin/__init__.py
./warehouse/macaroons/caveats/__init__.py
./warehouse/macaroons/__init__.py
./warehouse/rate_limiting/__init__.py
./warehouse/tuf/__init__.py
./warehouse/forklift/__init__.py
./warehouse/admin/__init__.py
./warehouse/oidc/forms/__init__.py
./warehouse/oidc/__init__.py
./warehouse/oidc/models/__init__.py
./warehouse/legacy/api/xmlrpc/cache/__init__.py
./warehouse/legacy/api/xmlrpc/__init__.py
./warehouse/utils/__init__.py
./warehouse/utils/db/__init__.py
./warehouse/cli/__init__.py
./warehouse/cli/db/__init__.py
./warehouse/captcha/__init__.py
./warehouse/banners/__init__.py
./warehouse/subscriptions/__init__.py
./warehouse/search/__init__.py
./warehouse/sponsors/__init__.py
./warehouse/integrations/__init__.py
./warehouse/integrations/vulnerabilities/osv/__init__.py
./warehouse/integrations/vulnerabilities/__init__.py
./warehouse/accounts/__init__.py
./warehouse/authnz/__init__.py
./warehouse/manage/__init__.py
./warehouse/manage/views/__init__.py
./warehouse/i18n/__init__.py
./warehouse/helpdesk/__init__.py
./warehouse/email/__init__.py

Instead of trying to tackle all of that, look only at the ones that have another subdirectory (package) alongside the __init__.py file, since the issue seems to be with depth traversal.
10 hits:

$ find . -name "__init__.py" -exec sh -c '
  if [ $(wc -l < "{}") -gt 11 ]; then
    dir=$(dirname "{}")
    if [ $(find "$dir" -mindepth 1 -type d | wc -l) -gt 0 ]; then
      echo "{}"
    fi
  fi
' \;
./warehouse/macaroons/__init__.py
./warehouse/admin/__init__.py
./warehouse/oidc/__init__.py
./warehouse/legacy/api/xmlrpc/__init__.py
./warehouse/utils/__init__.py
./warehouse/cli/__init__.py
./warehouse/integrations/__init__.py
./warehouse/integrations/vulnerabilities/__init__.py
./warehouse/manage/__init__.py
./warehouse/email/__init__.py

Moving most of the non-includeme() and __all__ code out of __init__.py files should be relatively straightforward, and will keep import traversals "clean" of unintended side effects or partial initializations.

@miketheman miketheman added testing Test infrastructure and individual tests needs discussion a product management/policy issue maintainers and users should discuss developer experience Anything that improves the experience for Warehouse devs labels Oct 28, 2024
@webknjaz
Copy link
Member

Just wanted to share that this is how I test for circular imports in many projects now: https://github.com/sanitizers/octomachinery/blob/09b5e5c/tests/circular_imports_test.py

As for not having imports in init modules, I sometimes go as far as disallowing them in favor of having an explicit api.py instead: https://github.com/ansible/awx-plugins/blob/9ec3d58/.pre-commit-config.yaml#L21-L25 / https://github.com/aio-libs/propcache/blob/8254a53/src/propcache/api.py

As to why this is happening, you can imagine that a simple import a.b.c will first evaluate a/__init__.py and a/b/__init__.py which in turn may have from . import d and a/b/d.py would have from .. import x. And so it'll go into another import of the same namespace (a) while it's still in progress. Something along these lines. Whenever I hit it it's always something to do with import-time side effects that weren't obvious/intended. So now, I always try to have this circular import test in libs/frameworks, even though it's rather expensive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
developer experience Anything that improves the experience for Warehouse devs needs discussion a product management/policy issue maintainers and users should discuss testing Test infrastructure and individual tests
Projects
None yet
Development

No branches or pull requests

2 participants