Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AttributeError when __del__ is called on S3Client that errors during __init__ #372

Closed
bryanwweber opened this issue Oct 30, 2023 · 9 comments · Fixed by #373, #375 or #418
Closed

AttributeError when __del__ is called on S3Client that errors during __init__ #372

bryanwweber opened this issue Oct 30, 2023 · 9 comments · Fixed by #373, #375 or #418
Labels
bug Something isn't working

Comments

@bryanwweber
Copy link
Contributor

bryanwweber commented Oct 30, 2023

If an exception is raised during the creation of an S3Client instance, subsequently tearing down that instance via Client.__del__ results n an AttributeError:

poetry run python        
Python 3.11.5 (main, Sep 22 2023, 12:39:47) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cloudpathlib
>>> cloudpathlib.S3Client(profile_name="some-bad-profile")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".venv/lib/python3.11/site-packages/cloudpathlib/s3/s3client.py", line 86, in __init__
    self.sess = Session(
                ^^^^^^^^
  File "venv/lib/python3.11/site-packages/boto3/session.py", line 90, in __init__
    self._setup_loader()
  File ".venv/lib/python3.11/site-packages/boto3/session.py", line 131, in _setup_loader
    self._loader = self._session.get_component('data_loader')
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/session.py", line 802, in get_component
    return self._components.get_component(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/session.py", line 1140, in get_component
    self._components[name] = factory()
                             ^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/session.py", line 199, in <lambda>
    lambda: create_loader(self.get_config_variable('data_path')),
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.11/site-packages/botocore/session.py", line 323, in get_config_variable
    return self.get_component('config_store').get_config_variable(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/configprovider.py", line 459, in get_config_v
ariable
    return provider.provide()
           ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/configprovider.py", line 665, in provide
    value = provider.provide()
            ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/configprovider.py", line 755, in provide
    scoped_config = self._session.get_scoped_config()
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/botocore/session.py", line 422, in get_scoped_config
    raise ProfileNotFound(profile=profile_name)
botocore.exceptions.ProfileNotFound: The config profile (some-bad-profile) could not be found
>>> quit()
Exception ignored in: <function Client.__del__ at 0x1031bcc20>
Traceback (most recent call last):
  File ".venv/lib/python3.11/site-packages/cloudpathlib/client.py", line 88, in __del__
    if self.file_cache_mode in [
       ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'S3Client' object has no attribute 'file_cache_mode'

This is happening because the super().__init__() call happens at the bottom of the S3Client.__init__() method (here), so Client.file_cache_mode is not set before the error occurs. I checked and there doesn't seem to be a particular reason that super().__init__() needs to be called at the bottom (all the args are just pass through from the S3Client.__init__() signature). I'm happy to submit a PR to fix this by moving the super().__init__() up in S3Client.__init__() and other related classes too, if that seems like the right fix. Thanks!

@bryanwweber
Copy link
Contributor Author

To be clear, this is not a blocker for anything, just a small nit that seems pretty easy to clean up 😄

@jayqi
Copy link
Member

jayqi commented Oct 30, 2023

there doesn't seem to be a particular reason that super().__init__() needs to be called at the bottom (all the args are just pass through from the S3Client.__init__() signature)

Just took a quick skim, and it seems like that to me as well. Thanks for diagnosing, and PR is welcome!

@sbrandtb
Copy link

This one seems to be back in 0.18.1:

Exception ignored in: <function CloudPath.__del__ at 0x7f0ace1ef420>
Traceback (most recent call last):
  File "/virtualenv/.venv/lib/python3.12/site-packages/cloudpathlib/cloudpath.py", line 250, in __del__
    and self.client.file_cache_mode == FileCacheMode.cloudpath_object
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'S3Client' object has no attribute 'file_cache_mode'
Exception ignored in: <function CloudPath.__del__ at 0x7f0ace1ef420>
Traceback (most recent call last):
  File "/virtualenv/.venv/lib/python3.12/site-packages/cloudpathlib/cloudpath.py", line 250, in __del__
    and self.client.file_cache_mode == FileCacheMode.cloudpath_object
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'S3Client' object has no attribute 'file_cache_mode'

@jayqi jayqi reopened this Mar 28, 2024
@jayqi
Copy link
Member

jayqi commented Mar 28, 2024

From a quick look here, it looks like we fixed it in #373 for the __del__ finalizer on the Client class but not for the __del__ finalizer on the CloudPath class.

@jayqi jayqi added the bug Something isn't working label Mar 28, 2024
@bryanwweber
Copy link
Contributor Author

bryanwweber commented Apr 1, 2024

@sbrandtb Can you post a full reproducer script? I don't immediately see how it can get into a state where client.file_cache_mode is not assigned.

It looks like this was last changed in 088fe49. I suppose if the S3 connection fails for some reason you could run into this.

@jayqi
Copy link
Member

jayqi commented Apr 1, 2024

This should be addressed by #418/#419. I wasn't able to reproduce the CloudPath.__del__ error either (I was able to reproduce the original error with Client.__del__), but I think the fix doesn't hurt.

@sbrandtb
Copy link

sbrandtb commented Apr 2, 2024

@bryanwweber @jayqi

In this case, and because the issue appears only rarely on our side, I would assume it happens in a multithreading scenario. We basically upload files in parallel with workers, but only a fraction of the workers actually upload a file. The simplified code:

from multiprocessing.pool import ThreadPool

from cloudpathlib import S3Path

def do(i):
    dst = S3Path('s3://somewhere') / f'hello_{i}.txt'
    dst.write_bytes(b'hello')

ThreadPool(processes=20).map(do, range(100))

@jayqi jayqi closed this as completed Apr 2, 2024
@pjbull
Copy link
Member

pjbull commented Aug 29, 2024

@bryanwweber @sbrandtb This is realeased now in v0.19.0 if you want to upgrade and test your use case.

@sbrandtb
Copy link

@pjbull Thanks for fixing and the personal notification ❤️

Honestly, we wrapped the deployment code where it happened in retry and called it a day. Our logs show that the issue happened still 5 times in August, but zero times since the release of the patch. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
4 participants