Multiple pickle roundtrip serializations cause KeyError #450

Closed
@kujenga

The following code causes a crash, which can arise when passing cloudpathlib objects to/from subprocesses.

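    import pickle

    from cloudpathlib import CloudPath

    path1 = CloudPath("s3://bucket/key")
    print(path1.__dict__)
    pkl1 = pickle.dumps(path1)

    # First round trip: path2 comes back without a _client attribute.
    path2 = pickle.loads(pkl1)
    print(path2.__dict__)
    pkl2 = pickle.dumps(path2)  # raises KeyError: '_client'

    path3 = pickle.loads(pkl2)
    print(path3.__dict__)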

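The script prints the first two __dict__ values (note that _client is missing from the second one) and then raises on the second pickle.dumps call: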

    {'_handle': None, '_client': None, '_str': 's3://bucket/key', '_url': ParseResult(scheme='s3', netloc='bucket', path='/key', params='', query='', fragment=''), '_path': PurePosixPath('/bucket/key'), '_dirty': False}
    {'_handle': None, '_str': 's3://bucket/key', '_url': ParseResult(scheme='s3', netloc='bucket', path='/key', params='', query='', fragment=''), '_path': PurePosixPath('/bucket/key'), '_dirty': False}
    Traceback (most recent call last):
      File "/path/to/cloudpathlib-pickle-crash.py", line 20, in <module>
        main()
      File "/path/to/cloudpathlib-pickle-crash.py", line 13, in main
        pkl2 = pickle.dumps(path2)
               ^^^^^^^^^^^^^^^^^^^
      File "/path/to/.venv/lib/python3.11/site-packages/cloudpathlib/cloudpath.py", line 266, in __getstate__
        del state["_client"]
            ~~~~~^^^^^^^^^^^
    KeyError: '_client'

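The issue seems to be that __getstate__ and __setstate__ are not symmetrical: __getstate__ deletes the _client field from the state, but __setstate__ does not restore it when the object is unpickled, so the next __getstate__ call on the unpickled object fails. Restoring the field in __setstate__ would mirror this example in the docs: https://docs.python.org/3/library/pickle.html#pickle-state. The relevant code is: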

    def __getstate__(self) -> Dict[str, Any]:
        state = self.__dict__.copy()
        # don't pickle client
        del state["_client"]
        return state

    def __setstate__(self, state: Dict[str, Any]) -> None:
        self.__dict__.update(state)
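
One possible way to make the two methods symmetrical is sketched below. This is not cloudpathlib's actual code or necessarily the fix that was adopted: PathLike is a hypothetical stand-in class, roundtrip is a helper invented for illustration, and resetting _client to None on unpickle is an assumption about what the restored value should be.

```python
import pickle
from typing import Any, Dict, Optional


class PathLike:
    """Hypothetical stand-in for a CloudPath-style object (not cloudpathlib's class)."""

    def __init__(self, url: str, client: Optional[object] = None) -> None:
        self._str = url
        self._client = client  # assumed to be something we do not want to pickle

    def __getstate__(self) -> Dict[str, Any]:
        state = self.__dict__.copy()
        # Don't pickle the client; pop() with a default tolerates a missing key.
        state.pop("_client", None)
        return state

    def __setstate__(self, state: Dict[str, Any]) -> None:
        self.__dict__.update(state)
        # Restore the attribute so the unpickled object has the same shape
        # as a freshly constructed one (assumption: None is an acceptable value).
        self._client = None


def roundtrip(obj: PathLike, times: int) -> PathLike:
    """Pickle and unpickle obj the given number of times (illustrative helper)."""
    for _ in range(times):
        obj = pickle.loads(pickle.dumps(obj))
    return obj
```

With both changes in place, roundtrip(PathLike("s3://bucket/key"), 3) completes without a KeyError. Either change alone would avoid the crash; restoring the attribute in __setstate__ additionally keeps later attribute access on _client from failing.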
