454 Multiple pickle support (#460)
* allow for roundtrips of cloudpaths through pickle serialization (#454)

This avoids an exception thrown because the _client is not serialized
into the pickled object, and thus when __getstate__ is called the second
time, there is no _client field to delete.

Closes #450

* pickle tests

---------

Co-authored-by: Aaron Taylor <[email protected]>
pjbull and kujenga authored Aug 21, 2024
1 parent a23a38c commit 7cbff39
Showing 4 changed files with 35 additions and 20 deletions.
1 change: 1 addition & 0 deletions HISTORY.md
@@ -2,6 +2,7 @@

 ## UNRELEASED

+- Allow `CloudPath` objects to be loaded/dumped through pickle format repeatedly. (Issue [#450](https://github.com/drivendataorg/cloudpathlib/issues/450))
 - Fixes typo in `FileCacheMode` where values were being filled by envvar `CLOUPATHLIB_FILE_CACHE_MODE` instead of `CLOUDPATHLIB_FILE_CACHE_MODE`. (PR [#424](https://github.com/drivendataorg/cloudpathlib/pull/424))
 - Fix `CloudPath` cleanup via `CloudPath.__del__` when `Client` encounters an exception during initialization and does not create a `file_cache_mode` attribute. (Issue [#372](https://github.com/drivendataorg/cloudpathlib/issues/372), thanks to [@bryanwweber](https://github.com/bryanwweber))
 - Drop support for Python 3.7; pin minimal `boto3` version to Python 3.8+ versions. (PR [#407](https://github.com/drivendataorg/cloudpathlib/pull/407))
3 changes: 2 additions & 1 deletion cloudpathlib/cloudpath.py
@@ -263,7 +263,8 @@ def __getstate__(self) -> Dict[str, Any]:
         state = self.__dict__.copy()

         # don't pickle client
-        del state["_client"]
+        if "_client" in state:
+            del state["_client"]

         return state
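
The failure mode described in the commit message can be reproduced in isolation. The sketch below is a minimal illustration, not cloudpathlib code: `UnguardedPath`, `GuardedPath`, and `roundtrip` are hypothetical names, the `_client` value is a bare placeholder object, and the stand-in classes do not restore `_client` on load. Under those assumptions, the unguarded `del` raises `KeyError` on the second dump, while the guarded version survives repeated round trips.

```python
import pickle
from typing import Any, Dict


class UnguardedPath:
    """Hypothetical stand-in for a path object; NOT cloudpathlib's CloudPath."""

    def __init__(self, url: str) -> None:
        self.url = url
        self._client = object()  # placeholder for a client we do not want pickled

    def __getstate__(self) -> Dict[str, Any]:
        state = self.__dict__.copy()
        # Unconditional delete: raises KeyError if "_client" is already absent.
        del state["_client"]
        return state


class GuardedPath(UnguardedPath):
    """Same stand-in, with the membership check applied before deleting."""

    def __getstate__(self) -> Dict[str, Any]:
        state = self.__dict__.copy()
        # Only drop the client when it is actually present.
        if "_client" in state:
            del state["_client"]
        return state


def roundtrip(obj: Any) -> Any:
    """Dump and immediately reload an object through pickle."""
    return pickle.loads(pickle.dumps(obj))


# The first round trip works; the loaded object has no "_client" attribute.
first = roundtrip(UnguardedPath("s3://bucket/key"))

# The second dump calls __getstate__ again and hits the missing key.
try:
    pickle.dumps(first)
    second_dump_failed = False
except KeyError:
    second_dump_failed = True

# With the guard, two consecutive round trips succeed.
guarded = roundtrip(roundtrip(GuardedPath("s3://bucket/key")))
```

The membership check keeps `__getstate__` safe to call whether or not the attribute is present, which is the property the second pickling pass relies on.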

19 changes: 0 additions & 19 deletions tests/test_cloudpath_file_io.py
@@ -1,7 +1,6 @@
 from datetime import datetime
 import os
 from pathlib import Path, PurePosixPath
-import pickle
 from shutil import rmtree
 import sys
 from time import sleep
@@ -454,24 +453,6 @@ def test_os_open(rig):
         assert f.readable()


-def test_pickle(rig, tmpdir):
-    p = rig.create_cloud_path("dir_0/file0_0.txt")
-
-    with (tmpdir / "test.pkl").open("wb") as f:
-        pickle.dump(p, f)
-
-    with (tmpdir / "test.pkl").open("rb") as f:
-        pickled = pickle.load(f)
-
-    # test a call to the network
-    assert pickled.exists()
-
-    # check we unpickled, and that client is the default client
-    assert str(pickled) == str(p)
-    assert pickled.client == p.client
-    assert rig.client_class._default_client == pickled.client
-
-
 def test_drive_exists(rig):
     """Tests the exists call for top level bucket/container"""
     p = rig.create_cloud_path("dir_0/file0_0.txt")
32 changes: 32 additions & 0 deletions tests/test_cloudpath_serialize.py
@@ -0,0 +1,32 @@
+import pickle
+
+from cloudpathlib import CloudPath
+
+
+def test_pickle(rig, tmpdir):
+    p = rig.create_cloud_path("dir_0/file0_0.txt")
+
+    with (tmpdir / "test.pkl").open("wb") as f:
+        pickle.dump(p, f)
+
+    with (tmpdir / "test.pkl").open("rb") as f:
+        pickled = pickle.load(f)
+
+    # test a call to the network
+    assert pickled.exists()
+
+    # check we unpickled, and that client is the default client
+    assert str(pickled) == str(p)
+    assert pickled.client == p.client
+    assert rig.client_class._default_client == pickled.client
+
+
+def test_pickle_roundtrip():
+    path1 = CloudPath("s3://bucket/key")
+    pkl1 = pickle.dumps(path1)
+
+    path2 = pickle.loads(pkl1)
+    pkl2 = pickle.dumps(path2)
+
+    assert path1 == path2
+    assert pkl1 == pkl2
