Make BulkIndexError and ScanError serializable #2669

Merged
merged 4 commits into from
Nov 12, 2024
Changes from 3 commits
16 changes: 12 additions & 4 deletions elasticsearch/helpers/errors.py
@@ -15,18 +15,26 @@
 # specific language governing permissions and limitations
 # under the License.

-from typing import Any, Dict, List
+from typing import Any, Dict, List, Tuple, Type


 class BulkIndexError(Exception):
-    def __init__(self, message: Any, errors: List[Dict[str, Any]]):
+    def __init__(self, message: str, errors: List[Dict[str, Any]]):
         super().__init__(message)
         self.errors: List[Dict[str, Any]] = errors
+
+    def __reduce__(
+        self,
+    ) -> Tuple[Type["BulkIndexError"], Tuple[str, List[Dict[str, Any]]]]:
+        return (self.__class__, (self.args[0], self.errors))


 class ScanError(Exception):
     scroll_id: str

-    def __init__(self, scroll_id: str, *args: Any, **kwargs: Any) -> None:
-        super().__init__(*args, **kwargs)
+    def __init__(self, scroll_id: str, shards_message: str) -> None:
Contributor

@miguelgrinberg miguelgrinberg Nov 12, 2024

Are the typing changes in both exception classes and the argument changes in ScanError necessary? I think you could have just added a __reduce__ method that writes all the arguments, assuming the caller is passing pickle-friendly data. Not a huge deal, but to me it seems there is no point in restricting the arguments in ScanError, given that it is a backwards-incompatible change.
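A minimal sketch of what this suggestion could look like, assuming the caller only passes pickle-friendly positional arguments. `ScanErrorSketch` is a hypothetical stand-in, not the class in the library, and it is not the code that was merged:

```python
import pickle
from typing import Any, Tuple, Type


class ScanErrorSketch(Exception):
    """Hypothetical stand-in for ScanError that keeps the *args signature."""

    def __init__(self, scroll_id: str, *args: Any) -> None:
        super().__init__(*args)
        self.scroll_id = scroll_id

    def __reduce__(self) -> Tuple[Type["ScanErrorSketch"], Tuple[Any, ...]]:
        # Put scroll_id back in front of the positional args so that
        # unpickling calls the constructor the way the original caller did.
        return (self.__class__, (self.scroll_id, *self.args))


original = ScanErrorSketch("scroll-1", "2 of 5 shards failed", {"shard": 3})
restored = pickle.loads(pickle.dumps(original))
```

With this shape the constructor signature stays open-ended, and the round trip only depends on every element of `args` being picklable.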

Member

They're not necessary, but **kwargs was simply not working, and *args was overly broad. Restricting to what was actually used also really improved the types. Are you thinking that other libraries could be raising those exceptions with different parameters?
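One likely reading of "**kwargs was simply not working" is that `Exception.__init__` rejects keyword arguments, so forwarding them always fails. The sketch below illustrates that behaviour with a hypothetical class, `KwargsErrorSketch`, that mirrors the old constructor signature:

```python
from typing import Any


class KwargsErrorSketch(Exception):
    """Hypothetical exception that forwards **kwargs like the old signature."""

    def __init__(self, scroll_id: str, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        self.scroll_id = scroll_id


# Positional-only use works as expected.
ok = KwargsErrorSketch("scroll-1", "shards failed")

# Any keyword argument reaches Exception.__init__, which raises TypeError.
try:
    KwargsErrorSketch("scroll-1", "shards failed", shards=5)
    kwargs_accepted = True
except TypeError:
    kwargs_accepted = False
```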

Member

https://github.com/search?q=%22raise+BulkIndexError%22+language%3APython&type=code shows that raising BulkIndexError() is pretty common, but I've only changed the type annotation here, and all the samples I've seen indeed pass a string.

I've not seen public examples of raising ScanError, but I can restore *args.

Member

I've restored backwards-compatibility in 0db50e4 (#2669), please take another look.

Contributor

It's highly unlikely that anyone is raising ScanError, but yes, I just thought the change was unrelated to making the class work with pickle. As I said, it's a small observation, and I think it is unlikely to cause problems.

+        super().__init__(shards_message)
         self.scroll_id = scroll_id
+
+    def __reduce__(self) -> Tuple[Type["ScanError"], Tuple[str, str]]:
+        return (self.__class__, (self.scroll_id, self.args[0]))
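For context on why `__reduce__` is needed at all: by default an exception is rebuilt from `self.args` alone, so a constructor with an extra required parameter cannot be called again on unpickling. The sketch below uses two hypothetical classes (`OldBulkIndexError` and `NewBulkIndexError`, named here only for illustration) to compare the behaviour before and after the change:

```python
import pickle
from typing import Any, Dict, List


class OldBulkIndexError(Exception):
    """Hypothetical copy of the pre-change class, with no __reduce__."""

    def __init__(self, message: Any, errors: List[Dict[str, Any]]):
        super().__init__(message)
        self.errors = errors


class NewBulkIndexError(OldBulkIndexError):
    """Same constructor, plus a __reduce__ like the one in the diff."""

    def __reduce__(self):
        # Pass both constructor arguments back, not just self.args.
        return (self.__class__, (self.args[0], self.errors))


def round_trips(error: Exception) -> bool:
    """Return True if the error survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(error))
        return True
    except TypeError:
        # Raised on load when the constructor is called without `errors`.
        return False


old_ok = round_trips(OldBulkIndexError("message", [{"error": 1}]))
new_ok = round_trips(NewBulkIndexError("message", [{"error": 1}]))
```

In this sketch the failure for the old class happens in `pickle.loads`, not `pickle.dumps`, which is why it tends to surface only on the receiving side of a process boundary.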
17 changes: 17 additions & 0 deletions test_elasticsearch/test_helpers.py
@@ -15,6 +15,7 @@
 # specific language governing permissions and limitations
 # under the License.

+import pickle
 import threading
 import time
 from unittest import mock
@@ -182,3 +183,19 @@ class TestExpandActions:
     @pytest.mark.parametrize("action", ["whatever", b"whatever"])
     def test_string_actions_are_marked_as_simple_inserts(self, action):
         assert ({"index": {}}, b"whatever") == helpers.expand_action(action)
+
+
+def test_serialize_bulk_index_error():
+    error = helpers.BulkIndexError("message", [{"error": 1}])
+    pickled = pickle.loads(pickle.dumps(error))
+    assert pickled.__class__ == helpers.BulkIndexError
+    assert pickled.errors == error.errors
+    assert pickled.args == error.args
+
+
+def test_serialize_scan_error():
+    error = helpers.ScanError("scroll_id", "shard_message")
+    pickled = pickle.loads(pickle.dumps(error))
+    assert pickled.__class__ == helpers.ScanError
+    assert pickled.scroll_id == error.scroll_id
+    assert pickled.args == error.args