
Pydantic contrib module #757

Draft. Wants to merge 65 commits into base: main.

Commits (65):
All commits below are by dandavison.

- 106f584 Copy of pydantic v1 converter from samples-python (Jan 28, 2025)
- 4d9943b Cleanup (Feb 2, 2025)
- 72c9cd1 update pydantic to v2 (Jan 30, 2025)
- e9e0a84 Drive-by: use zip_longest (Jan 31, 2025)
- 819b7ee Get rid of pydantic sandbox hack (Jan 31, 2025)
- 67bad54 Expand test coverage (Feb 3, 2025)
- 2a263cf Implement type converter and __get_pydantic_core_schema__ (Feb 3, 2025)
- a4393e4 Rename (Feb 3, 2025)
- 9669e8b Clean up (Feb 3, 2025)
- b2ab2c9 Extend test coverage (Feb 3, 2025)
- f48a993 Add test of mixed type inputs (Feb 3, 2025)
- b65781a Use v2 API to_jsonable_python (Feb 3, 2025)
- 6813ffa Use JSONEncoder (Feb 4, 2025)
- 53b453e Cleanup (Feb 4, 2025)
- 6adffc6 README (Feb 3, 2025)
- 938a218 Retain hack for backwards compatiblity (Feb 4, 2025)
- c379acf Hack: unbreak instantiation of pydantic models (Feb 4, 2025)
- 0e61991 Test pydantic usage in workflow (Feb 4, 2025)
- 09d4d9e Don't restrict datetime instances (Feb 4, 2025)
- a7b30a5 Only use pydantic in sandbox if it can be imported (Feb 4, 2025)
- e383353 Expand tests (Feb 4, 2025)
- e3729eb Fix __get_pydantic_core_schema__ implementation (Feb 6, 2025)
- 7a2c785 TEST FAILURE: Refactor tests and use two models (Feb 6, 2025)
- ad26ef9 One model (Feb 6, 2025)
- c626724 Refactor tests (Feb 6, 2025)
- d970b74 test date (Feb 6, 2025)
- 6362836 Don't restrict date instances (Feb 6, 2025)
- c05f346 Test timedelta (Feb 6, 2025)
- 92bc1cb Revert "Test timedelta" (Feb 6, 2025)
- 75aea5d Rename (Feb 6, 2025)
- 480a40b test timedelta (Feb 6, 2025)
- 780a829 test union field (Feb 6, 2025)
- e2bb705 test type hints (Feb 6, 2025)
- bc70bfb Test dunder methods (Feb 7, 2025)
- c29f498 Organize tests (Feb 6, 2025)
- bd7ccca Test pathlib.Path (Feb 7, 2025)
- 2c32442 Fix pathlib.Path usage (Feb 7, 2025)
- e0a937c Delete redundant fix (Feb 7, 2025)
- 73a330c Remove redundant isinstance check (Feb 7, 2025)
- 1e51a19 Deduplicate tests (Feb 7, 2025)
- 686b18d Clean up (Feb 7, 2025)
- 91c098a Expand (Feb 7, 2025)
- 4b10f5a Clean up (Feb 7, 2025)
- efe28f6 Disable field types that don't work (Feb 7, 2025)
- 09e655b Expand (Feb 7, 2025)
- c250f4d Test special instead of standard model (Feb 7, 2025)
- 44cece0 datetime variants (Feb 7, 2025)
- 9201aa1 Add date datetime (Feb 7, 2025)
- 76cb905 Time fields (Feb 7, 2025)
- 024547d Revert "Test special instead of standard model" (Feb 7, 2025)
- ab6eade namedtuple (Feb 7, 2025)
- 757df57 sequence field (Feb 7, 2025)
- 3103fa3 iterable field (Feb 7, 2025)
- 1d5971e TypedDict (Feb 7, 2025)
- 419354f Expand tests (Feb 7, 2025)
- 827b692 Always pass through typing_extensions (Feb 8, 2025)
- 33a0016 Make activities generic (Feb 8, 2025)
- 8717425 Revert "Make activities generic" (Feb 8, 2025)
- ac04b46 Reduce tests (Feb 8, 2025)
- a58c945 Expand tests (Feb 8, 2025)
- e67ac04 Test union fields (Feb 8, 2025)
- 77d01eb Complex custom type (Feb 8, 2025)
- d51b0dc Test complex union (Feb 8, 2025)
- ede73f6 Rename (Feb 9, 2025)
- 5e7310c Use non-list input (Feb 10, 2025)
README.md (23 changes: 8 additions & 15 deletions)

@@ -84,7 +84,6 @@ informal introduction to the features and their implementation.
- [Extending Restricted Classes](#extending-restricted-classes)
- [Certain Standard Library Calls on Restricted Objects](#certain-standard-library-calls-on-restricted-objects)
- [is_subclass of ABC-based Restricted Classes](#is_subclass-of-abc-based-restricted-classes)
- [Compiled Pydantic Sometimes Using Wrong Types](#compiled-pydantic-sometimes-using-wrong-types)
- [Activities](#activities)
- [Definition](#definition-1)
- [Types of Activities](#types-of-activities)
@@ -312,11 +311,6 @@ The default data converter supports converting multiple types including:
* Anything that [`json.dump`](https://docs.python.org/3/library/json.html#json.dump) supports natively
* [dataclasses](https://docs.python.org/3/library/dataclasses.html)
* Iterables including ones JSON dump may not support by default, e.g. `set`
* Any class with a `dict()` method and a static `parse_obj()` method, e.g.
[Pydantic models](https://pydantic-docs.helpmanual.io/usage/models)
* The default data converter is deprecated for Pydantic models and will warn if used since not all fields work.
See [this sample](https://github.com/temporalio/samples-python/tree/main/pydantic_converter) for the recommended
approach.
* [IntEnum, StrEnum](https://docs.python.org/3/library/enum.html) based enumerates
* [UUID](https://docs.python.org/3/library/uuid.html)

@@ -325,6 +319,14 @@ This notably doesn't include any `date`, `time`, or `datetime` objects as they m
Users are strongly encouraged to use a single `dataclass` for parameter and return types so fields with defaults can be
easily added without breaking compatibility.

To use pydantic model instances (or python objects containing pydantic model instances), use:

> Review comment (Member): I think we should make this a whole subsection, e.g. ##### Pydantic

```python
from temporalio.contrib.pydantic import pydantic_data_converter

client = Client(data_converter=pydantic_data_converter, ...)
```
Do not use pydantic's [strict mode](https://docs.pydantic.dev/latest/concepts/strict_mode/).

Classes with generics may not have the generics properly resolved. The current implementation does not have generic
type resolution. Users should use concrete types.

@@ -1133,15 +1135,6 @@ Due to [https://bugs.python.org/issue44847](https://bugs.python.org/issue44847),
checked to see if they are subclasses of another via `is_subclass` may fail (see also
[this wrapt issue](https://github.com/GrahamDumpleton/wrapt/issues/130)).

###### Compiled Pydantic Sometimes Using Wrong Types

If the Pydantic dependency is in compiled form (the default) and you are using a Pydantic model inside a workflow
sandbox that uses a `datetime` type, it will grab the wrong validator and use `date` instead. This is because our
patched form of `issubclass` is bypassed by compiled Pydantic.

To work around, either don't use `datetime`-based Pydantic model fields in workflows, or mark `datetime` library as
passthrough (means you lose protection against calling the non-deterministic `now()`), or use non-compiled Pydantic
dependency.

### Activities

poetry.lock (294 changes: 241 additions & 53 deletions). Large diffs are not rendered by default.

pyproject.toml (3 changes: 2 additions & 1 deletion)

@@ -35,6 +35,7 @@ grpcio = {version = "^1.48.2", optional = true}
opentelemetry-api = { version = "^1.11.1", optional = true }
opentelemetry-sdk = { version = "^1.11.1", optional = true }
protobuf = ">=3.20"
pydantic = { version = "^2.10.6", optional = true }
> Review comment (Member): Can you confirm that pydantic v1 users aren't bound by our version constraint here if they never add the extra? Maybe the easiest way to confirm is somehow have a sample or test that confirms existing pydantic support?

python = "^3.9"
python-dateutil = { version = "^2.8.2", python = "<3.11" }
types-protobuf = ">=3.20"
@@ -46,7 +47,6 @@ grpcio-tools = "^1.48.2"
mypy = "^1.0.0"
mypy-protobuf = "^3.3.0"
psutil = "^5.9.3"
pydantic = "^1.10.19"
pydocstyle = "^6.3.0"
pydoctor = "^24.11.1"
pyright = ">=1.1.377"
@@ -63,6 +63,7 @@ wheel = "^0.42.0"
[tool.poetry.extras]
opentelemetry = ["opentelemetry-api", "opentelemetry-sdk"]
grpc = ["grpcio"]
pydantic = ["pydantic"]

[tool.poetry.group.dev.dependencies]
ruff = "^0.5.0"
temporalio/contrib/pydantic.py (new file, 108 additions)

@@ -0,0 +1,108 @@
"""A data converter for Pydantic models

To use, pass ``pydantic_data_converter`` as the ``data_converter`` argument to
:py:class:`temporalio.client.Client`:

.. code-block:: python

client = Client(
data_converter=pydantic_data_converter,
...
)
"""

import inspect
from typing import (
Any,
Type,
)

import pydantic

try:
from pydantic_core import to_jsonable_python
except ImportError:
# pydantic v1
from pydantic.json import pydantic_encoder as to_jsonable_python # type: ignore
> Review comment on lines +25 to +26 (Member): Are we saying this contrib library works even with pydantic v1? If so, we should change our version constraint to be >= whatever the absolute lowest we support is. If not, we should probably error if they try to import this.


from temporalio.converter import (
AdvancedJSONEncoder,
CompositePayloadConverter,
DataConverter,
DefaultPayloadConverter,
JSONPlainPayloadConverter,
JSONTypeConverter,
)

# Note that in addition to the implementation in this module, _RestrictedProxy
# implements __get_pydantic_core_schema__ so that pydantic unwraps proxied types.


class PydanticModelTypeConverter(JSONTypeConverter):
"""Type converter for pydantic model instances."""

def to_typed_value(self, hint: Type, value: Any) -> Any:
"""Convert dict value to pydantic model instance of the specified type"""
if not inspect.isclass(hint) or not issubclass(hint, pydantic.BaseModel):
return JSONTypeConverter.Unhandled
model = hint
if not isinstance(value, dict):
raise TypeError(
f"Cannot convert to {model}, value is {type(value)} not dict"
)
if hasattr(model, "model_validate"):
return model.model_validate(value)
> Review comment (Member): I assume this raises a Pydantic-specific exception that a user could add to workflow failure exception types if they wanted? (if not, we should consider a usability improvement to have an option to wrap this in application error)

elif hasattr(model, "parse_obj"):
# pydantic v1
return model.parse_obj(value)
else:
raise ValueError(
f"{model} is a Pydantic model but does not have a `model_validate` or `parse_obj` method"
)
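For readers unfamiliar with the `JSONTypeConverter` hook: dispatch is driven entirely by the type hint, and a converter that does not recognize the hint returns an `Unhandled` sentinel so default handling can proceed. A stdlib-only sketch of the same pattern, using a dataclass in place of a pydantic model (all names below are illustrative, not SDK API):

```python
import dataclasses
import inspect
from typing import Any, Type


class _Unhandled:
    """Sentinel meaning 'this converter does not handle this type hint'."""


UNHANDLED = _Unhandled()


def to_typed_value(hint: Type, value: Any) -> Any:
    # Only claim hints that are dataclass types, mirroring the
    # issubclass(hint, pydantic.BaseModel) check in the converter above.
    if not inspect.isclass(hint) or not dataclasses.is_dataclass(hint):
        return UNHANDLED
    if not isinstance(value, dict):
        raise TypeError(f"Cannot convert to {hint}, value is {type(value)} not dict")
    return hint(**value)


@dataclasses.dataclass
class Point:
    x: int
    y: int


assert to_typed_value(Point, {"x": 1, "y": 2}) == Point(1, 2)
assert to_typed_value(int, 3) is UNHANDLED
```

The sentinel (rather than `None`) matters because `None` can be a legitimate converted value.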


class PydanticJSONEncoder(AdvancedJSONEncoder):
"""JSON encoder for python objects containing pydantic model instances."""

def default(self, o: Any) -> Any:
"""Convert object to jsonable python.

See :py:meth:`json.JSONEncoder.default`.
"""
if isinstance(o, pydantic.BaseModel):
return to_jsonable_python(o)
return super().default(o)
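The encoder uses the standard `json.JSONEncoder` extension point: `default` is called only for objects the base encoder cannot serialize, so known types still take the fast path and truly unknown types still raise `TypeError`. A stdlib sketch of the same pattern, with a hypothetical `Money` type standing in for `pydantic.BaseModel`:

```python
import json
from typing import Any


class Money:
    # Hypothetical non-JSON-serializable type standing in for a pydantic model.
    def __init__(self, amount: int, currency: str) -> None:
        self.amount = amount
        self.currency = currency

    def to_jsonable(self) -> dict:
        return {"amount": self.amount, "currency": self.currency}


class MoneyJSONEncoder(json.JSONEncoder):
    def default(self, o: Any) -> Any:
        # Mirrors the isinstance(o, pydantic.BaseModel) branch above.
        if isinstance(o, Money):
            return o.to_jsonable()
        # Falls back to the parent, which raises TypeError for unknown types.
        return super().default(o)


encoded = json.dumps({"price": Money(5, "USD")}, cls=MoneyJSONEncoder)
assert encoded == '{"price": {"amount": 5, "currency": "USD"}}'
```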


class PydanticPayloadConverter(CompositePayloadConverter):
"""Payload converter for payloads containing pydantic model instances.

JSON conversion is replaced with a converter that uses
:py:class:`PydanticJSONEncoder` to convert the python object to JSON, and
:py:class:`PydanticModelTypeConverter` to convert raw python values to
pydantic model instances.
"""

def __init__(self) -> None:
"""Initialize object"""
json_payload_converter = JSONPlainPayloadConverter(
encoder=PydanticJSONEncoder,
custom_type_converters=[PydanticModelTypeConverter()],
)
> Review comment (dandavison, Contributor Author, Feb 4, 2025): Our pydantic sample was implementing to_payload. This is slightly incorrect because it means that our AdvancedJSONEncoder was not being used. In practice, pydantic provides a function (called to_jsonable_python in v2) with similar functionality to our AdvancedJSONEncoder, but nevertheless, we should ensure that we fall back to AdvancedJSONEncoder in case we add to it in the future. That's what we do here, by using JSONPlainPayloadConverter directly.

super().__init__(
*(
c
if not isinstance(c, JSONPlainPayloadConverter)
else json_payload_converter
for c in DefaultPayloadConverter.default_encoding_payload_converters
)
)
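The `super().__init__` call above swaps exactly one element of the default converter tuple, selected by type, while leaving the others and their ordering untouched. The substitution idiom in isolation, with plain stand-in classes (not the real converter types):

```python
class BinaryNull: ...          # stand-in for an untouched default converter
class JSONPlain: ...           # stand-in for JSONPlainPayloadConverter
class PydanticJSON(JSONPlain): ...  # stand-in for the pydantic-aware replacement

defaults = (BinaryNull(), JSONPlain())
replacement = PydanticJSON()

# Replace any JSONPlain (matched by isinstance) with the replacement,
# preserving ordering, as the __init__ above does.
converters = tuple(
    c if not isinstance(c, JSONPlain) else replacement
    for c in defaults
)

assert isinstance(converters[0], BinaryNull)
assert converters[1] is replacement
```

Matching by `isinstance` rather than position means the substitution keeps working even if the default tuple gains or reorders converters.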


pydantic_data_converter = DataConverter(
payload_converter_class=PydanticPayloadConverter
)
"""Data converter for payloads containing pydantic model instances.

To use, pass as the ``data_converter`` argument of :py:class:`temporalio.client.Client`
"""
temporalio/converter.py (24 changes: 11 additions & 13 deletions)

@@ -10,11 +10,11 @@
import sys
import traceback
import uuid
import warnings
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from enum import IntEnum
from itertools import zip_longest
from typing import (
Any,
Awaitable,
@@ -291,10 +291,8 @@ def from_payloads(
RuntimeError: Error during decode
"""
values = []
-for index, payload in enumerate(payloads):
-    type_hint = None
-    if type_hints and len(type_hints) > index:
-        type_hint = type_hints[index]
+type_hints = type_hints or []
+for index, (payload, type_hint) in enumerate(zip_longest(payloads, type_hints)):
# Raw value should just wrap
if type_hint == temporalio.common.RawValue:
values.append(temporalio.common.RawValue(payload))
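The refactored loop relies on `zip_longest` padding the shorter sequence with `None`, which preserves the old behavior: missing type hints become `None` rather than truncating the payload list. A quick stdlib illustration:

```python
from itertools import zip_longest

payloads = ["p1", "p2", "p3"]
type_hints = [int]  # fewer hints than payloads

pairs = list(zip_longest(payloads, type_hints))
# Extra payloads are paired with None, exactly like the old
# `type_hint = None` default in the pre-refactor loop.
assert pairs == [("p1", int), ("p2", None), ("p3", None)]
```

Note that `zip_longest` would raise on a `None` argument, which is why the diff normalizes with `type_hints = type_hints or []` first.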
@@ -558,12 +556,7 @@ def encoding(self) -> str:

def to_payload(self, value: Any) -> Optional[temporalio.api.common.v1.Payload]:
"""See base class."""
-# Check for pydantic then send warning
-if hasattr(value, "parse_obj"):
-    warnings.warn(
-        "If you're using pydantic model, refer to https://github.com/temporalio/samples-python/tree/main/pydantic_converter for better support"
-    )
-# We let JSON conversion errors be thrown to caller
> Review comment on lines -561 to -566 (Member): Hrmm, I wonder if there is any way to warn users that are trying to encode Pydantic models without using our suggested converter

+# Let JSON conversion errors be thrown to caller
return temporalio.api.common.v1.Payload(
metadata={"encoding": self._encoding.encode()},
data=json.dumps(
@@ -1523,8 +1516,13 @@ def value_to_type(
# TODO(cretz): Want way to convert snake case to camel case?
return hint(**field_values)

-# If there is a @staticmethod or @classmethod parse_obj, we will use it.
-# This covers Pydantic models.
+# Pydantic model instance
+# Pydantic users should use
+# temporalio.contrib.pydantic.pydantic_data_converter, in which case a
+# pydantic model instance will have been handled by the custom_converters at
+# the start of this function. We retain the following for backwards
+# compatibility with pydantic users who are not using contrib.pydantic, but
+# this is deprecated.
> Review comment on lines +1519 to +1525 (Member): Hrmm, wonder if we should warn if they are trying to decode to a Pydantic model without using our suggested converter

parse_obj_attr = inspect.getattr_static(hint, "parse_obj", None)
if isinstance(parse_obj_attr, classmethod) or isinstance(
parse_obj_attr, staticmethod
temporalio/worker/workflow_sandbox/_restrictions.py (39 changes: 36 additions & 3 deletions)

@@ -7,6 +7,7 @@
from __future__ import annotations

import dataclasses
import datetime
import functools
import inspect
import logging
@@ -31,6 +32,14 @@
cast,
)

try:
import pydantic
import pydantic_core

HAVE_PYDANTIC = True
except ImportError:
HAVE_PYDANTIC = False

import temporalio.workflow

logger = logging.getLogger(__name__)
@@ -435,9 +444,10 @@ def with_child_unrestricted(self, *child_path: str) -> SandboxMatcher:
# Due to a metaclass conflict in sandbox, we need zipfile module to pass
# through always
"zipfile",
-# This is a very general module needed by many things including pytest's
+# Very general modules needed by many things including pytest's
# assertion rewriter
"typing",
"typing_extensions",
# Required due to https://github.com/protocolbuffers/protobuf/issues/10143
# for older versions. This unfortunately means that on those versions,
# everyone using Python protos has to pass their module through.
@@ -943,7 +953,17 @@ def r_op(obj: Any, other: Any) -> Any:

def _is_restrictable(v: Any) -> bool:
return v is not None and not isinstance(
-    v, (bool, int, float, complex, str, bytes, bytearray)
+    v,
+    (
+        bool,
+        int,
+        float,
+        complex,
+        str,
+        bytes,
+        bytearray,
+        datetime.date,  # from which datetime.datetime inherits
+    ),
)
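The comment in the tuple leans on an inheritance fact worth spelling out: `datetime.datetime` subclasses `datetime.date`, so listing `datetime.date` alone exempts both from restriction, while `datetime.time` and `datetime.timedelta` are separate classes and are not covered by this entry:

```python
import datetime

# datetime inherits from date, so one isinstance check covers both.
assert issubclass(datetime.datetime, datetime.date)
assert isinstance(datetime.datetime(2025, 2, 7), datetime.date)
assert isinstance(datetime.date(2025, 2, 7), datetime.date)

# time and timedelta are NOT date subclasses; they fall outside this
# exemption and remain restrictable.
assert not issubclass(datetime.time, datetime.date)
assert not issubclass(datetime.timedelta, datetime.date)
```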


@@ -971,6 +991,8 @@ def __init__(self, *args, **kwargs) -> None:
_trace("__init__ unrecognized with args %s", args)

def __getattribute__(self, __name: str) -> Any:
if HAVE_PYDANTIC and __name == "__get_pydantic_core_schema__":
return object.__getattribute__(self, "__get_pydantic_core_schema__")
> Review comment on lines +994 to +995 (cretz, Member, Feb 4, 2025): Hrmm, trying to think if we even want this generic to void pydantic code inside our sandbox. I guess all ways to do it would require that the contrib library is at least imported by the user somewhere in the process, but maybe that is ok? If that is ok, I wonder if a side-effect of importing the contrib is ideal over altering the sandbox even if pydantic is even available? Though I admit I kinda like the idea of this patch if pydantic is even available regardless of contrib lib usage, so maybe this should remain as is.

state = _RestrictionState.from_proxy(self)
_trace("__getattribute__ %s on %s", __name, state.name)
# We do not restrict __spec__ or __name__
@@ -1020,6 +1042,17 @@ def __getitem__(self, key: Any) -> Any:
)
return ret

if HAVE_PYDANTIC:
# Instruct pydantic to use the proxied type when determining the schema
# https://docs.pydantic.dev/latest/concepts/types/#customizing-validation-with-__get_pydantic_core_schema__
@classmethod
def __get_pydantic_core_schema__(
cls,
source_type: Any,
handler: pydantic.GetCoreSchemaHandler,
) -> pydantic_core.CoreSchema:
return handler(RestrictionContext.unwrap_if_proxied(source_type))

__doc__ = _RestrictedProxyLookup( # type: ignore
class_value=__doc__, fallback_func=lambda self: type(self).__doc__, is_attr=True
)
@@ -1032,7 +1065,7 @@
)
__str__ = _RestrictedProxyLookup(str) # type: ignore
__bytes__ = _RestrictedProxyLookup(bytes)
__format__ = _RestrictedProxyLookup() # type: ignore
__format__ = _RestrictedProxyLookup(format) # type: ignore
__lt__ = _RestrictedProxyLookup(operator.lt)
__le__ = _RestrictedProxyLookup(operator.le)
__eq__ = _RestrictedProxyLookup(operator.eq) # type: ignore