-
-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should serialization check type by default? #234
Comments
Hi @wyfo, Data classes direct serialization is done by orjson like a charm. Type checking is important at some earlier stages.
Please check how just Pydantic's type-checking impacts performance in real-like scenarios. In this benchmark, we use data classes decorated by Pydantic. If the validation decorator is changed to built-ins we have serialization speed equal to primitives (raw benchmarks). https://github.com/mtag-dev/py-rest-stress-testing/tree/squall_integration |
Actually, not really, because orjson only works with raw dataclasses and doesn't cover apischema features like aliasing (for example camelCase), (conditional) field skipping, serialized methods, field ordering, etc. There are some issues opened in orjson repository, for example ijl/orjson#45 about aliasing, but I don't know if they will ever be implemented (and I personally think they should not, it's kind of a slippery slope). For simple dataclasses however, orjson is indeed fine. By the way, apischema has a passthrough feature that allow arbitrary types like dataclasses to be handled directly by the JSON serializer. Moreover, in the original design of this feature, there was an option for dataclass that allows to automatically pass through dataclasses that use no apischema specific feature. This option has been removed before the release to simplify the code in anticipation of the cythonisation. That option should come back in the next release, even if the performance of the new compiled code makes it less useful: apischema (without dataclass passthrough) + orjson is almost as fast than orjson only for simple dataclasses.
For my part, I would more rely on static type checking, but I admit it's quite limited on Python.
Actually, pydantic and apischema are no longer really comparable in terms of performance since the cythonization (coming with the next release). apischema is indeed about 20x faster than pydantic in serialization (using its own benchmark), without type checking; with type checking activated, it still more than 17x. In the end, I didn't understand your opinion on this issue. Do you think type checking should be the default? |
I did not think about it when I was writing the issue, but the best solution could be to have simply |
@wyfo all depends on the stage.
Switching off in production is not an option. Because some wrong values can appear from a data source for many reasons. In this case, issue presence will be opaque and investigation will take much longer. This only works for configs/widgets. Are these improvements already in the master? |
I'll try to integrate apischema in squall ASAP and will provide you feedback based on benchmarks. |
I think you misunderstood this issue. There is no question about deactivating type checking at deserialization. The issue only concerns serialization — that's why the settings key is
Yes, they are, however, installing from Github with |
With Cython, performance penality caused by serialization type checking is less an issue. apischema is already fast enough to afford this feature as default behavior, as it will stay a lot faster than competitors.
With serialization type checking by default, disabling it could then become an option to improve performance.
The text was updated successfully, but these errors were encountered: