Update-schema: Add support for initial-default #1770


Open
wants to merge 13 commits into main

Fokko
Contributor

@Fokko Fokko commented Mar 6, 2025

Rationale for this change

This allows for V3 initial defaults.

This PR took a bit longer than anticipated, mostly because of the Pydantic JSON deserialization. Python types have to be serialized to the JSON single-value encoding in a specific way.
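To illustrate what "serializing Python types to JSON single-value encoding" can look like, here is a minimal standard-library sketch. It is not the code from this PR: the helper name `to_json_single_value` is hypothetical, and the mapping (ISO-8601 strings for dates and times, plain strings for decimals and UUIDs, hex strings for binary) is an assumption based on the Iceberg spec's JSON single-value serialization rules.

```python
import json
from datetime import date, datetime, time
from decimal import Decimal
from typing import Any
from uuid import UUID


def to_json_single_value(value: Any) -> Any:
    """Map a Python value to a JSON-compatible value (hypothetical helper)."""
    # None, bool, int, float and str already have a direct JSON representation.
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    # Dates, times and timestamps become ISO-8601 strings.
    if isinstance(value, (datetime, date, time)):
        return value.isoformat()
    # Decimals and UUIDs are stored as their string form to avoid precision loss.
    if isinstance(value, (Decimal, UUID)):
        return str(value)
    # Binary values are stored as a hexadecimal string.
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).hex()
    raise TypeError(f"Unsupported default type: {type(value).__name__}")


# Example: encode a date default for a field definition.
encoded = json.dumps({"initial-default": to_json_single_value(date(2017, 11, 16))})
```

The point of the sketch is that `json.dumps` alone cannot handle values such as `date` or `Decimal`, so each one needs an explicit, reversible string form before it is written to table metadata.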

Are these changes tested?

Added new tests

Are there any user-facing changes?

After this PR, initial defaults can be set through the API. This enables users to add required fields.

@Fokko Fokko force-pushed the fd-add-initial-default-to-update-schema branch from ed357e5 to 388580a Compare March 6, 2025 11:26
@sungwy
Collaborator

sungwy commented Mar 15, 2025

@Fokko the PR looks good to me: I think we may just have missed including the new properties in the rename_column method. I agree that we could introduce the ability to update write_default in a different PR

@Fokko Fokko added the changelog Indicates that the PR introduces changes that require an entry in the changelog. label Mar 17, 2025
@Fokko Fokko marked this pull request as draft March 17, 2025 13:31
Fokko added a commit to Fokko/iceberg-python that referenced this pull request Mar 25, 2025
Right now we deserialize the JSON into a dict, which is then passed
into the Pydantic model. It is better to fully delegate this to
pydantic because it is probably faster, and we can detect when
models are created from json or from Python dicts.

Required by apache#1770
Fokko added a commit that referenced this pull request Mar 25, 2025
# Rationale for this change

Right now we deserialize the JSON into a dict, which is then passed into
the Pydantic model. It is better to fully delegate this to pydantic
because it is probably faster, and we can detect when models are created
from json or from Python dicts.

Required by #1770

This is also a recommendation by Pydantic itself:
https://docs.pydantic.dev/latest/concepts/performance/#in-general-use-model_validate_json-not-model_validatejsonloads
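
As a rough illustration of the point about detecting the input source, the sketch below uses Pydantic v2's `ValidationInfo.mode`, which is `"json"` under `model_validate_json` and `"python"` under `model_validate`. The `DateDefault` model and its validator are hypothetical and are not taken from PyIceberg.

```python
import json
from datetime import date
from typing import Any

from pydantic import BaseModel, ValidationInfo, field_validator


class DateDefault(BaseModel):
    """Hypothetical model holding a single default value."""

    initial_default: Any = None

    @field_validator("initial_default", mode="before")
    @classmethod
    def decode_json_value(cls, value: Any, info: ValidationInfo) -> Any:
        # Only decode the string form when the model is built from raw JSON;
        # values coming from Python code are passed through unchanged.
        if info.mode == "json" and isinstance(value, str):
            return date.fromisoformat(value)
        return value


payload = '{"initial_default": "2025-03-06"}'

# Pydantic parses and validates the JSON itself, so info.mode is "json".
from_json = DateDefault.model_validate_json(payload)

# The JSON is parsed first, so Pydantic only sees a dict and info.mode is "python".
from_dict = DateDefault.model_validate(json.loads(payload))
```

With `json.loads` in front, the validator cannot tell an encoded date string apart from an ordinary Python string, which is why delegating the parsing to Pydantic matters here.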

# Are these changes tested?

Existing tests

# Are there any user-facing changes?

No

@Fokko Fokko marked this pull request as ready for review March 26, 2025 14:23
@kevinjqliu kevinjqliu added the V3 label Mar 26, 2025
@Fokko Fokko requested a review from sungwy March 26, 2025 18:40
4 participants