Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

differentiate between default=nil and an unset default #23

Closed
wants to merge 3 commits into from

Conversation

oxg558
Copy link
Contributor

@oxg558 oxg558 commented Feb 7, 2025

The JSON-encode protocol for the schema objects removes fields with values of nil, but this means that it is impossible to make a schema with default=nil, which is necessary for schema evolution. This PR edits the protocol to include default=nil only when it was explicitly written in the source .exs defintion files.


Also stops generated from_avro_map functions from requiring optional fields to be present in the avro map by pattern matching in the header.

e.g. for a record with an optional list, record_list, previously we generated:

  @impl true
  def from_avro_map(
        %{
          "record_list" => record_list
        } = _avro_map
      ) do
    {:ok,
     %__MODULE__{
       record_list: decode_record_list(record_list)
     }}
  end

  @required_keys MapSet.new(["record_list"])
  def from_avro_map(%{} = invalid) do
    actual = Map.keys(invalid) |> MapSet.new()
    missing = MapSet.difference(@required_keys, actual) |> Enum.join(", ")
    {:error, "Missing keys: " <> missing}
  end

  def from_avro_map(_) do
    {:error, "Expected a map."}
  end

which enforces the presence of the record_list key, but now we do:

  @impl true
  def from_avro_map(%{} = avro_map) do
    record_list = avro_map["record_list"]

    {:ok,
     %__MODULE__{
       record_list: decode_record_list(record_list)
     }}
  end

  def from_avro_map(_) do
    {:error, "Expected a map."}
  end

@@ -57,6 +57,11 @@ defmodule Avrogen.Avro.Types.Record.Field do
def comment(%__MODULE__{doc: nil, name: name}), do: "#{name}: #{name}"
def comment(%__MODULE__{doc: doc, name: name}), do: "#{name}: #{doc}"

# The only way to differentiate between "default is nil because there isn't one" and "default is actually null" is to
# check if the type is also a union with null.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this way if we have a union with a null and we don't specify any default value, then we'll assume the default value is null, which I'm not sure is correct, as it's technically a different behavior (even though it's probably the desired one most of the times) :/

Also, AFAIU we also support setting them as "null", even though that doesn't sound correct to me either, given it will prevent us from setting the string "null" as a default value 🙃
https://github.com/primait/avrogen/blob/master/lib/avrogen/avro/types/record/field.ex#L108

Personally I think a more correct way to represent them would be a non-nil atom like :null, but I may very well be missing something here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I'm assuming that if we have a union of null and something else the default should always be null rather than the other type in the union, while the Avro docs say:

Default values for union fields correspond to the first schema in the union

So perhaps here instead of doing has_member? I should check if null is the first in the list?

re the :null point, that's what I thought too, but I'm not sure if it's worth the extra effort to do that refactor. I think someone wanting the string "null" as the default is pretty unlikely, though plausible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think a more correct way to represent them would be a non-nil atom like :null, but I may very well be missing something here

I did this in the end :D

@oxg558 oxg558 force-pushed the oli-optional-fields branch from 4f381af to c80b083 Compare February 7, 2025 16:26
@oxg558 oxg558 force-pushed the oli-optional-fields branch from d4d0759 to 2d1f41d Compare February 25, 2025 11:25
@oxg558
Copy link
Contributor Author

oxg558 commented Mar 12, 2025

This is a nice-to-have but since we only ever use the encode/decode_schemaless functions, we can't do regular schema evolution anyway, so it's probably not worth the risk of breaking stuff by changing existing schemas.

@oxg558 oxg558 closed this Mar 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants