Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sum encoding TaggedFlatObject #828

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

poscat0x04
Copy link

@poscat0x04 poscat0x04 commented Jan 18, 2021

Closes #827

This PR adds a SumEncoding variant TaggedFlatObject :: String -> SumEncoding for encoding and decoding JSON objects whose tags are on the same level as the contents (hence the name "flat"). The decoding/encoding rule of TaggedFlatObject is as follows:

  1. Encode single constructor data types without field names as an array (This is the same as TaggedObject)
  2. Encode single constructor data types with field names as a plain record (This is the same as TaggedObject)
  3. Encode multi constructor data types without fields names as a record whose keys are the positions of the fields(starting from 1) and values are the fields encoded as JSON with an additional KV pair "$tagFieldName": "$constructorName".
    For example A 1 2 will be encoded as {"tag": "A", "1": 1, "2": "2" }.
    Note that the tag will overwrite any fields that have the same name.
  4. Encode multi constructor data types with field names as a record whose keys are the field names and values are the fields encoded as JSON with an additional KV pair "$tagFieldName" : "$constructorName"
    For example A { field1 = 1, field2 = "2" } will be encoded as {"tag": "A", "field1": 1, "field2": "2" }.
    Note that the tag will overwrite any fields that have the same name.

@poscat0x04 poscat0x04 marked this pull request as ready for review January 19, 2021 07:10
@poscat0x04 poscat0x04 force-pushed the add-sum-encoding branch 2 times, most recently from a24c404 to c836475 Compare January 22, 2021 10:38
@poscat0x04
Copy link
Author

poscat0x04 commented Jan 22, 2021

It should now work with GHC 7.* (not sure why my force push didn't trigger CI build)

@poscat0x04
Copy link
Author

For some reason, the INCOHERENT pragma did not work for ghc-7.8.4.

@poscat0x04
Copy link
Author

@phadej Is this ok now?

Data/Aeson/TH.hs Outdated Show resolved Hide resolved
Copy link
Collaborator

@bergmark bergmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encode single constructor data types with field names as a plain record (This is the same as TaggedObject)

"single constructor data types with field names" = record?
No $tagName here, or what does "plain record" mean? If there's precedent for this from TaggedObject then perhaps we should keep this, but why does TaggedObject work like that? :-S

Encode multi constructor data types without fields names as a record whose keys are the positions of the fields(starting from 1) and values are the fields encoded as JSON with an additional KV pair "$tagFieldName": "$constructorName". For example A 1 2 will be encoded as {"tag": "A", "1": 1, "2": "2" }.

I think starting at 0 would make more sense as it matches the indices when we use arrays. I would also make the indices ints so that this is closer to an array. Or is there precedent for this elsewhere that I missed?

Note that the tag will overwrite any fields that have the same name.

Can we give an error if this occurs instead, at least in TH? Or will it be more confusing if TH/Generics differ here?

I think there should be some regression tests on all the functionality. We have some in NullaryConstructors.hs now but It's hard to tell how this will behave with other shapes of data types.

@poscat0x04
Copy link
Author

No $tagName here, or what does "plain record" mean?

Plain record just means a record without any additional field.

I think starting at 0 would make more sense as it matches the indices when we use arrays.

Yeah, that makes sense. TBH the current choice of encoding constructors without field names is a bit arbitrary. I initially did not plan to support them but then figured out it would affect other SumEncoding variants because we currently cannot distinguish different SumEncodings at the type level (relevant code).

I would also make the indices ints so that this is closer to an array.

I'm not sure if I understand. JSON keys must be strings, right?

Can we give an error if this occurs instead, at least in TH? Or will it be more confusing if TH/Generics differ here?

Yeah, I think they should have the same behavior.

@bergmark
Copy link
Collaborator

I'm not sure if I understand. JSON keys must be strings, right?

Guess who has been doing too much JavaScript lately runs away, please disregard

@bergmark bergmark self-requested a review February 26, 2021 13:06
Copy link
Collaborator

@bergmark bergmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :-)

I actually need this now for a rust API I'm testing!! Should be usable with #[serde(tag = ...)]

@bergmark
Copy link
Collaborator

I managed to hit some github hot key that posted my review for me... @phadej do you want to look over this again?

@bergmark
Copy link
Collaborator

Eh it looks like the tests don't compile, but only on GHC 7.8 :-/

@phadej
Copy link
Collaborator

phadej commented Feb 26, 2021

The issue test-failure issue might be something which will get fixed with #839. I think I can wipe out all use over overlapping instances for good. Just need to get to it. (And get courage to commit a breaking change).

@bergmark
Copy link
Collaborator

You can do it!

@tadfisher
Copy link

What's the status of this PR? It would be a useful feature for our codebase, FWIW.

@phadej
Copy link
Collaborator

phadej commented Nov 18, 2021

My current opinion is to throw Options into /dev/null and rather provide a bunch of newtypes to be used with deriving via, for different encoding schemes.

Current generics and template haskell code is very much unmaintainable (as it tries to do evertything), so this PR is probably not going anywhere anytime soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a SumEncoding option that's somewhere between a TaggedObject and an UntaggedValue
4 participants