-
-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce @flatten annotation #642
Conversation
# Conflicts: # upickle/test/src/upickle/MacroTests.scala # upickleReadme/Readme.scalatex
upickle/implicits/src-2/upickle/implicits/internal/Macros.scala
Outdated
Show resolved
Hide resolved
@nox213 this looks great. Could you update the PR description with a summary of the implemented semantics, with examples, as well as a summary of the implementation strategy? That would make the PR much easier to review and merge |
Sure, I’ll tag you when I’m ready. |
upickle/implicits/src-2/upickle/implicits/internal/Macros.scala
Outdated
Show resolved
Hide resolved
@lihaoyi I polished the code a little and updated the PR. |
@nox213 at I high level, would it be possible to implement this without the |
I guess more generally I'd like to see whether we can land this without breaking binary compatibility. Breaking bincompat and releasing Mill 5.0.0 is an option, but it would be better if we didn't need to do that |
I think this could work:
|
For the Scala 2 vs Scala 3 approach, could we use the Scala 3 approach in Scala 2 as well? It's best to keep the two implementations mostly consistent, so future changes can be done more easily across both versions |
Can this be fixed by making it |
How hard would it be to generalize this to other collection types and key types?
|
Left some high level comments. It's a pretty invasive change so I expect it will take some time to discuss the changes, first at a high-level and then down to the lines of code being changed, just to set expectations. But overall the approach looks solid |
@lihaoyi Thanks for your review.
I'm not sure whether it's possible. I will investigate more on it.
I believe this is doable. For other collection types, I could store the type of collection that is flatten and then use that type to construct it from Reader. For keys, I didn't do it because I thought that's not the use cases. I will try this.
Sure, I will use the Scala 3 approach in Scala 2 (no |
The challenging part seems to be bincompat. Would the best approach be to implement @flatten without changing the existing function signatures then? And if changing the function signatures is unavoidable, the next best option would be to keep the existing functions (marking them as deprecated) and implement separate macros (e.g., a new Macro class in Scala 2). In this case, would it make sense to introduce a conditional branch in specific functions to decide whether to call the deprecated macro or the new one? For example, determining this based on the presence of the @flatten annotation. Did I understand this correctly? One thing I’m curious about: as I understand it, macro code is only called internally, so I assumed it would be safe in terms of bincompat. However, can even functions that are not called externally impact bincompat? |
For bincompat, we could leave the existing |
I think what you would need to do is take the type of the collection, resolve the For keys, I think you should be able to do the same thing: take the type, resolve the |
Thanks for your explanation and tips. To summarize the direction I need to take for the changes:
Is this the right direction to proceed? Lastly, for Macros2.scala, would it be better to apply private or [package] private as much as possible if those restrictions can be applied? |
@nox213 yes that's right. Making the new macros package private would indeed help, though some of the supporting runtime APIs may need to remain public so we won't be able to privatize everything |
Okay, I will work on it. |
d79bf49
to
f33336d
Compare
wip make scala2 works Delete some files Add more tests more tests scala3 writer fix polish clean up Add helper function scala3 macro readers fix bug test pass polish polish wip fix remove unused Use ListBuffer to preserve orders polish Keep existing macros for bincompat Keep formatting
fc3ba46
to
1ecc8df
Compare
What I tried was that val writeRef = Ref(Symbol.requiredMethod("upickle.default.write"))
(keyTpe0.asType, valueTpe0.asType, keyWriterTpe0.asType, valueWriterTpe0.asType) match {
case ('[keyTpe], '[valueTpe], '[keyWriterTpe], '[valueWriterTpe]) =>
val typeAppliedWrite = TypeApply(writeRef, List(TypeTree.of[keyTpe]))
val snippet =
if (keyTpe0 =:= TypeRepr.of[String])
... // be ommitted
else
'{
${select.asExprOf[Iterable[(keyTpe, valueTpe)]]}.foreach { (k, v) =>
${self}.writeSnippetMappedName[R, valueTpe](
${ctx},
${Apply(
Apply(
typeAppliedWrite,
List('{k}.asTerm, '{-1}.asTerm, '{false}.asTerm, '{false}.asTerm)
),
List(
'{summonInline[keyWriterTpe]}.asTerm
)
)
.asExprOf[String]},
summonInline[valueWriterTpe],
v,
)
}
} I used
Those two types are treated as different type because their path is different. I got stuck at this point. |
I misunderstood your comment. I tested with your fix but it still doesn't work. object FlattenTestWithType {
implicit def rw[T: RW]: RW[FlattenTestWithType[T]] = upickle.default.macroRW // compile error
} I’ve pushed the changes to support collections other than Map (iterable flattening is now possible) so far. a07839d |
I just saw your ping now. Let me know if there's still something I can help with |
Would you consider merging this PR without the feature for now? The PR is already quite large, and if we come up with a good solution later, we could address it in a separate follow-up. |
Sure! Would be happy to merge it first Give me some time to properly review it. Still probably need @jodersky 's help to look through some of the scala 3 macro stuff, and I'll give it a pass as well |
I read through and think it looks reasonable. I don't have any confidence I'll be able to catch subtle bugs in the implementation, but we have the existing test suite for that so hopefully this doesn't break anything One main request: can we have some tests for the edge cases and failure modes of
case class Foo(x: Int)
case class Bar(x: String, @flatten foo: Foo)
case class Foo(x: Int)
case class Bar(x: String)
case class Qux(@flatten foo: Foo, @flatten bar: Bar)
case class Foo(@flatten x: Int)
case class Foo(@flatten x: Seq[(Seq[Int], Int)]) I don't really have any opinion on the exact semantics in these situations, but regardless of what we choose we should have unit tests asserting their success or their failure and error message (these can go in |
It fails with following errors. Before process the flatten annotation, macro checks whether there are duplicated
Yes, this is possible, but my implementation only supports flattening multiple case classes and a single collection at a time.
This also fails with an same error above. I added a few more tests to cover the parts that were not addressed by the tests I had previously created (2eb1ab8). |
@lihaoyi While adding some tests, I thought of a case where a collection could lead to runtime collisions. For example: case class Foo(i: Int, @flatten m: Map[String, String]) If m = Map("i" -> "value"), there would be a key collision, but this can only be detected at runtime. Referring to the behavior of the Rust library linked in the issue, here’s how it handles such cases: {"i": 1, "i": "value"} And when reading it back, it fails with an error like:
On the other hand, if the map field appears first, like this: case class Foo(@flatten m: Map[String, String], i: Int) The generated JSON would look like this: {"i": "value", "i": 1} In this case, when trying to read it back, it would result in a runtime error while attempting to convert "value" to an integer. In upickle, it behaves almost the same except for the first case.
my implementation produces
|
I think the current behavior for the case you mentioned is reasonable, but would it be possible to make it raise an error on write? That would probably be more intuitive to users than generating JSON a dictionary with duplicate keys |
I made it raise an error on write and added a failure test case as well (e4071f6)
|
I have limited experience in Scala 3 macros, so probably not a great reviewer of this issue, however posing some general questions:
|
It's a temporary name that will be changed to
It's a class that existed before which is replaced by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Scala 3 macro code looks reasonable. I also appreciate that you cleaned up a bunch of stuff and restricted visibilities of internal methods! I do have some comments on them however, mostly minor issues and questions to help my own understanding.
I also second Haoyi, in that I can't guarantee that there aren't any subtle bugs, but we can use the test suite for that. Speaking of, do we have some kind of performance tests we can use to weed out regressions? From what I see in the code, most flatten-specific macro parts are behind a flag (if the annotation is present), so I don't think there will be any regressions, but it's always good to test this stuff.
Regarding supporting of non-string keys: I think it's probably best to tackle that in a separate PR later on too; this one is already quite large.
To your specific question, @nox213
'{
${select.asExprOf[Iterable[(keyTpe, valueTpe)]]}.foreach { (k, v) =>
${self}.writeSnippetMappedName[R, valueTpe](
${ctx},
upickle.default.write(key)(summonInline[keyWriterTpe]), // this doesn't compile
summonInline[writerTpe],
v,
)
}
}
one thing you could try is take the outer "API trait" as a parameter (similar to what is done in writeSnippets(..., inline thisOuter: upickle.core.Types with upickle.implicits.MacrosCommon...)
), and use some typecasting if necessary.
Maybe something like
def foo(api: Expr[upickle.core.Types with upickle.implicits.MacrosCommon]) =
${outer}.write(key.asInstanceOf[Any])(summonInline[keyWriterTpe])
} | ||
} else report.errorAndAbort(s"${typeSymbol} is not a case class or a immutable.Map") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: errorAndAbort
will abort the macro compilation and prevent reporting any other errors.
Would a simple report.error
, followed by returning an empty list help catch other errors here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeSymbol.isClassDef && typeSymbol.flags.is(Flags.Case) | ||
} | ||
|
||
private def extractKeyValueTypes(using Quotes)(tpe: quotes.reflect.TypeRepr): (quotes.reflect.TypeRepr, quotes.reflect.TypeRepr, quotes.reflect.TypeRepr) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first value of this function's return value (tycon) is never used in any call sites, I think it can be removed.
It's also not quite obvious to me why this does a nested extraction, so it might be helpful to add a quick comment what it's supposed to do.
I'm doing a first quick review, but if I understand correctly, this basically matches both Col[A,B]
and Col1[Col2[A,B]]
and returns A
and B
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fc02f23
Yes, you're right. My intention was that extract K, V from collections like
Map[K, V]
Iterable[(K, V)]
val collisions = ${select.asExprOf[Iterable[(String, valueTpe)]]}.map(_._1).toSet.intersect(${allKeysExpr}) | ||
if (collisions.nonEmpty) { | ||
throw new Exception("Key collision detected for the following keys: " + collisions.mkString(", ")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be done statically in the macro, rather than as an expression which is evaluated at run time? allKeysExpr
is made up of literal expressions, and select
is passed in, so I think they should be statically known
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
select
refers to a collection, and we check whether its keys of elements conflict with existing keys. It seems we cannot perform this check at compile time, since we don’t know what the collection contains until runtime.
case _ => | ||
report.errorAndAbort(s"Unsupported type $typeSymbol for flattening") | ||
} | ||
} else report.errorAndAbort(s"${typeSymbol} is not a case class or a immutable.Map") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immutable.Map
should be Iterable[(String, _)
, according to the check in the if
statement
case _ => | ||
report.errorAndAbort("Unsupported type for flattening") | ||
} | ||
} else report.errorAndAbort(s"${typeSymbol} is not a case class or a immutable.Map") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immutable.Map
should be Iterable[(String, _)
, according to the check in the if
statement
@jodersky Thanks for the reviews! I addressed your reviews.
This can only be done when there are no cyclic dependencies, right? If I understand correctly, the project that defines the ‘API trait’ depends on upickle.implicits. |
Thanks @nox213! I have merged the PR and tagged version 4.1.0 which should go out in the next hour or two Please send me an email with your international bank transfer details so I can send you the bounty payment |
Thank you! I've sent the required information via email. I hope no major bugs arise. |
closes #623
This PR introduces the
@flatten
annotation.Usage
The
@flatten
annotation can only be applied to:case class
es: Flatten fields of a nested case class into the parent structure.Iterable
: Flatten key-value pairs of a Iterable[(String, _)] into the parent structure.Nested flattening allows you to apply the
@flatten
annotation recursively to fields within nested case classes.The Reader also recognizes the
@flatten
annotation.For collection, during deserialization, all key-value pairs in the JSON that do not directly map to a specific field in the case class are attempted to be stored in the Map.
If a key in the JSON does not correspond to any field in the case class, it is stored in the collection.
If there are no keys in the JSON that can be stored in the collection, it is treated as an empty collection.
If a key’s value in the JSON cannot be converted to the Map’s value type (e.g., String), the deserialization fails.
Limitations
@flatten
annotation on aIterable
, the type of key must be String.Implementation Strategy
Writer Derivation
From my understanding, deriving a Writer for a case class involves implementing a dispatch function that iterates through each field of the case class. In the existing implementation, the Writer is generated by processing the information of each field in the type T being derived.
When deriving a Writer, the above snippet shows how the visit function is invoked for each field, passing the corresponding field’s Writer as an argument. If the field is a Map or a case class and needs to be flattened, additional processing is required:
Map
, iterate through all key-value pairs in the Map, calling visit for each pair:Reader Derivation
Deriving a Reader is more complex, especially with support for Map, as it introduces several edge cases. A Reader is essentially a Visitor for JSON, as I understand it. The process involves three main steps:
To support flattening, additional steps were required in these stages:
Flattened fields must be accounted for in the key-to-index mapping. For example:
Special Case for Maps: