introduce scalding-quotation sub-project #1755

fwbrasil · 2017-12-11T21:04:36Z

See #1754 for more details

johnynek

still reading, but wanted to share some initial comments.

johnynek · 2017-12-11T23:25:15Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+package com.twitter.scalding.quotation
+
+sealed trait Projection {
+  def andThen(prop: String, tpe: String): Projection


can we move the default implementation here? Looks like both cases have the same implementation.

I also wonder if we should use some AnyVal wrappers here instead of String. Something like:

case class Accessor(asString: String) extends AnyVal
case class TypeName(asString: String) extends AnyVal
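
For readers following along, here is a minimal sketch of the two suggestions combined: the default `andThen` on the trait and the `AnyVal` wrappers. The `TypeReference` and `Property` shapes follow the ones quoted later in this thread; the enclosing `ProjectionSketch` object and the `person`/`name` values are only illustrative and are not the code under review.

```scala
object ProjectionSketch {
  // Value classes: distinct types at compile time, usually without a wrapper allocation at runtime.
  final case class Accessor(asString: String) extends AnyVal
  final case class TypeName(asString: String) extends AnyVal

  sealed trait Projection {
    // Default implementation shared by both cases, as suggested in the review.
    def andThen(accessor: Accessor, typeName: TypeName): Projection =
      Property(this, accessor, typeName)
  }

  final case class TypeReference(typeName: TypeName) extends Projection
  final case class Property(path: Projection, accessor: Accessor, typeName: TypeName) extends Projection

  // Illustrative values only.
  val person: Projection = TypeReference(TypeName("Person"))
  val name: Projection = person.andThen(Accessor("name"), TypeName("String"))
}
```

With the wrappers in place, passing the arguments in the wrong order (`andThen(TypeName(...), Accessor(...))`) is a compile error rather than a silent mix-up of two strings.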

johnynek · 2017-12-12T00:21:35Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/ProjectionMacroTest.scala

+  "method with params isn't considered as projection" in {
+    test
+      .function[Person, String](_.name.substring(1))._1
+      .projections.set mustEqual Set(name)


I'm confused: what is name here? How does this compile?

johnynek · 2017-12-12T00:22:31Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/package.scala

+  trait Test extends FreeSpec with MustMatchers
+
+  val person = TypeReference(cls[Person])
+  val name = person.andThen("name", cls[String])


ahh, here it is. Can we not do this and instead add an explicit import? It makes it harder to read the tests without a clear benefit if you ask me.

I've moved the projections to the Person companion object

johnynek

some more comments.

PS: in my view, the ideal PR is about 400 lines. This is better since it is smaller, but still quite large.

johnynek · 2017-12-12T17:23:23Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+  def of(tpe: String, superClass: Class[_]): Projections = {
+
+    def byType(p: Projection) = {
+      def loop(p: Projection): Boolean =


can we add the @annotation.tailrec here?

johnynek · 2017-12-12T17:24:06Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+            false
+        }
+
+      def loop(p: Projection): Either[Projection, Option[Projection]] =


can we add @annotation.tailrec?

this method is not tail recursive

typeName in comment instead of tpe?
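
To illustrate the distinction behind the two `@annotation.tailrec` requests, here is a small sketch on a simplified model (plain `String` fields; `rootOf` and `rebase` are hypothetical helpers, not methods from the PR). A loop that only walks down the path can carry the annotation, while a loop that does more work with the recursive result afterwards, like the `loop(path) match { ... }` quoted above, cannot.

```scala
import scala.annotation.tailrec

object TailrecSketch {
  sealed trait Projection
  final case class TypeReference(typeName: String) extends Projection
  final case class Property(path: Projection, accessor: String, typeName: String) extends Projection

  // The recursive call is the last thing that happens, so @tailrec is accepted.
  @tailrec
  def rootOf(p: Projection): TypeReference =
    p match {
      case t: TypeReference     => t
      case Property(path, _, _) => rootOf(path)
    }

  // The result of the recursive call is wrapped in a new Property afterwards,
  // so the call is not in tail position; adding @tailrec here is a compile error.
  def rebase(p: Projection, newRoot: TypeReference): Projection =
    p match {
      case _: TypeReference         => newRoot
      case Property(path, acc, tpe) => Property(rebase(path, newRoot), acc, tpe)
    }
}
```

Projection paths are normally only a few accessors deep, so the non-tail-recursive form is unlikely to be a stack problem in practice; the annotation mainly documents intent and guards against regressions.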

johnynek · 2017-12-12T17:25:41Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+            Either.cond(!isSubclass(tpe), None, p)
+          case p @ Property(path, name, tpe) =>
+            loop(path) match {
+              case Left(path) =>


can we use Left(_) => here since we are not using path?

johnynek · 2017-12-12T17:35:54Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+      loop(p)
+    }
+
+    def bySuperClass(p: Projection) = {


can we put the return type? I think Option[Projection]?

johnynek · 2017-12-12T17:40:18Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+      p match {
+        case TypeReference(tpe) =>
+          base match {
+            case TypeReference(`tpe`) => Some(p)


Some(p) == Some(base) in this branch right? Can we write it that way so we can see in both branches we are returning Some(base) or None. In fact, maybe we should return Boolean instead so it is easier to follow and use filter below?

loop here does more than just filter the projection. It reconstructs the projections on top of the base projections using the type as the matching criteria. Take a look at line 99, where it uses option.map to recreate the property on top of the new base.

right. I missed that branch. But line 95 could be Some(base) right (since we are checking equality)?

Yes, it could. It doesn't seem to matter, though?

I think it is easier to read that both branches are returning the same thing. I feel that is a value to someone understanding the algorithm.

Makes sense. I've changed it

johnynek · 2017-12-12T17:44:21Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+  def apply(set: Set[Projection]) = {
+    new Projections(
+      set.filter {
+        case Property(path, name, tpe) => !set.contains(path)


not clear to me this algorithm works. It seems the order you add things in matters here. It seems you need to add shortest paths first and just going through the set may cause you to add a path that later you will not want in there.

Do you follow my point?

Also, the property you want to maintain, how are we maintaining it through ++? Can't you have two different projections in two different sets that need to have one filtered after a ++? Can you spell out the invariant above the class in the comments?

I'm not sure what you mean by ordering since it filters an unordered set based on the same complete unordered set, but indeed this algorithm has a bug. It doesn't consider nested intersections. For instance, person.contact and person.contact.phone.number would return both projections, which is wrong. I've just fixed it.

The invariant holds for ++ since it also uses the apply method. I'm not sure the invariant needs to be in the class comments since it's already in the apply method, but I don't feel strongly about it.
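
As a sketch of the invariant being discussed (no projection in the set is nested under another projection in the same set), the following uses a simplified model with `String` fields. The names `hasAncestorIn`, `normalize` and `merge` are assumptions for illustration; the actual fix in the PR may look different.

```scala
object ProjectionsInvariantSketch {
  sealed trait Projection
  final case class TypeReference(typeName: String) extends Projection
  final case class Property(path: Projection, accessor: String, typeName: String) extends Projection

  // True if any strict ancestor of `p` (its path, the path's path, ...) is in `set`.
  def hasAncestorIn(p: Projection, set: Set[Projection]): Boolean =
    p match {
      case Property(path, _, _) => set.contains(path) || hasAncestorIn(path, set)
      case _: TypeReference     => false
    }

  // Checking only the direct parent misses nested intersections such as
  // person.contact together with person.contact.phone.number.
  def normalize(set: Set[Projection]): Set[Projection] =
    set.filterNot(hasAncestorIn(_, set))

  // Routing the union through normalize keeps the invariant across merges.
  def merge(a: Set[Projection], b: Set[Projection]): Set[Projection] =
    normalize(a ++ b)
}
```

Because the filter looks at the complete set rather than at an accumulator, the result does not depend on iteration order.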

johnynek · 2017-12-12T17:45:01Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+      })
+  }
+
+  def flatten(list: List[Projections]): Projections =


Why not Iterable here, since ++ is commutative? We don't care about any particular order, do we?

CLAassistant · 2017-12-12T22:05:59Z

All committers have signed the CLA.

johnynek · 2017-12-13T00:37:51Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/ProjectionMacro.scala

+            val paramsProjecttions = params.flatMap(Projection.unapply)
+            q"""
+              $func match {
+                case f: com.twitter.scalding.quotation.QuotedFunction =>


do we need to use _root_.com here? I thought we needed to.

It's good to have it for hygiene. If the user has a com object it'll fail, for instance. Note that even _root_ doesn't guarantee hygiene because that's a value that the user could define, but that should never happen :)

I'll add _root_ to all trees
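
The shadowing problem can be reproduced without macros. The sketch below uses the `java` package instead of `com` purely so that it is self-contained; the `HygieneSketch` object and its members are illustrative.

```scala
object HygieneSketch {
  // A user-defined member named `java` shadows the top-level `java` package in this scope.
  object java { val marker = "user-defined" }

  // `java.lang.Math.max(1, 2)` would not compile here: `java` resolves to the
  // object above, which has no member `lang`. Anchoring at _root_ avoids that.
  val viaRoot: Int = _root_.java.lang.Math.max(1, 2)
}
```

Code generated by a macro is type-checked at the call site, so a fully qualified `com.twitter...` reference in a quasiquote is exposed to the same kind of shadowing by whatever the user has in scope.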

johnynek · 2017-12-13T00:39:38Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/ProjectionMacro.scala

+
+          val inputSymbols = inputs.map(_.symbol).toSet
+
+          object Projection {


why is this an object rather than just a def unapply(t: Tree): Option[Tree] method? Can you comment?

johnynek · 2017-12-13T00:41:05Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/ProjectionMacro.scala

+          }
+
+          def functionCall(func: Tree, params: List[Tree]) = {
+            val paramsProjecttions = params.flatMap(Projection.unapply)


I think Projections is spelled wrong.

johnynek · 2017-12-13T00:41:50Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/ProjectionMacro.scala

+              functionCall(func, params)
+            case q"$func(..$params)" if isFunction(func) =>
+              functionCall(func, params)
+            case t @ Projection(p) =>


okay, I guess you are using unapply here. Can we just comment on the object.

I've renamed it to ProjectionExtractor. Is it clear enough to avoid the comment?
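
For context on why an object is needed: a pattern such as `case Projection(p)` needs an extractor, that is, an object (or other stable value) with an `unapply` member; a bare `def unapply` in scope cannot be used in a pattern. A minimal sketch with toy tree types (these are stand-ins, not the `scala.reflect` API):

```scala
object ExtractorSketch {
  // Toy stand-ins for compiler trees.
  sealed trait Tree
  final case class Ident(name: String) extends Tree
  final case class Select(qualifier: Tree, name: String) extends Tree
  final case class Apply(fun: Tree, args: List[Tree]) extends Tree

  // An object with `unapply` can appear in a pattern and can also be called directly.
  object ProjectionExtractor {
    def unapply(t: Tree): Option[List[String]] =
      t match {
        case Ident(name)                             => Some(List(name))
        case Select(ProjectionExtractor(path), name) => Some(path :+ name)
        case _                                       => None
      }
  }

  def describe(t: Tree): String =
    t match {
      case ProjectionExtractor(path) => path.mkString(".")
      case _                         => "not a projection"
    }
}
```

The same object serves both call styles seen in the diff: as a pattern (`case t @ Projection(p)`) and as a plain method call (`Projection.unapply`).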

johnynek · 2017-12-13T00:44:23Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/QuotedMacro.scala

+        .exists(c.enclosingPosition.source.path.contains)
+
+    def isScalding(sym: Symbol): Boolean =
+      sym.fullName.contains("com.twitter.scalding") || {


don't we want startsWith here?

good catch. Fixed
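
A quick illustration of the difference, using made-up symbol names (the `IsScaldingSketch` object is hypothetical and covers only the prefix test, not the rest of `isScalding`):

```scala
object IsScaldingSketch {
  val prefix = "com.twitter.scalding"

  // Matches the package name anywhere in the full name.
  def byContains(fullName: String): Boolean = fullName.contains(prefix)

  // Matches only names that actually begin with the package name.
  def byStartsWith(fullName: String): Boolean = fullName.startsWith(prefix)
}
```

With `contains`, a symbol such as `org.example.com.twitter.scalding.Fake` would be treated as part of scalding; `startsWith` rejects it. Note that `startsWith` still accepts sibling packages that share the prefix, for example `com.twitter.scaldingx`.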

johnynek · 2017-12-13T00:51:41Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/ProjectionMacroTest.scala

+    "non-quoted" in {
+      val function = (p: Person) => p.name
+      test.function[Person, String](function)._1
+        .projections.set mustEqual Set(Person.typeReference)


in this case, does typeReference mean we are keeping the whole thing (when we see that, we can't do any projections)?

Yes, it must assume that all properties are used since the function isn't introspectable

johnynek · 2017-12-13T00:53:27Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/ProjectionTest.scala

+        val p = Projections(Set(p1)) ++ Projections(Set(p2))
+        p.set mustEqual Set(p1, p2)
+      }
+      "with merge" in {


seems like we could do a scalacheck test here. We could test that, for any List[Projections], if we ++ all of them, the final result contains no Projections which are "suffixes" of another.

I'm not a big fan of property-based testing. Is there something that would have better coverage if I use it?

WHAT!?!? Who is not a big fan of property based checks!? Why don't you like it? I love it. It gives you an ability to clearly state properties and a decent approach at checking if what you claim is true actually is. Of course, with poor generators, it often doesn't hit tricky cases, but I have found it to be hugely helpful.

If the code is refactored into simple functions that don't require complex structures, unit tests are easier to implement and reason about. Property-based tests take longer to implement, are much harder to debug, and don't bring many benefits if compared to simple unit tests.

I guess the reason I brought it up is that I thought I noticed a bug which I think a scalacheck would have caught.

Sure, if I had thought about that scenario and checked the property :) The same is valid for unit tests. I can implement the property-based test if you feel strongly about it, though.

johnynek · 2017-12-13T00:54:22Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/ProjectionTest.scala

+        Projections.empty.toString mustEqual "Projections()"
+      }
+      "non-empty" in {
+        Projections(Set(p1, p2)).toString mustEqual "Projections(T1.p1, T2.p2)"


Another scalacheck might be that Projections(someSet).toSet.size <= someSet.size
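
The two properties proposed in this thread can be written down roughly as follows. To stay dependency-free, this sketch uses a hand-rolled random generator from `scala.util.Random` as a stand-in for ScalaCheck's generators and `forAll`; the simplified `Projection` model and the `normalize` function are assumptions, not the PR's `Projections` class.

```scala
import scala.util.Random

object ProjectionsPropertySketch {
  sealed trait Projection
  final case class TypeReference(typeName: String) extends Projection
  final case class Property(path: Projection, accessor: String) extends Projection

  def ancestors(p: Projection): List[Projection] =
    p match {
      case Property(path, _) => path :: ancestors(path)
      case _: TypeReference  => Nil
    }

  // Hypothetical normalisation under test: drop anything that has an ancestor in the set.
  def normalize(set: Set[Projection]): Set[Projection] =
    set.filterNot(p => ancestors(p).exists(set.contains))

  // Short random paths over a tiny alphabet, so overlapping projections are frequent.
  def genProjection(rnd: Random): Projection = {
    val root: Projection = TypeReference(if (rnd.nextBoolean()) "T1" else "T2")
    (0 until rnd.nextInt(4)).foldLeft(root)((p, _) => Property(p, "p" + rnd.nextInt(2)))
  }

  def genSet(rnd: Random): Set[Projection] =
    List.fill(rnd.nextInt(6))(genProjection(rnd)).toSet

  // Property 1: after merging, no projection is nested under another one in the result.
  def noNesting(sets: List[Set[Projection]]): Boolean = {
    val merged = normalize(sets.flatten.toSet)
    merged.forall(p => !ancestors(p).exists(merged.contains))
  }

  // Property 2: normalising never grows the set.
  def neverGrows(set: Set[Projection]): Boolean =
    normalize(set).size <= set.size

  // Runs both properties against `runs` randomly generated inputs.
  def check(seed: Long, runs: Int): Boolean = {
    val rnd = new Random(seed)
    (1 to runs).forall { _ =>
      val sets = List.fill(rnd.nextInt(4))(genSet(rnd))
      noNesting(sets) && neverGrows(sets.flatten.toSet)
    }
  }
}
```

Keeping the generator's alphabet small is deliberate: with only two roots and two accessor names, nested pairs such as `T1.p0` and `T1.p0.p1` show up often, which is the case the properties are meant to exercise.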

johnynek · 2017-12-14T01:05:11Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/TreeOps.scala

+   * collected tree.
+   */
+  def collect[T](tree: Tree)(f: PartialFunction[Tree, T]): List[T] = {
+    var res = List[T]()


what about using a List.newBuilder[T] and += on it? I think that gives O(1) append (because they cheat and mutate the list). Currently I think this code is O(N^2) since it is using append on a list.
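
A sketch of the builder-based variant, on a toy tree type rather than the compiler's `Tree` (the `children` accessor and the traversal order are assumptions; the real `collect` may traverse differently):

```scala
object CollectSketch {
  // Toy tree; stands in for the compiler's Tree type.
  sealed trait Tree { def children: List[Tree] }
  final case class Leaf(value: Int) extends Tree { def children: List[Tree] = Nil }
  final case class Node(children: List[Tree]) extends Tree

  def collect[T](tree: Tree)(f: PartialFunction[Tree, T]): List[T] = {
    // The builder appends in amortised constant time and preserves insertion order.
    val builder = List.newBuilder[T]
    def loop(t: Tree): Unit = {
      if (f.isDefinedAt(t)) builder += f(t)
      t.children.foreach(loop)
    }
    loop(tree)
    builder.result()
  }
}
```

An alternative with the same complexity is to prepend to the `var` with `::` and reverse once at the end.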

johnynek · 2017-12-14T01:07:23Z

scalding-quotation/src/test/scala/com/twitter/scalding/quotation/LimitationsTest.scala

+
+  "nested transitive projection" in pendingUntilFixed {
+    test.function[Person, Option[String]](_.alternativeContact.map(_.phone))._1.projections.set mustEqual
+      Set(Person.typeReference.andThen(Accessor("alternativeContact"), typeName[Option[Contact]]).andThen(Accessor("phone"), typeName[String]))


this confuses me. How do we distinguish an andThen projection inside or outside an Option? Seems like we need to record the input type we are dealing with also. We have Option[Contact] as the output of the previous Projection, but how do we jump across the .map on the Option?

Note that this is a failing test with pendingUntilFixed. The projections in this scenario are Person.alternativeContact and Contact.phone, which are unrelated and thus don't get filtered out by Projections. Ideally, the macro should apply a transformation similar to beta reduction to produce only the projection Person.alternativeContact.phone.

fwbrasil · 2017-12-18T18:24:52Z

@johnynek @benpence @dieu @ttim Could someone merge this? I don't have access. I'd like to move forward and submit the second PR.

ttim

I guess the only real concern from me - Projections sounds more like Map[ParamName, Set[Projections]] and not like Set[Projection].

Feel free to address comments in subsequent reviews.

ttim · 2017-12-18T19:22:54Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Liftables.scala

+
+import scala.reflect.macros.blackbox.Context
+
+trait Liftables {


Can we add a comment on what Liftable means?

ttim · 2017-12-18T19:28:04Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+/**
+ * A projection property (e.g. `Person.name`)
+ */
+final case class Property(path: Projection, accessor: Accessor, typeName: TypeName) extends Projection {


Is typeName the type of this Projection? It sounds like something which belongs to any Projection. Does it make sense to put it at the level of Projection then?

I don't think so, there isn't immediate benefit in abstracting it and the typeName of a TypeReference doesn't really have the same meaning as the typeName of a Property.

My point is more about typeName being part of any Projection in some way (I guess it's the type of the Projection). I definitely don't want to introduce it as an abstraction, more like a property which we have for any Projection. If you see it as something that isn't consistent between implementations, I'm ok with it.

ttim · 2017-12-18T19:50:23Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+            false
+        }
+
+      def loop(p: Projection): Either[Projection, Option[Projection]] =


typeName in comment instead of tpe?

ttim · 2017-12-18T19:55:56Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+   * Returns the projections that are based on `tpe` and limits projections
+   * to only properties that extend from `superClass`.
+   */
+  def of(typeName: TypeName, superClass: Class[_]): Projections = {


of sounds like a method on the companion object to me, and I was reading this method that way. Maybe filterBySuperClass?

filterBySuperClass doesn't represent well what this method does. Would you have another suggestion? I find of a good name to be honest.

I like of, but I perceive it as something on the companion object, like Projections.of(clazz).

ttim · 2017-12-18T19:58:16Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+   */
+  def of(typeName: TypeName, superClass: Class[_]): Projections = {
+
+    def byType(p: Projection) = {


Can we do rootProjection(Projection): TypeReference and match on it? I think logic will be clearer then.

ttim · 2017-12-18T20:09:41Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+      }
+    }
+
+    Projections(set.filter(byType).flatMap(bySuperClass))


I read this a couple of times and have trouble understanding what bySuperClass does. Can we add a comment on it?

And/or rename it and put on Projection itself?

ttim · 2017-12-18T20:14:54Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/Projection.scala

+   * returns the projection
+   *   `Person.name.contact`
+   */
+  def basedOn(base: Set[Projection]): Projections = {


Clarifying for myself: we lose the correspondence between function arguments and their Projections, and because of that we do this match between all base and all set projections?

Can we rename loop(base, p) and put it to Projection class?

ttim · 2017-12-18T20:26:24Z

scalding-quotation/src/main/scala/com/twitter/scalding/quotation/ProjectionMacro.scala

+
+    val nestedList =
+      params.flatMap {
+        case param @ q"(..$inputs) => $body" =>


This is kinda hard to read because you can't easily see alternatives together, can we extract bodies into separate functions?

fwbrasil · 2017-12-20T17:57:06Z

@ttim I addressed your feedback on #1761

> I guess the only real concern from me - Projections sounds more like Map[ParamName, Set[Projections]] and not like Set[Projection].

We could use beta reduction to deal with projections based on the actual parameters. It doesn't seem necessary, though. It would make the code much more complex without clear benefit, since the projections are aggregated at the end and it doesn't really matter which parameter they came from.

ttim · 2017-12-20T19:40:47Z

> We could use beta reduction to deal with projections based on the actual parameters. It doesn't seem necessary, though. It would make the code much more complex without clear benefit, since the projections are aggregated at the end and it doesn't really matter which parameter they came from.

I don't think it complicates the logic significantly, but it makes the part where you combine projections between function call and function body more straightforward and correct.

fwbrasil · 2017-12-20T19:57:21Z

> I don't think it complicates the logic significantly, but it makes the part where you combine projections between function call and function body more straightforward and correct.

The projection logic is based on type references right now. In order to detect which projection comes from which parameter, we'd need to introduce logic similar to beta reduction. I don't see how it'd be more correct or precise than the current implementation.

fwbrasil mentioned this pull request Dec 11, 2017

[WIP] introduce new scalding-quotation module #1754

Open

fwbrasil force-pushed the scalding-quotation branch from 2c74d33 to 65982d7 Compare December 11, 2017 21:18

johnynek reviewed Dec 12, 2017

fwbrasil force-pushed the scalding-quotation branch from 65982d7 to 0011bff Compare December 12, 2017 22:05

johnynek reviewed Dec 13, 2017

fwbrasil force-pushed the scalding-quotation branch 2 times, most recently from 7cc1994 to c0d1897 Compare December 13, 2017 04:43

johnynek reviewed Dec 14, 2017

introduce scalding-quotation sub-project

d7463fb

fwbrasil force-pushed the scalding-quotation branch from c0d1897 to d7463fb Compare December 14, 2017 01:58

johnynek approved these changes Dec 14, 2017

johnynek merged commit 0614b51 into twitter:develop Dec 18, 2017

ttim reviewed Dec 18, 2017

fwbrasil mentioned this pull request Dec 20, 2017

scalding-quotation refactorings #1761

Merged

ttim pushed a commit that referenced this pull request Dec 22, 2017

scalding-quotation refactorings (#1761)

de3948c

Address @ttim's feedback on #1755

