-
Notifications
You must be signed in to change notification settings - Fork 708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
introduce scalding-quotation sub-project #1755
Conversation
2c74d33
to
65982d7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still reading, but wanted to share some initial comments.
package com.twitter.scalding.quotation | ||
|
||
sealed trait Projection { | ||
def andThen(prop: String, tpe: String): Projection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move the default implementation here? Looks like both cases have the same implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wonder if we should use some AnyVal
wrappers here instead of String
. Something like:
case class Accessor(asString: String) extends AnyVal
case class TypeName(asString: String) extends AnyVal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
"method with params isn't considered as projection" in { | ||
test | ||
.function[Person, String](_.name.substring(1))._1 | ||
.projections.set mustEqual Set(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused what name
is here? How is this compiling?
trait Test extends FreeSpec with MustMatchers | ||
|
||
val person = TypeReference(cls[Person]) | ||
val name = person.andThen("name", cls[String]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh, here it is. Can we not do this and instead add an explicit import? It makes it harder to read the tests without a clear benefit if you ask me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the projections to the Person
companion object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some more comments.
PS: in my view, the idea PR is about 400 lines. This is better since it is smaller, but still quite large.
def of(tpe: String, superClass: Class[_]): Projections = { | ||
|
||
def byType(p: Projection) = { | ||
def loop(p: Projection): Boolean = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add the @annotation.tailrec
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
false | ||
} | ||
|
||
def loop(p: Projection): Either[Projection, Option[Projection]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add @annotation.tailrec
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method is not tail recursive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duh. yep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeName
in comment instead of tpe
?
Either.cond(!isSubclass(tpe), None, p) | ||
case p @ Property(path, name, tpe) => | ||
loop(path) match { | ||
case Left(path) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use Left(_) =>
here since we are not using path
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
loop(p) | ||
} | ||
|
||
def bySuperClass(p: Projection) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we put the return type? I think Option[Projection]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
p match { | ||
case TypeReference(tpe) => | ||
base match { | ||
case TypeReference(`tpe`) => Some(p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some(p) == Some(base)
in this branch right? Can we write it that way so we can see in both branches we are returning Some(base)
or None
. In fact, maybe we should return Boolean instead so it is easier to follow and use filter below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loop
here does more than just filter the projection. It reconstructs the projections on top of the base projections using the type as the matching criteria. Take a look at line 99, where it uses option.map to recreate the property on top of the new base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. I missed that branch. But line 95 could be Some(base)
right (since we are checking equality)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it could. It doesn't seem to matter, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is easier to read that both branches are returning the same thing. I feel that is a value to someone understanding the algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I've changed it
def apply(set: Set[Projection]) = { | ||
new Projections( | ||
set.filter { | ||
case Property(path, name, tpe) => !set.contains(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not clear to me this algorithm works. It seems the order you add things in matters here. It seems you need to add shortest paths first and just going through the set may cause you to add a path that later you will not want in there.
Do you follow my point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the property you want to maintain, how are we maintaining it through ++
? Can't you have two different projections in two different sets that need to have one filtered after a ++
? Can you spell out the invariant above the class in the comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean by ordering since it filters an unordered set based on the same complete unordered set, but indeed this algorithm has a bug. It doesn't consider nested intersections. For instance, person.contact
and person.contact.phone.number
would return both projections, which is wrong. I've just fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The invariant holds for ++
since it also uses the apply
method. I'm not sure the invariant needs to be in the class comments since it's already in the apply method, but I don't feel strongly about it.
}) | ||
} | ||
|
||
def flatten(list: List[Projections]): Projections = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not Iterable
here since ++
is commutative right? We don't care about an particular order do we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
65982d7
to
0011bff
Compare
false | ||
} | ||
|
||
def loop(p: Projection): Either[Projection, Option[Projection]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duh. yep.
p match { | ||
case TypeReference(tpe) => | ||
base match { | ||
case TypeReference(`tpe`) => Some(p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right. I missed that branch. But line 95 could be Some(base)
right (since we are checking equality)?
val paramsProjecttions = params.flatMap(Projection.unapply) | ||
q""" | ||
$func match { | ||
case f: com.twitter.scalding.quotation.QuotedFunction => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to use _root_.com
here? I thought we needed to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to have it for hygiene. If the user has a com
object it'll fail, for instance. Note that even _root_
doesn't guarantee hygiene because that's a value that the user could define, but that should never happen :)
I'll add _root_
to all trees
|
||
val inputSymbols = inputs.map(_.symbol).toSet | ||
|
||
object Projection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this an object rather than just an def unapply(t: Tree): Option[Tree]
method? Can you comment?
} | ||
|
||
def functionCall(func: Tree, params: List[Tree]) = { | ||
val paramsProjecttions = params.flatMap(Projection.unapply) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Projections
is spelled wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
functionCall(func, params) | ||
case q"$func(..$params)" if isFunction(func) => | ||
functionCall(func, params) | ||
case t @ Projection(p) => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I guess you are using unapply here. Can we just comment on the object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed it to ProjectionExtractor
. Is it clear enough to avoid the comment?
.exists(c.enclosingPosition.source.path.contains) | ||
|
||
def isScalding(sym: Symbol): Boolean = | ||
sym.fullName.contains("com.twitter.scalding") || { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want startsWith
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. Fixed
"non-quoted" in { | ||
val function = (p: Person) => p.name | ||
test.function[Person, String](function)._1 | ||
.projections.set mustEqual Set(Person.typeReference) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, does typeReference
mean we are keeping the whole thing (when we see that, we can't do any projections)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it must assume that all properties are used since the function isn't introspectable
val p = Projections(Set(p1)) ++ Projections(Set(p2)) | ||
p.set mustEqual Set(p1, p2) | ||
} | ||
"with merge" in { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like we could do a scalacheck test here. We could test that for all List[Projections]
then if we ++
all of them, we have a final result where there are not Projections which are "suffixes" of another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of property-based testing. Is there something that would have better coverage if I use it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHAT!?!? Who is not a big fan of property based checks!? Why don't you like it? I love it. It gives you an ability to clearly state properties and a decent approach at checking if what you claim is true actually is. Of course, with poor generators, it often doesn't hit tricky cases, but I have found it to be hugely helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the code is refactored into simple functions that don't require complex structures, unit tests are easier to implement and reason about. Property-based tests take longer to implement, are much harder to debug, and don't bring many benefits if compared to simple unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the reason I brought it up is that I thought I noticed a bug which I think a scalacheck would have caught.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, if I had thought about that scenario and checked the property :) The same is valid for unit tests. I can implement the property-based test if you feel strongly about it, though.
Projections.empty.toString mustEqual "Projections()" | ||
} | ||
"non-empty" in { | ||
Projections(Set(p1, p2)).toString mustEqual "Projections(T1.p1, T2.p2)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another scalacheck might be that Projections(someSet).toSet.size <= someSet.size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
7cc1994
to
c0d1897
Compare
* collected tree. | ||
*/ | ||
def collect[T](tree: Tree)(f: PartialFunction[Tree, T]): List[T] = { | ||
var res = List[T]() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about using a List.newBuilder[T]
and using +=
on it which I think gives O(1) append (because they cheat and mutate the list). Currently I think this code is O(N^2)
since it is using append on a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
"nested transitive projection" in pendingUntilFixed { | ||
test.function[Person, Option[String]](_.alternativeContact.map(_.phone))._1.projections.set mustEqual | ||
Set(Person.typeReference.andThen(Accessor("alternativeContact"), typeName[Option[Contact]]).andThen(Accessor("phone"), typeName[String])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this confuses me. How do we distinguish an andThen
projection inside or outside an Option
? Seems like we need to record the input type we are dealing with also. We have Option[Contact]
as the output of the previous Projection, but how do we jump across the .map
on the Option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this is a failing test with pendingUntilFixed
. The projections in this scenario are Person.alternativeContact
and Contact.phone
, which are unrelated and thus don't get filtered out by Projections
. Ideally, the macro should apply a transformation similar to beta reduction to produce only the projection Person.alternativeContact.phone
.
c0d1897
to
d7463fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only real concern from me - Projections
sounds more like Map[ParamName, Set[Projections]]
and not like Set[Projection]
.
Feel free to address comments in subsequent reviews.
|
||
import scala.reflect.macros.blackbox.Context | ||
|
||
trait Liftables { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add comment on what Liftable
means?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
/** | ||
* A projection property (e.g. `Person.name`) | ||
*/ | ||
final case class Property(path: Projection, accessor: Accessor, typeName: TypeName) extends Projection { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeName
is type of this Projection
? It sound like something which belongs to any Projection
, does it make sense to put it on the level of Projection
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so, there isn't immediate benefit in abstracting it and the typeName
of a TypeReference
doesn't really have the same meaning as the typeName
of a Property
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point more about typeName
being part of any Projection
in some way (I guess it's type of Projection
). I definitely don't want to introduce it as an abstraction, more like a property which we have for any Projection
. If you can see it as something non consistent between implementations I'm ok with it.
false | ||
} | ||
|
||
def loop(p: Projection): Either[Projection, Option[Projection]] = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typeName
in comment instead of tpe
?
* Returns the projections that are based on `tpe` and limits projections | ||
* to only properties that extend from `superClass`. | ||
*/ | ||
def of(typeName: TypeName, superClass: Class[_]): Projections = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of
sounds like method on companion object for me and I was reading this method this way. Maybe filterBySuperClass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filterBySuperClass
doesn't represent well what this method does. Would you have another suggestion? I find of
a good name to be honest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like of
, but I perceive it as something on companion objection like Projections.of(clazz)
.
*/ | ||
def of(typeName: TypeName, superClass: Class[_]): Projections = { | ||
|
||
def byType(p: Projection) = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do rootProjection(Projection): TypeReference
and match on it? I think logic will be clearer then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
} | ||
} | ||
|
||
Projections(set.filter(byType).flatMap(bySuperClass)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read this couple of times and have issues with understanding what bySuperClass
does. Can we add a comment on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And/or rename it and put on Projection
itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
* returns the projection | ||
* `Person.name.contact` | ||
*/ | ||
def basedOn(base: Set[Projection]): Projections = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarifying for myself: we loose correspondence between function arguments and their Projection
s and because of that we do this match between all base
and all set
projections?
Can we rename loop(base, p)
and put it to Projection
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
val nestedList = | ||
params.flatMap { | ||
case param @ q"(..$inputs) => $body" => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kinda hard to read because you can't easily see alternatives together, can we extract bodies into separate functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@ttim I addressed your feedback on #1761
We could use beta reded uction to deal with projections based on the actual parameters. It doesn't seem necessary, though. It would make the code much more complex without clear benefit since the projections are aggregated at the end and it doesn't really matter from which parameter they came from. |
I don't think it complicates a logic anyhow significantly, but it makes part where you combine projections between function call and function body more straightforward and correct. |
The projection logic is based on type references right now. In order to do detect which projection is from which parameter, we'd need to introduce a logic similar to beta reduction. I don't see how it'd be more correct or precise than the current implementation. |
See #1754 for more details