RDDGenerator
Mahmoud Hanafy edited this page Apr 21, 2016 · 7 revisions
RDDGenerator provides an easy way to generate arbitrary RDDs, so that you can check any property against them.
If you are not familiar with ScalaCheck, I suggest you read about it first to understand the concepts of properties and generators.
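For readers new to these two concepts, here is a minimal sketch in plain ScalaCheck, with no Spark involved. The object name `ReverseProperties` and the property itself are illustrative only and are not part of spark-testing-base. A generator (`Gen`) describes how to produce random inputs, and a property (`Prop`) is a statement that should hold for every generated input.

```scala
import org.scalacheck.{Gen, Prop, Test}
import org.scalacheck.Prop.forAll

// Illustrative names only; nothing here comes from spark-testing-base.
object ReverseProperties {
  // Generator: lists whose elements are integers between 0 and 100.
  val smallIntLists: Gen[List[Int]] = Gen.listOf(Gen.choose(0, 100))

  // Property: reversing a list twice yields the original list.
  val reverseTwice: Prop = forAll(smallIntLists) { xs =>
    xs.reverse.reverse == xs
  }

  def main(args: Array[String]): Unit = {
    // Evaluate the property against randomly generated lists.
    val result = Test.check(Test.Parameters.default, reverseTwice)
    println(s"passed: ${result.passed}")
  }
}
```

RDDGenerator follows the same pattern, except that the generated values are RDDs rather than lists.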
You can generate RDDs using the arbitraryRDD method, which generates arbitrary RDDs of the desired type. Just create a generator for your required RDD type, or use one of the generators that are supported by default.
Example (using a supported generator):
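    import com.holdenkarau.spark.testing.{RDDGenerator, SharedSparkContext}
    import org.scalacheck.Arbitrary
    import org.scalacheck.Prop.forAll
    import org.scalatest.FunSuite
    import org.scalatest.prop.Checkers

    class RDDsCheck extends FunSuite with SharedSparkContext with Checkers {
      test("map should not change number of elements") {
        // Generate arbitrary RDD[String] values using the default String generator.
        val property =
          forAll(RDDGenerator.genRDD[String](sc)(Arbitrary.arbitrary[String])) {
            rdd => rdd.map(_.length).count() == rdd.count()
          }
        check(property)
      }
    }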
Example (custom generator):
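    import com.holdenkarau.spark.testing.{RDDGenerator, SharedSparkContext}
    import org.scalacheck.{Arbitrary, Gen}
    import org.scalacheck.Prop.forAll
    import org.scalatest.FunSuite
    import org.scalatest.prop.Checkers

    class RDDsCheck extends FunSuite with SharedSparkContext with Checkers {
      test("custom generator") {
        val property =
          forAll(RDDGenerator.genRDD[Person](sc) {
            // Build a Person from an arbitrary name and an arbitrary age.
            val generator: Gen[Person] = for {
              name <- Arbitrary.arbitrary[String]
              age <- Arbitrary.arbitrary[Int]
            } yield Person(name, age)
            generator
          }) {
            rdd => rdd.map(_.age).count() == rdd.count()
          }
        check(property)
      }
    }

    case class Person(name: String, age: Int)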
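Note that `Arbitrary.arbitrary[Int]` can also produce negative or very large ages, and `Arbitrary.arbitrary[String]` can produce empty names. If a property depends on realistic values, a constrained generator may be a better fit. The following is only a sketch: `PersonGenerators` and `realisticPerson` are hypothetical names, and the bounds are arbitrary choices. `Person` is redefined so that the snippet stands on its own.

```scala
import org.scalacheck.Gen

// Mirrors the Person case class from the example above.
case class Person(name: String, age: Int)

// Hypothetical helper object; not part of spark-testing-base.
object PersonGenerators {
  // Non-empty alphabetic names and ages restricted to the range 0 to 120.
  val realisticPerson: Gen[Person] = for {
    name <- Gen.nonEmptyListOf(Gen.alphaChar).map(_.mkString)
    age  <- Gen.choose(0, 120)
  } yield Person(name, age)
}
```

A generator like this could be passed to `RDDGenerator.genRDD[Person](sc)` in place of the inline generator shown in the custom generator example.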
You can control the size of the generated RDDs by defining an implicit PropertyCheckConfig.
Example:
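    import com.holdenkarau.spark.testing.{RDDGenerator, SharedSparkContext}
    import org.scalacheck.Arbitrary
    import org.scalacheck.Prop.forAll
    import org.scalatest.FunSuite
    import org.scalatest.prop.Checkers

    class RDDsCheck extends FunSuite with SharedSparkContext with Checkers {
      test("generate rdd of specific size") {
        // Shadows the default config from Checkers for this test only.
        implicit val generatorDrivenConfig =
          PropertyCheckConfig(minSize = 10, maxSize = 20)
        val prop = forAll(RDDGenerator.genRDD[String](sc)(Arbitrary.arbitrary[String])) {
          rdd => rdd.count() <= 20
        }
        check(prop)
      }
    }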
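The example above asserts only the upper bound. To see why, here is a sketch of how ScalaCheck's size parameter behaves for a plain collection generator, with no Spark involved. `SizeBounds` is a hypothetical name, and it is an assumption that RDDGenerator treats the size parameter the same way as `Gen.listOf` does. With `Gen.listOf`, the size parameter caps the collection length, while `minSize` only sets the smallest size parameter that is tried and does not guarantee a minimum length.

```scala
import org.scalacheck.{Gen, Prop, Test}
import org.scalacheck.Prop.forAll

// Hypothetical illustration; not part of spark-testing-base.
object SizeBounds {
  // Size parameters between 10 and 20 are passed to the generator.
  val params: Test.Parameters =
    Test.Parameters.default.withMinSize(10).withMaxSize(20)

  // Gen.listOf picks a length between 0 and the current size parameter,
  // so only the upper bound is guaranteed.
  val atMostTwenty: Prop = forAll(Gen.listOf(Gen.alphaStr)) { xs =>
    xs.size <= 20
  }

  def main(args: Array[String]): Unit = {
    val result = Test.check(params, atMostTwenty)
    println(s"passed: ${result.passed}")
  }
}
```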