hubte1g/spark

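A.apply(x) is the same as A(x), which is the same as A { x }: the compiler rewrites A(x) to A.apply(x), and braces can replace parentheses when there is exactly one argument.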
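A small sketch of this equivalence, assuming a hypothetical `Greeter` object with a one-argument `apply` (the name and the string it returns are made up for illustration). The braces form works here because braces can stand in for parentheses when a call takes exactly one argument.

```scala
// Hypothetical object used only to illustrate the apply sugar.
object Greeter {
  // Defining `apply` is what makes `Greeter(...)` legal.
  def apply(name: String): String = s"Hello, $name"
}

object ApplySketch {
  val explicit: String = Greeter.apply("Spark") // explicit method call
  val sugared: String  = Greeter("Spark")       // rewritten by the compiler to Greeter.apply("Spark")
  val braced: String   = Greeter { "Spark" }    // braces in place of parentheses, single argument
}
```

All three values hold the same string, so the choice between the forms is purely stylistic.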

The DataFrame API is an untyped Dataset: in Scala, DataFrame is a type alias for Dataset[Row].

From a category theory point of view you would say a type "forms" a monad if the necessary functions exist. From a class/object point of view you would say it "is" a monad if it has the correct functions as part of its construction. From a Scala/Cats perspective you'd say it "has" a monad if an instance of Monad exists that defines those functions for the type.
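The "is" and "has" views can be contrasted in plain Scala. The sketch below uses a hand-rolled `Monad` trait so that it needs no dependencies. It is an assumption made for illustration, not the Cats definition, although Cats' `Monad` exposes `pure` and `flatMap` in the same roles. `MonadSketch` and `pairUp` are hypothetical names.

```scala
// Hand-rolled type class for illustration only (not the Cats trait).
trait Monad[F[_]] {
  def pure[A](a: A): F[A]
  def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  // map can be derived from the two primitives above.
  def map[A, B](fa: F[A])(f: A => B): F[B] = flatMap(fa)(a => pure(f(a)))
}

object MonadSketch {
  // "Has a monad": an instance for Option is defined outside the type itself.
  implicit val optionMonad: Monad[Option] = new Monad[Option] {
    def pure[A](a: A): Option[A] = Some(a)
    def flatMap[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa.flatMap(f)
  }

  // Generic code only asks for evidence that F has a Monad instance.
  def pairUp[F[_], A, B](fa: F[A], fb: F[B])(implicit M: Monad[F]): F[(A, B)] =
    M.flatMap(fa)(a => M.map(fb)(b => (a, b)))

  // "Is a monad": Option carries flatMap as one of its own methods.
  val viaMethods: Option[Int] = Some(2).flatMap(x => Some(x + 1))

  // Same idea, but routed through the separately defined instance.
  val viaInstance: Option[(Int, String)] = pairUp(Option(1), Option("a"))
}
```

The type class route lets `pairUp` work for any `F` that has an instance, including types that were never written with `flatMap` as a method.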

Accumulator vs. count

RDD v. DataFrame v. Dataset. DataFrame: Rows of generic untyped JVM objects, with named columns. Dataset: strongly typed, with a case class.

UDF, vectorization

Persist v. broadcast

Trait (no constructor parameters in Scala 2) v. abstract class

Object v. class

Class v. abstract class
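A compact sketch of how these four constructs differ in practice. All names (`Named`, `Shape`, `Square`, `UnitSquare`) are hypothetical and exist only for this illustration.

```scala
// Trait: can be mixed into many classes; declared here without constructor parameters.
trait Named {
  def name: String
  def describe: String = s"shape $name"
}

// Abstract class: may take constructor parameters, but cannot be instantiated directly.
abstract class Shape(val sides: Int) {
  def area: Double
}

// Class: concrete and instantiated with `new`; extends one class plus any number of traits.
class Square(side: Double) extends Shape(4) with Named {
  def name: String = "square"
  def area: Double = side * side
}

// Object: a singleton instance, initialized lazily on first use.
object UnitSquare extends Square(1.0)
```

For example, `new Square(2.0).area` evaluates the concrete class, while `UnitSquare.describe` uses the single shared instance and the method inherited from the trait.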

Review: complexTypeExtractors.scala, jsonExpressions.scala (StructsToJson), functions.scala, generators.scala

- collection accumulator for error handling, whenThenCoalesce + MatchCase

- custom explode, join via UDF

Spark PRs: https://spark-prs.appspot.com/open-prs (participate and get involved with code reviews).

http://spark.apache.org/community.html

https://stackoverflow.com/questions/tagged/apache-spark


https://towardsdatascience.com/recognising-handwritten-digits-with-scala-9829d035f7bc

https://www.mapr.com/blog/spark-streaming-and-twitter-sentiment-analysis

From quick-start to scikit-learn: https://databricks.com/blog/2016/05/18/spark-mllib-from-quick-start-to-scikit-learn.html

http://www.slideshare.net/jonathandinu/the-data-scientists-guide-to-apache-spark

Kafka notes: https://daggubati-tech.blogspot.com/2018/12/steps-to-follow-to-pull-data-from.html
