Yay! I'm so excited you are interested in contributing to Sparkling ML!
The first thing you should do is subscribe to our mailing list. There isn't a lot of traffic yet, but we can work through any questions you have there together: https://groups.google.com/forum/#!forum/sparklingml-dev
Once you've subscribed, reach out about what kind of model/algorithm you want to bring into the fold.
If this is your first time adding a new model with Spark's pipeline API, here are some resources to get started with:
- (Blog post) O'Reilly Radar on extending Spark ML by Holden - https://www.oreilly.com/learning/extend-spark-ml-for-your-own-modeltransformer-types
- (Video) Spark Summit talk by Holden & Seth on extending Spark ML for custom models - https://www.youtube.com/watch?v=gCfVVrgWgxY
The goal of the project is not to be home to a lot of complex model code, but rather to help bring existing ML tools into Spark's pipeline API while making them accessible across Python, Scala, and Java.