Addition of syntactic sugar in Scala for MapReduce classes, Scala Implementations #3

Open: moore-ryan wants to merge 1 commit into master from bespin-scala-enhancement-final

moore-ryan

Addition of syntactic sugar in Scala for MapReduce classes, as well as Scala implementations of all Java MapReduce applications.

In addition to the implementations, local integration tests were written
to verify the correctness of the original Java implementations
running in local mode, and to ensure that the Scala
implementations return the same results.
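
The parity check described above can be sketched roughly as follows. This is only an illustration, not the test code from this PR: `ParityCheck`, `parse`, and `mismatches` are hypothetical names, and it assumes both jobs write plain-text output with one tab-separated key/value pair per line.

```scala
// Hypothetical helper, not part of this PR: compares the text output of a
// Java job and its Scala port, ignoring the order of the output lines.
object ParityCheck {
  /** Parses "key<TAB>value" lines into a map (assumes keys are unique). */
  def parse(lines: Seq[String]): Map[String, String] =
    lines.filter(_.nonEmpty).map { line =>
      val parts = line.split("\t", 2)
      parts(0) -> (if (parts.length > 1) parts(1) else "")
    }.toMap

  /** Returns the keys that are missing on one side or whose values differ. */
  def mismatches(javaOut: Seq[String], scalaOut: Seq[String]): Set[String] = {
    val j = parse(javaOut)
    val s = parse(scalaOut)
    (j.keySet ++ s.keySet).filter(k => j.get(k) != s.get(k))
  }
}
```

An empty result from `mismatches` means the two outputs agree key for key.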

Squashed commit message history:
Initial commit of MapReduce syntactic sugar and Bigram implementations
Moved MapReduce-specific utilities into io.bespin.scala.mapreduce.util
Small cleanup and comments
Added more comments, renamed some traits for clarity
Updated syntax for readability and conciseness
Added compare script, changed WordCount to use new style
Added ports of CooccurrenceMatrix stripes/pairs. Still need to add more comments and make things a bit more idiomatic (the window calculations are not very 'scala-y' right now)
Added block comments and changed the sliding window code to be a bit more Scala-idiomatic. Also added ported version of search code
Changed signature of TypedReducer's reduce method to use Scala iterables and remove the need for explicit conversion from Java
Added implicit conversion coverage for pair datatypes
Began unit tests; began work on the Scala MapReduce implementation of PageRank (basic PageRank results match)
Added more integration tests; integration tests now pull required source files from the internet before running
Major refactoring of many of the traits in MapReduceSugar
  This should make the code a bit more modular and simple. It also solves
  some of the problems with the previous implementation having trouble
  with things such as running a simple partition job with no mapper or
  reducer being set.
  Addition of more integration tests and some unit tests for Hadoop<->Scala conversions
Removed nullMapper/nullReducer objects in favor of having Optional mappers and reducers
Main BFS classes implemented and results match.
Removed unused "compareJavaScala.py" file.
Created proper maven target for integration tests.
Added integration tests for BFS, refactored other integration tests
Added integration tests for non-schimmy versions of PageRank. Fixed bug in Scala PageRank related to strange iterator behavior.
Added test fixture and integration tests for search/boolean retrieval
Full application parity between Scala and Java versions.
Fixed bug in RunPageRankSchimmy that caused failures in local mode of IMC due to mapper reuse.
  Added integration tests for PageRank and PageRankSchimmy.
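
For readers unfamiliar with the pairs and stripes formulations mentioned in the history above, here is a minimal, Hadoop-free sketch of the window calculation. It is not the code from this PR: `CooccurrenceSketch`, `pairs`, and `stripes` are hypothetical names, and the window is assumed to extend `window` tokens to each side of the current word.

```scala
// Hypothetical illustration, not the implementation in this PR.
object CooccurrenceSketch {
  /** Emits (word, neighbor) for every neighbor within `window` positions. */
  def pairs(tokens: Seq[String], window: Int): Seq[(String, String)] = {
    val indexed = tokens.toIndexedSeq
    for {
      i <- indexed.indices
      // Clamp the window to the bounds of the token sequence.
      j <- math.max(0, i - window) to math.min(indexed.length - 1, i + window)
      if j != i
    } yield (indexed(i), indexed(j))
  }

  /** Stripes variant: for each word, a map from neighbor to count. */
  def stripes(tokens: Seq[String], window: Int): Map[String, Map[String, Int]] =
    pairs(tokens, window)
      .groupBy(_._1)
      .map { case (word, ps) =>
        word -> ps.groupBy(_._2).map { case (neighbor, g) => neighbor -> g.size }
      }
}
```

In a MapReduce job the mapper would emit either the individual pairs or one stripe per word, and the reducer would sum the counts; here both steps are collapsed into plain collection operations for brevity.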
@moore-ryan moore-ryan force-pushed the bespin-scala-enhancement-final branch from ca69eaf to 3ff8b37 Compare April 17, 2016 19:30