This repository has been archived by the owner on Jun 9, 2021. It is now read-only.

What is Pair Heavy Style?

tdunning edited this page Sep 14, 2010 · 2 revisions

The pair heavy style is embodied in the following lines from the word count example:


    PTable<String, Integer> wc = words.map(new DoFn<String, Pair<String, Integer>>() {
      @Override
      public void process(String x, EmitFn<Pair<String, Integer>> emitter) {
        emitter.emit(Pair.create(x, 1));
      }
    }, tableOf(strings(), integers()));

Here, the DoFn has the type arguments String and Pair<String, Integer>, and the map method takes a
second argument, tableOf(strings(), integers()). This second argument is a type hint: it allows the result
type of the map method to differ according to the kind of hint provided, which is how map returns a PTable here.
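How a hint argument can select the result type may be easier to see in a small, self-contained sketch. The classes below are hypothetical stand-ins that only borrow the names used above (DoFn, EmitFn, Pair, PTable); they are not the project's actual implementation, and the hint factories take Class tokens purely as an assumption. The point is that two overloads of map differ only in the hint type, so the compiler picks the return type from the hint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the pair heavy style; none of these classes are the real library types.
class PairHeavySketch {
  static final class Pair<K, V> {
    final K key;
    final V value;
    private Pair(K key, V value) { this.key = key; this.value = value; }
    static <K, V> Pair<K, V> create(K key, V value) { return new Pair<>(key, value); }
  }

  // A single emitter type and a single DoFn type cover every case.
  interface EmitFn<T> { void emit(T x); }

  abstract static class DoFn<A, B> {
    public abstract void process(A x, EmitFn<B> emitter);
  }

  // Type hints: they carry no data, only the shape of the desired result (assumed design).
  static final class CollectionType<T> {}
  static final class TableType<K, V> {}

  static <T> CollectionType<T> collectionOf(Class<T> t) { return new CollectionType<>(); }
  static <K, V> TableType<K, V> tableOf(Class<K> k, Class<V> v) { return new TableType<>(); }

  static class PCollection<T> {
    final List<T> items;
    PCollection(List<T> items) { this.items = items; }

    // Collection hint: the result is a plain PCollection.
    <B> PCollection<B> map(DoFn<T, B> fn, CollectionType<B> hint) {
      return new PCollection<>(run(fn));
    }

    // Table hint: the same DoFn class, but emitting Pairs, yields a PTable.
    <K, V> PTable<K, V> map(DoFn<T, Pair<K, V>> fn, TableType<K, V> hint) {
      return new PTable<>(run(fn));
    }

    private <B> List<B> run(DoFn<T, B> fn) {
      List<B> out = new ArrayList<>();
      for (T x : items) {
        fn.process(x, out::add); // every emitted value is appended to the output
      }
      return out;
    }
  }

  static class PTable<K, V> extends PCollection<Pair<K, V>> {
    PTable(List<Pair<K, V>> items) { super(items); }
  }

  // Mirrors the word count lines above: each word becomes the pair (word, 1).
  static PTable<String, Integer> wordPairs(List<String> words) {
    PCollection<String> input = new PCollection<>(words);
    return input.map(new DoFn<String, Pair<String, Integer>>() {
      @Override
      public void process(String x, EmitFn<Pair<String, Integer>> emitter) {
        emitter.emit(Pair.create(x, 1));
      }
    }, tableOf(String.class, Integer.class));
  }
}
```

Calling PairHeavySketch.wordPairs on a list of words returns a PTable holding one (word, 1) pair per input word. Passing a collectionOf hint instead would select the other overload and return a plain PCollection.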

Without the pair heavy style, we need four versions of DoFn, with 2, 3 or 4 type parameters, and two kinds of EmitFn,
with one or two type parameters. We gain slightly by not requiring the types used in the second argument. Thus, the pair
heavy style saves us a net of 4 classes and also reduces the number of classes that are cognitively very similar.
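For comparison, here is a rough sketch of what the alternative might look like. The interface names are invented for illustration, and the split into these particular four DoFn variants is one plausible reading of the counts given above, not a description of any actual code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the style WITHOUT Pair: keys and values travel as separate arguments.
class UnpairedStyleSketch {
  // Two emitters: one for plain elements, one for key/value output.
  interface EmitFn1<T> { void emit(T x); }
  interface EmitFn2<K, V> { void emit(K key, V value); }

  // Four DoFn variants, one per combination of plain or key/value input and output.
  interface DoFnElemToElem<A, B> {               // 2 types
    void process(A x, EmitFn1<B> emitter);
  }
  interface DoFnElemToTable<A, K, V> {           // 3 types
    void process(A x, EmitFn2<K, V> emitter);
  }
  interface DoFnTableToElem<K, V, B> {           // 3 types
    void process(K key, V value, EmitFn1<B> emitter);
  }
  interface DoFnTableToTable<K, V, K2, V2> {     // 4 types
    void process(K key, V value, EmitFn2<K2, V2> emitter);
  }

  // The word count mapper in this style: no Pair is allocated per word.
  // The tally into a map is only here to make the sketch observable.
  static Map<String, Integer> countWords(List<String> words) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    DoFnElemToTable<String, String, Integer> fn = (x, emitter) -> emitter.emit(x, 1);
    for (String w : words) {
      fn.process(w, (k, v) -> counts.merge(k, v, Integer::sum));
    }
    return counts;
  }
}
```

The mapper itself is no longer than the pair heavy version, but a reader has to tell six nearly identical interfaces apart, which is the cognitive cost described above.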

There will also be a run-time cost for putting keys and values into Pair objects whenever we consume or produce
a table. My guess is that since these objects are so transient, the overhead may not even be measurable.
