Package

com.swoop.spark

accumulators

package accumulators

Type Members

  1. class ByKeyAdditiveAccumulator[A, B] extends AccumulatorV2[(A, B), Map[A, B]]

    An AccumulatorV2-style accumulator for collecting a map of sums.

    This accumulator is used in com.swoop.spark.records.AccumulatorMetrics to collect metrics about the execution of record building.

    Through the use of the scala.math.Numeric typeclass, the implementation can be used with any numeric type as well as any other value type that is "additive" (the implementation only uses zero and plus).
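To make the "only uses zero and plus" point concrete, here is a minimal standalone sketch of by-key summing driven by a Numeric instance. The object and method names (ByKeySumSketch, addByKey, mergeByKey) are hypothetical and are not part of this package; the sketch does not use Spark and is not the accumulator's actual implementation.

```scala
// Hypothetical illustration only: these names are not part of com.swoop.spark.accumulators.
object ByKeySumSketch {

  // Adds one (key, value) pair into the map, using only Numeric.zero and Numeric.plus.
  def addByKey[A, B](acc: Map[A, B], kv: (A, B))(implicit num: Numeric[B]): Map[A, B] = {
    val (key, value) = kv
    acc.updated(key, num.plus(acc.getOrElse(key, num.zero), value))
  }

  // Merges two partial maps, as would be needed when combining per-task results.
  def mergeByKey[A, B](left: Map[A, B], right: Map[A, B])(implicit num: Numeric[B]): Map[A, B] =
    right.foldLeft(left)((acc, kv) => addByKey(acc, kv))

  def main(args: Array[String]): Unit = {
    // Two partial results, as if they were produced by two separate tasks.
    val part1 = Seq("ok" -> 1L, "error" -> 1L, "ok" -> 1L)
      .foldLeft(Map.empty[String, Long])((acc, kv) => addByKey(acc, kv))
    val part2 = Seq("ok" -> 1L, "skipped" -> 2L)
      .foldLeft(Map.empty[String, Long])((acc, kv) => addByKey(acc, kv))

    println(mergeByKey(part1, part2))
  }
}
```

Because the helpers ask only for an implicit Numeric[B], the same code works unchanged for Int, Long, Double, or BigDecimal values.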

    For example:

    - You could define plus to be the equivalent of add for another accumulator, e.g., LongAccumulator, and then accumulate counts, sums, and averages by key.

    - You could define plus on a set to be a union operation and then this accumulator will operate as a collect_set by key.

    - You could define plus on an array to be concatenation. Combined with a maxLength parameter, this would allow a fast "first N by key" action that completes in a single map stage. The usual approach, a transformation, would require a grouping operation and a shuffle.
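The set-union idea from the list above can be sketched as a custom Numeric instance whose plus is union and whose zero is the empty set. Everything named here (SetUnionNumeric, CollectSetByKeySketch, addByKey) is hypothetical and standalone; it shows only the typeclass trick, under the assumption that the consumer calls nothing except zero and plus, so the remaining Numeric operations are left unsupported.

```scala
// Hypothetical illustration only: a Numeric whose "plus" is set union.
class SetUnionNumeric[T] extends Numeric[Set[T]] {
  private def unsupported(op: String): Nothing =
    throw new UnsupportedOperationException(s"$op is not defined for set union")

  // The only two operations an additive by-key aggregation needs.
  override def zero: Set[T] = Set.empty[T]
  def plus(x: Set[T], y: Set[T]): Set[T] = x union y

  // The rest of the Numeric contract is deliberately unsupported.
  def minus(x: Set[T], y: Set[T]): Set[T] = unsupported("minus")
  def times(x: Set[T], y: Set[T]): Set[T] = unsupported("times")
  def negate(x: Set[T]): Set[T] = unsupported("negate")
  def fromInt(x: Int): Set[T] = unsupported("fromInt")
  def toInt(x: Set[T]): Int = unsupported("toInt")
  def toLong(x: Set[T]): Long = unsupported("toLong")
  def toFloat(x: Set[T]): Float = unsupported("toFloat")
  def toDouble(x: Set[T]): Double = unsupported("toDouble")
  def compare(x: Set[T], y: Set[T]): Int = unsupported("compare")
  // Required by Scala 2.13+; a harmless extra method on older versions.
  def parseString(str: String): Option[Set[T]] = None
}

object CollectSetByKeySketch {
  // Same hypothetical helper shape as a by-key sum: only zero and plus are used.
  def addByKey[A, B](acc: Map[A, B], kv: (A, B))(implicit num: Numeric[B]): Map[A, B] =
    acc.updated(kv._1, num.plus(acc.getOrElse(kv._1, num.zero), kv._2))

  def main(args: Array[String]): Unit = {
    implicit val setUnion: Numeric[Set[String]] = new SetUnionNumeric[String]

    val events = Seq("u1" -> Set("a"), "u2" -> Set("b"), "u1" -> Set("c"), "u1" -> Set("a"))
    val collected = events.foldLeft(Map.empty[String, Set[String]])((acc, kv) => addByKey(acc, kv))

    // Each key now maps to the distinct values seen for it.
    println(collected)
  }
}
```

Overriding zero matters in this sketch: Numeric defines zero as fromInt(0) by default, which would throw here.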

    A

    Key type for the map to aggregate into.

    B

    Value type, supported by a scala.math.Numeric typeclass.
