Class StringIndexerAggregator

Object
org.apache.spark.sql.expressions.Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
org.apache.spark.ml.feature.StringIndexerAggregator
All Implemented Interfaces:
Serializable

public class StringIndexerAggregator extends Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
A SQL Aggregator used by StringIndexer to count labels in string columns during fitting.
  • Constructor Details

    • StringIndexerAggregator

      public StringIndexerAggregator(int numColumns)
  • Method Details

    • bufferEncoder

      public Encoder<org.apache.spark.util.collection.OpenHashMap<String,Object>[]> bufferEncoder()
      Description copied from class: Aggregator
      Specifies the Encoder for the intermediate value type.
      Specified by:
      bufferEncoder in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Returns:
      the Encoder for the intermediate value type, OpenHashMap<String,Object>[]
    • finish

      public org.apache.spark.util.collection.OpenHashMap<String,Object>[] finish(org.apache.spark.util.collection.OpenHashMap<String,Object>[] array)
      Description copied from class: Aggregator
      Transform the output of the reduction.
      Specified by:
      finish in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Parameters:
      array - the output of the reduction
      Returns:
      the final output value
    • merge

      public org.apache.spark.util.collection.OpenHashMap<String,Object>[] merge(org.apache.spark.util.collection.OpenHashMap<String,Object>[] array1, org.apache.spark.util.collection.OpenHashMap<String,Object>[] array2)
      Description copied from class: Aggregator
      Merge two intermediate values.
      Specified by:
      merge in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Parameters:
      array1 - the first intermediate value
      array2 - the second intermediate value
      Returns:
      the merged intermediate value
    • outputEncoder

      public Encoder<org.apache.spark.util.collection.OpenHashMap<String,Object>[]> outputEncoder()
      Description copied from class: Aggregator
      Specifies the Encoder for the final output value type.
      Specified by:
      outputEncoder in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Returns:
      the Encoder for the final output value type, OpenHashMap<String,Object>[]
    • reduce

      public org.apache.spark.util.collection.OpenHashMap<String,Object>[] reduce(org.apache.spark.util.collection.OpenHashMap<String,Object>[] array, Row row)
      Description copied from class: Aggregator
      Combine two values to produce a new value. For performance, the function may modify the buffer b (here, the array parameter) and return it instead of constructing a new object.
      Specified by:
      reduce in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Parameters:
      array - the current intermediate value
      row - the input Row to combine into the intermediate value
      Returns:
      the updated intermediate value
    • zero

      public org.apache.spark.util.collection.OpenHashMap<String,Object>[] zero()
      Description copied from class: Aggregator
      A zero value for this aggregation. Should satisfy the property that any b + zero = b.
      Specified by:
      zero in class Aggregator<Row,org.apache.spark.util.collection.OpenHashMap<String,Object>[],org.apache.spark.util.collection.OpenHashMap<String,Object>[]>
      Returns:
      the zero value, used as the initial intermediate value
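
The zero, reduce, merge and finish methods above follow the general Aggregator contract. The following is a minimal sketch of how such a contract could count labels per column. It is not the Spark implementation: the class name LabelCountSketch is hypothetical, java.util.HashMap stands in for OpenHashMap, a String[] stands in for Row, and it assumes the map values are occurrence counts and that finish returns the buffer unchanged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the zero/reduce/merge/finish contract; not Spark code. */
public class LabelCountSketch {
    private final int numColumns;

    public LabelCountSketch(int numColumns) {
        this.numColumns = numColumns;
    }

    /** Zero value: one empty count map per column, so that merge(b, zero) leaves b unchanged. */
    public List<Map<String, Long>> zero() {
        List<Map<String, Long>> buffer = new ArrayList<>();
        for (int i = 0; i < numColumns; i++) {
            buffer.add(new HashMap<>());
        }
        return buffer;
    }

    /** Folds one row into the buffer; mutates the buffer and returns it rather than copying. */
    public List<Map<String, Long>> reduce(List<Map<String, Long>> buffer, String[] row) {
        for (int i = 0; i < numColumns; i++) {
            buffer.get(i).merge(row[i], 1L, Long::sum);
        }
        return buffer;
    }

    /** Adds the counts of the second buffer into the first, column by column. */
    public List<Map<String, Long>> merge(List<Map<String, Long>> b1, List<Map<String, Long>> b2) {
        for (int i = 0; i < numColumns; i++) {
            Map<String, Long> target = b1.get(i);
            b2.get(i).forEach((label, count) -> target.merge(label, count, Long::sum));
        }
        return b1;
    }

    /** Assumed here to be the identity: the merged counts are already the final output. */
    public List<Map<String, Long>> finish(List<Map<String, Long>> buffer) {
        return buffer;
    }

    public static void main(String[] args) {
        LabelCountSketch agg = new LabelCountSketch(2);

        // Simulate two partitions, each reduced independently from its own zero value.
        List<Map<String, Long>> p1 = agg.zero();
        agg.reduce(p1, new String[] {"a", "x"});
        agg.reduce(p1, new String[] {"b", "x"});

        List<Map<String, Long>> p2 = agg.zero();
        agg.reduce(p2, new String[] {"a", "y"});

        List<Map<String, Long>> result = agg.finish(agg.merge(p1, p2));
        System.out.println(result.get(0).get("a")); // prints 2
        System.out.println(result.get(1).get("x")); // prints 2
    }
}
```

Mutating and returning the buffer in reduce and merge mirrors the performance note in the inherited reduce description; a sketch that copied the maps on every call would satisfy the same contract at a higher allocation cost.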