Class OneHotEncoderCommon

Object
org.apache.spark.ml.feature.OneHotEncoderCommon

public class OneHotEncoderCommon extends Object
Provides some helper methods used by OneHotEncoder.
  • Constructor Details

    • OneHotEncoderCommon

      public OneHotEncoderCommon()
  • Method Details

    • transformOutputColumnSchema

      public static StructField transformOutputColumnSchema(StructField inputCol, String outputColName, boolean dropLast, boolean keepInvalid)
      Prepares the StructField with proper metadata for OneHotEncoder's output column.
      Parameters:
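      inputCol - the schema field of the input column to be encoded
      outputColName - the name of the output column
      dropLast - whether the last category is dropped from the encoded vector
      keepInvalid - whether an extra category is kept for invalid values
      Returns:
      a StructField for the output column, carrying the attribute metadata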
    • getOutputAttrGroupFromData

      public static scala.collection.Seq<AttributeGroup> getOutputAttrGroupFromData(Dataset<?> dataset, scala.collection.Seq<String> inputColNames, scala.collection.Seq<String> outputColNames, boolean dropLast)
      Generates an AttributeGroup for each output column from the actual data, for use by the one-hot encoder.
      Parameters:
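      dataset - the dataset whose input columns are scanned
      inputColNames - the names of the input columns
      outputColNames - the names of the output columns, one per input column
      dropLast - whether the last category is dropped from the encoded vector
      Returns:
      one AttributeGroup per output column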
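      This page does not say how the number of attributes is derived from the data. The plain-Java sketch below is one plausible reading, not Spark's implementation. It assumes that category indices are non-negative whole numbers stored as doubles, so that the number of categories in a column is the largest index plus one. The class NumCategoriesSketch and the method numCategories are hypothetical names used only for illustration.

```java
// Hypothetical illustration, not part of Spark.
final class NumCategoriesSketch {

    // Assumption: each value is a category index (0, 1, 2, ...) stored as a double.
    // Returns the largest index plus one, or 0 for an empty column.
    static int numCategories(double[] indices) {
        double max = -1.0;
        for (double x : indices) {
            // Reject negatives, fractions, NaN, and values too large for an int count.
            if (x < 0.0 || x != Math.floor(x) || x >= Integer.MAX_VALUE) {
                throw new IllegalArgumentException(
                    "Not a valid category index: " + x);
            }
            max = Math.max(max, x);
        }
        return (int) max + 1;
    }

    public static void main(String[] args) {
        double[] column = {0.0, 2.0, 1.0, 2.0};
        System.out.println(numCategories(column)); // prints 3
    }
}
```

      Validating every value while scanning means a malformed column fails immediately instead of silently producing a wrongly sized output vector.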
    • createAttrGroupForAttrNames

      public static AttributeGroup createAttrGroupForAttrNames(String outputColName, int numAttrs, boolean dropLast, boolean keepInvalid)
      Creates an AttributeGroup containing the required number of BinaryAttribute entries.
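
      How numAttrs, dropLast, and keepInvalid combine is not spelled out here. The plain-Java sketch below follows the behavior documented for OneHotEncoder: "keep" adds an extra last category for invalid values, and dropLast removes the last category, so the two cancel when both are set. It is not Spark's implementation. The class OutputAttrNamesSketch, the method outputAttrNames, and the placeholder name "invalidValues" are assumptions used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of Spark.
final class OutputAttrNamesSketch {

    // Builds one attribute name per slot of the encoded output vector.
    static List<String> outputAttrNames(int numAttrs, boolean dropLast, boolean keepInvalid) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < numAttrs; i++) {
            names.add(Integer.toString(i)); // categories are named by index
        }
        if (dropLast && !keepInvalid) {
            // The last real category is dropped and is encoded as all zeros.
            if (!names.isEmpty()) {
                names.remove(names.size() - 1);
            }
        } else if (!dropLast && keepInvalid) {
            // An extra slot is appended for invalid values (placeholder name).
            names.add("invalidValues");
        }
        // dropLast && keepInvalid: the extra invalid slot is the one dropped,
        // so the names are unchanged. Neither flag set: unchanged as well.
        return names;
    }

    public static void main(String[] args) {
        System.out.println(outputAttrNames(3, true, false));  // [0, 1]
        System.out.println(outputAttrNames(3, false, true));  // [0, 1, 2, invalidValues]
        System.out.println(outputAttrNames(3, true, true));   // [0, 1, 2]
        System.out.println(outputAttrNames(3, false, false)); // [0, 1, 2]
    }
}
```

      Under these assumptions the output vector has numAttrs - 1, numAttrs, or numAttrs + 1 slots, depending on the two flags.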