Class MetadataUtils

Object
org.apache.spark.ml.util.MetadataUtils

public class MetadataUtils extends Object
Helper utilities for algorithms that use ML metadata.
  • Constructor Details

    • MetadataUtils

      public MetadataUtils()
  • Method Details

    • getNumClasses

      public static scala.Option<Object> getNumClasses(StructField labelSchema)
      Examines a schema to determine the number of classes in a label column. Returns None if the number of labels is not specified or if the label column is continuous.
      Parameters:
      labelSchema - Schema of the label column
      Returns:
      Number of classes in the label column, or None if that number is not specified or the column is continuous
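The contract described for getNumClasses can be modelled in plain Java. This is only a sketch of the documented behaviour, not Spark's implementation: the class NumClassesSketch, the enum LabelKind and the method numClasses are hypothetical names, java.util.Optional stands in for scala.Option, and treating a binary label as two classes is an assumption.

```java
import java.util.Optional;

// Hypothetical, Spark-free model of the getNumClasses contract.
class NumClassesSketch {
    // Hypothetical stand-in for the kind of metadata attached to a label column.
    public enum LabelKind { BINARY, NOMINAL, NUMERIC, UNRESOLVED }

    /**
     * @param kind      kind of label metadata
     * @param numValues number of label values, or null when the metadata does not specify it
     * @return the number of classes, or empty when it cannot be determined
     */
    public static Optional<Integer> numClasses(LabelKind kind, Integer numValues) {
        switch (kind) {
            case BINARY:
                return Optional.of(2); // assumption: a binary label has two classes
            case NOMINAL:
                return Optional.ofNullable(numValues); // empty if the count is unspecified
            default:
                return Optional.empty(); // continuous or unknown labels carry no class count
        }
    }

    public static void main(String[] args) {
        System.out.println(numClasses(LabelKind.NOMINAL, 3));    // prints "Optional[3]"
        System.out.println(numClasses(LabelKind.NUMERIC, null)); // prints "Optional.empty"
    }
}
```

A caller would typically check for the empty case first and fall back to counting distinct labels in the data, since an empty result means the schema alone does not say how many classes there are.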
    • getNumFeatures

      public static scala.Option<Object> getNumFeatures(StructField vectorSchema)
      Examines a schema to determine the number of features in a vector column. Returns None if the number of features is not specified.
      Parameters:
      vectorSchema - Schema of the vector column
      Returns:
      Number of features in the vector column, or None if that number is not specified
    • getCategoricalFeatures

      public static scala.collection.immutable.Map<Object,Object> getCategoricalFeatures(StructField featuresSchema)
      Examine a schema to identify categorical (Binary and Nominal) features.

      Parameters:
      featuresSchema - Schema of the features column. If a feature does not have metadata, it is assumed to be continuous. If a feature is Nominal, then it must have the number of values specified.
      Returns:
      Map: feature index to number of categories. The map's set of keys will be the set of categorical feature indices.
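The mapping described for getCategoricalFeatures can be illustrated with a small plain-Java model. It is a sketch under stated assumptions rather than Spark's code: CategoricalFeaturesSketch, Kind and FeatureAttr are hypothetical names, a null array entry stands for a feature without metadata, a binary feature is assumed to count as two categories, and rejecting a nominal feature without a value count with IllegalArgumentException is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, Spark-free model of the getCategoricalFeatures contract.
class CategoricalFeaturesSketch {
    public enum Kind { BINARY, NOMINAL, NUMERIC }

    // Hypothetical per-feature metadata; numValues is only meaningful for NOMINAL.
    public static final class FeatureAttr {
        final Kind kind;
        final Integer numValues; // null when the number of values is unspecified

        public FeatureAttr(Kind kind, Integer numValues) {
            this.kind = kind;
            this.numValues = numValues;
        }
    }

    /** Maps each categorical feature index to its number of categories. */
    public static Map<Integer, Integer> categoricalFeatures(FeatureAttr[] attrs) {
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < attrs.length; i++) {
            FeatureAttr attr = attrs[i];
            if (attr == null || attr.kind == Kind.NUMERIC) {
                continue; // no metadata or numeric: treated as continuous, so not in the map
            }
            if (attr.kind == Kind.BINARY) {
                result.put(i, 2); // assumption: binary means two categories
            } else if (attr.numValues == null) {
                throw new IllegalArgumentException(
                    "Nominal feature " + i + " must specify its number of values");
            } else {
                result.put(i, attr.numValues);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FeatureAttr[] attrs = {
            new FeatureAttr(Kind.NUMERIC, null),
            new FeatureAttr(Kind.BINARY, null),
            null,
            new FeatureAttr(Kind.NOMINAL, 4)
        };
        System.out.println(categoricalFeatures(attrs)); // prints "{1=2, 3=4}"
    }
}
```

Only indices 1 and 3 appear as keys, which matches the statement that the map's key set is exactly the set of categorical feature indices.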
    • getFeatureIndicesFromNames

      public static int[] getFeatureIndicesFromNames(StructField col, String[] names)
      Takes a Vector column and a list of feature names, and returns the corresponding list of feature indices in the column, in order.
      Parameters:
      col - Vector column which must have feature names specified via attributes
      names - List of feature names
      Returns:
      Indices of the named features within the column, in the same order as names
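The name-to-index lookup described for getFeatureIndicesFromNames can be sketched in plain Java as follows. This is a hypothetical model, not Spark's implementation: FeatureIndexLookupSketch and indicesFromNames are invented names, the column's feature names are passed as a plain array where position equals feature index, and failing on an unknown name with IllegalArgumentException is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Spark-free model of the getFeatureIndicesFromNames contract.
class FeatureIndexLookupSketch {
    /**
     * @param columnFeatureNames feature names of the vector column; position i names feature i
     * @param names              feature names to look up
     * @return index of each requested name, in the same order as names
     */
    public static int[] indicesFromNames(String[] columnFeatureNames, String[] names) {
        // Build the name -> index table once so each lookup is constant time.
        Map<String, Integer> indexByName = new HashMap<>();
        for (int i = 0; i < columnFeatureNames.length; i++) {
            indexByName.put(columnFeatureNames[i], i);
        }
        int[] indices = new int[names.length];
        for (int j = 0; j < names.length; j++) {
            Integer index = indexByName.get(names[j]);
            if (index == null) {
                // Assumption: an unknown name is an error rather than being skipped.
                throw new IllegalArgumentException("No feature named " + names[j]);
            }
            indices[j] = index;
        }
        return indices;
    }

    public static void main(String[] args) {
        String[] column = {"age", "height", "weight"};
        int[] indices = indicesFromNames(column, new String[] {"weight", "age"});
        System.out.println(Arrays.toString(indices)); // prints "[2, 0]"
    }
}
```

The result follows the order of the requested names, not the order of features in the column, which is why "weight" comes first in the output.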