Class ANOVATest

Object
org.apache.spark.ml.stat.ANOVATest

public class ANOVATest extends Object
ANOVA Test for continuous data.

See Wikipedia for more information on the ANOVA test.

  • Constructor Details

    • ANOVATest

      public ANOVATest()
  • Method Details

    • test

      public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol)
      Parameters:
      dataset - DataFrame of categorical labels and continuous features.
      featuresCol - Name of features column in dataset, of type Vector (VectorUDT)
      labelCol - Name of label column in dataset, of any numerical type
      Returns:
      DataFrame containing the test result for every feature against the label. This DataFrame will contain a single Row with the following fields:
        - pValues: Vector
        - degreesOfFreedom: Array[Long]
        - fValues: Vector
      Each of these fields has one value per feature.
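      To make the per-feature result values more concrete, the following is a minimal plain-Java sketch of the textbook one-way ANOVA F statistic for a single continuous feature grouped by a categorical label. It is not Spark's implementation: the class OneWayAnovaSketch, its Result type, and the fTest method are hypothetical names used only for illustration. The sketch reports the between-group and within-group degrees of freedom separately; how Spark combines them into its single degreesOfFreedom value per feature is not stated on this page. The p-value is omitted because it requires the F-distribution CDF.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: textbook one-way ANOVA for one feature (hypothetical helper, not Spark code). */
public class OneWayAnovaSketch {

    /** F statistic and the two degrees-of-freedom terms for one feature. */
    public static final class Result {
        public final double fValue;
        public final long dfBetween; // number of classes - 1
        public final long dfWithin;  // number of samples - number of classes

        Result(double fValue, long dfBetween, long dfWithin) {
            this.fValue = fValue;
            this.dfBetween = dfBetween;
            this.dfWithin = dfWithin;
        }
    }

    /**
     * Computes the one-way ANOVA F statistic for one continuous feature.
     * values[i] is the feature value of sample i and labels[i] its class.
     */
    public static Result fTest(double[] values, int[] labels) {
        if (values.length != labels.length) {
            throw new IllegalArgumentException("values and labels must have the same length");
        }

        // Per-class statistics: label -> {count, sum}.
        Map<Integer, double[]> stats = new LinkedHashMap<>();
        double total = 0.0;
        for (int i = 0; i < values.length; i++) {
            double[] s = stats.computeIfAbsent(labels[i], k -> new double[2]);
            s[0] += 1.0;
            s[1] += values[i];
            total += values[i];
        }

        int n = values.length;
        int k = stats.size();
        if (k < 2 || n <= k) {
            throw new IllegalArgumentException("need at least 2 classes and more samples than classes");
        }
        double grandMean = total / n;

        // Between-group sum of squares: class size times squared distance of class mean from grand mean.
        double ssBetween = 0.0;
        for (double[] s : stats.values()) {
            double diff = s[1] / s[0] - grandMean;
            ssBetween += s[0] * diff * diff;
        }

        // Within-group sum of squares: squared distance of each sample from its own class mean.
        double ssWithin = 0.0;
        for (int i = 0; i < n; i++) {
            double[] s = stats.get(labels[i]);
            double diff = values[i] - s[1] / s[0];
            ssWithin += diff * diff;
        }

        long dfBetween = k - 1;
        long dfWithin = n - k;
        // F is the ratio of the between-group mean square to the within-group mean square.
        double f = (ssBetween / dfBetween) / (ssWithin / dfWithin);
        return new Result(f, dfBetween, dfWithin);
    }
}
```

      For example, calling fTest with values {1, 2, 3, 2, 3, 4, 5, 6, 7} and labels {0, 0, 0, 1, 1, 1, 2, 2, 2} gives class means 2, 3 and 6, a between-group sum of squares of 26 with 2 degrees of freedom, and a within-group sum of squares of 6 with 6 degrees of freedom, so F = 13.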
    • test

      public static Dataset<Row> test(Dataset<Row> dataset, String featuresCol, String labelCol, boolean flatten)
      Parameters:
      dataset - DataFrame of categorical labels and continuous features.
      featuresCol - Name of features column in dataset, of type Vector (VectorUDT)
      labelCol - Name of label column in dataset, of any numerical type
      flatten - If false, the returned DataFrame contains only a single Row; otherwise, it contains one row per feature.
      Returns:
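      DataFrame containing the test result for every feature against the label: a single Row if flatten is false, otherwise one row per feature.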
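      The two overloads above might be called from Java roughly as follows. This is a usage sketch, not an official example: it assumes Spark SQL and MLlib are on the classpath in a version that includes ANOVATest, and the class name AnovaTestUsageSketch, the sampleData helper, the column names "label" and "features", and the toy data are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.ml.linalg.VectorUDT;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.ml.stat.ANOVATest;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

/** Illustrative usage of ANOVATest.test (hypothetical class and data). */
public class AnovaTestUsageSketch {

    /** Builds a toy DataFrame: a numeric label column and a 3-element features vector column. */
    static Dataset<Row> sampleData(SparkSession spark) {
        List<Row> rows = Arrays.asList(
            RowFactory.create(0.0, Vectors.dense(1.0, 10.0, 0.5)),
            RowFactory.create(0.0, Vectors.dense(2.0, 12.0, 0.7)),
            RowFactory.create(0.0, Vectors.dense(3.0, 11.0, 0.6)),
            RowFactory.create(1.0, Vectors.dense(5.0, 10.5, 0.4)),
            RowFactory.create(1.0, Vectors.dense(6.0, 11.5, 0.9)),
            RowFactory.create(1.0, Vectors.dense(7.0, 12.5, 0.8)));

        StructType schema = new StructType(new StructField[] {
            new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
            new StructField("features", new VectorUDT(), false, Metadata.empty())
        });
        return spark.createDataFrame(rows, schema);
    }

    public static void main(String[] args) {
        // Local session for experimentation only; a real job would be configured by its launcher.
        SparkSession spark = SparkSession.builder()
            .appName("AnovaTestUsageSketch")
            .master("local[*]")
            .getOrCreate();

        Dataset<Row> df = sampleData(spark);

        // Three-argument overload: a single Row holding pValues, degreesOfFreedom and fValues.
        Dataset<Row> single = ANOVATest.test(df, "features", "label");
        single.show(false);

        // Four-argument overload with flatten = true: one row per feature.
        Dataset<Row> perFeature = ANOVATest.test(df, "features", "label", true);
        perFeature.show(false);

        spark.stop();
    }
}
```

      Passing flatten = true can be convenient when the result is joined or filtered per feature, since each feature's result then sits in its own row rather than inside a vector or array.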