Interface TreeEnsembleParams

All Superinterfaces:
DecisionTreeParams, HasCheckpointInterval, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasSeed, HasWeightCol, Identifiable, Params, PredictorParams, Serializable, scala.Serializable
All Known Subinterfaces:
GBTClassifierParams, GBTParams, GBTRegressorParams, RandomForestClassifierParams, RandomForestParams, RandomForestRegressorParams, TreeEnsembleClassifierParams, TreeEnsembleRegressorParams
All Known Implementing Classes:
GBTClassificationModel, GBTClassifier, GBTRegressionModel, GBTRegressor, RandomForestClassificationModel, RandomForestClassifier, RandomForestRegressionModel, RandomForestRegressor

public interface TreeEnsembleParams extends DecisionTreeParams
Parameters for Decision Tree-based ensemble algorithms.

Note: Currently marked as private; this may be made public in the future.

  • Method Details

    • subsamplingRate

      DoubleParam subsamplingRate()
      Fraction of the training data used for learning each decision tree, in range (0, 1]. (default = 1.0)
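
      The documented range (0, 1] can be checked before a value is passed to the parameter. The following is a minimal sketch, not Spark code: the class SubsamplingRateSketch and both of its methods are hypothetical names introduced for illustration, and the row count it returns is only the expected sample size, since the actual sampling is random.

```java
// Hypothetical helper, not part of Spark: illustrates the documented
// range (0, 1] for subsamplingRate.
final class SubsamplingRateSketch {

    // True when rate lies in (0, 1]; NaN fails both comparisons.
    static boolean isValidRate(double rate) {
        return rate > 0.0 && rate <= 1.0;
    }

    // Expected number of training rows seen by each tree.
    // Assumption: each tree samples the data independently at the given rate.
    static long expectedRowsPerTree(long numRows, double rate) {
        if (numRows < 0 || !isValidRate(rate)) {
            throw new IllegalArgumentException(
                "numRows must be >= 0 and rate must be in (0, 1], got " + numRows + ", " + rate);
        }
        return Math.round(numRows * rate);
    }
}
```

      With the default of 1.0, the expected sample for each tree is the full training set.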
      Returns:
      (undocumented)
    • getSubsamplingRate

      double getSubsamplingRate()
    • getOldStrategy

      Strategy getOldStrategy(scala.collection.immutable.Map<Object,Object> categoricalFeatures, int numClasses, scala.Enumeration.Value oldAlgo, Impurity oldImpurity)
      Create a Strategy instance to use with the old API. NOTE: The caller should set impurity and seed.
      Parameters:
      categoricalFeatures - (undocumented)
      numClasses - (undocumented)
      oldAlgo - (undocumented)
      oldImpurity - (undocumented)
      Returns:
      (undocumented)
    • featureSubsetStrategy

      Param<String> featureSubsetStrategy()
      The number of features to consider for splits at each tree node. Supported options:
      - "auto": choose automatically for the task. If numTrees == 1, set to "all". If numTrees is greater than 1 (forest), set to "sqrt" for classification and to "onethird" for regression.
      - "all": use all features
      - "onethird": use 1/3 of the features
      - "sqrt": use sqrt(number of features)
      - "log2": use log2(number of features)
      - "n": when n is in the range (0, 1.0], use n * number of features. When n is in the range (1, number of features), use n features.
      (default = "auto")

      These settings are based on the following references:
      - log2: tested in Breiman (2001)
      - sqrt: recommended by the Breiman manual for random forests
      - The defaults of sqrt (classification) and onethird (regression) match the R randomForest package.
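
      The supported options can be read as a mapping from a strategy string to a feature count per node. The sketch below illustrates that mapping under stated assumptions and is not Spark's implementation: the class FeatureSubsetSketch and its methods are hypothetical, fractional results are rounded up so that at least one feature is always used, an integer n larger than the number of features is capped, and the bare string "1" is read as the fraction 1.0 following the documented range (0, 1.0].

```java
// Hypothetical helper, not part of Spark: maps a featureSubsetStrategy value
// to the number of features considered at each tree node.
final class FeatureSubsetSketch {

    static int resolve(String strategy, int numFeatures, int numTrees, boolean classification) {
        if (numFeatures < 1 || numTrees < 1) {
            throw new IllegalArgumentException("numFeatures and numTrees must be >= 1");
        }
        String s = strategy;
        if (s.equals("auto")) {
            // Single tree: all features. Forest: sqrt (classification) or onethird (regression).
            if (numTrees == 1) {
                s = "all";
            } else {
                s = classification ? "sqrt" : "onethird";
            }
        }
        switch (s) {
            case "all":
                return numFeatures;
            case "onethird":
                // Assumption: round up.
                return (int) Math.ceil(numFeatures / 3.0);
            case "sqrt":
                // Assumption: round up.
                return (int) Math.ceil(Math.sqrt(numFeatures));
            case "log2":
                // Exact integer ceil(log2(numFeatures)), with a floor of one feature.
                return Math.max(1, 32 - Integer.numberOfLeadingZeros(numFeatures - 1));
            default:
                return resolveNumeric(s, numFeatures);
        }
    }

    // Handles the "n" option: a fraction in (0, 1.0] or a whole number above 1.
    private static int resolveNumeric(String s, int numFeatures) {
        double n;
        try {
            n = Double.parseDouble(s);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Unsupported featureSubsetStrategy: " + s);
        }
        if (n > 0.0 && n <= 1.0) {
            // Fraction of the features. Assumption: round up.
            return (int) Math.ceil(n * numFeatures);
        }
        if (n > 1.0 && n == Math.rint(n)) {
            // Whole-number count. Assumption: cap at the number of features.
            return (int) Math.min(n, numFeatures);
        }
        throw new IllegalArgumentException("Unsupported featureSubsetStrategy: " + s);
    }
}
```

      For instance, FeatureSubsetSketch.resolve("auto", 100, 20, true) follows the "sqrt" branch and yields 10 under these assumptions, while the same call with numTrees set to 1 yields all 100 features.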

      Returns:
      (undocumented)
    • getFeatureSubsetStrategy

      String getFeatureSubsetStrategy()