Class ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>>

Type Parameters:
FeaturesType - Type of input features. E.g., Vector
M - Concrete Model type
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, ClassifierParams, Params, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasRawPredictionCol, PredictorParams, Identifiable, scala.Serializable
Direct Known Subclasses:
LinearSVCModel, ProbabilisticClassificationModel

public abstract class ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> extends PredictionModel<FeaturesType,M> implements ClassifierParams
Model produced by a Classifier. Classes are indexed {0, 1, ..., numClasses - 1}.

  • Constructor Details

    • ClassificationModel

      public ClassificationModel()
  • Method Details

    • numClasses

      public abstract int numClasses()
      Number of classes (values which the label can take).
    • predict

      public double predict(FeaturesType features)
      Predict label for the given features. This method is used to implement transform() and output PredictionModel.predictionCol().

      This default implementation for classification predicts the index of the maximum value from predictRaw().

      Specified by:
      predict in class PredictionModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>>
      Parameters:
      features - features of the instance to predict
      Returns:
      predicted label
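
      The default behavior described above (take the index of the largest raw prediction) can be sketched in plain Java. This is an illustration only, not Spark's implementation: a double[] stands in for Vector, RawPredictionArgmax is a hypothetical name, and returning the lowest index on ties is an assumption of this sketch.

```java
// Illustrative sketch only: a plain double[] stands in for Spark's Vector,
// and RawPredictionArgmax is a hypothetical helper, not part of Spark.
final class RawPredictionArgmax {
    private RawPredictionArgmax() {}

    /** Returns the index of the largest raw prediction, as a double label. */
    static double predict(double[] rawPrediction) {
        if (rawPrediction.length == 0) {
            throw new IllegalArgumentException("rawPrediction must not be empty");
        }
        int best = 0;
        for (int i = 1; i < rawPrediction.length; i++) {
            // Strict comparison keeps the lowest index on ties
            // (an assumption of this sketch).
            if (rawPrediction[i] > rawPrediction[best]) {
                best = i;
            }
        }
        return (double) best;
    }
}
```

      For example, raw predictions of {-1.2, 0.4, 3.1} yield the label 2.0, because element 2 is the largest.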
    • predictRaw

      public abstract Vector predictRaw(FeaturesType features)
      Raw prediction for each possible label. The meaning of a "raw" prediction may vary between algorithms, but it intuitively gives a measure of confidence in each possible label (where larger = more confident). This internal method is used to implement transform() and output rawPredictionCol().

      Parameters:
      features - features of the instance to predict
      Returns:
      vector where element i is the raw prediction for label i. This raw prediction may be any real number, where a larger value indicates greater confidence for that label.
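
      Since the meaning of a raw prediction varies between algorithms, a concrete case may help. The sketch below assumes a binary linear model and one possible convention: the signed margin is the raw prediction for label 1 and its negation is the raw prediction for label 0. BinaryLinearRaw and its parameters are hypothetical names, and double[] stands in for Vector.

```java
// Illustrative sketch only: one possible raw-prediction convention for a
// binary linear model. BinaryLinearRaw is hypothetical, not part of Spark.
final class BinaryLinearRaw {
    private BinaryLinearRaw() {}

    /**
     * Computes margin = intercept + weights . features and returns
     * {-margin, margin}, so element i is the raw prediction for label i.
     */
    static double[] predictRaw(double[] weights, double intercept, double[] features) {
        if (weights.length != features.length) {
            throw new IllegalArgumentException("weights and features must have equal length");
        }
        double margin = intercept;
        for (int i = 0; i < weights.length; i++) {
            margin += weights[i] * features[i];
        }
        // A larger value means greater confidence in that label.
        return new double[] {-margin, margin};
    }
}
```

      With weights {1.0, -2.0}, intercept 0.5, and features {2.0, 1.0}, the margin is 0.5, so the raw prediction is {-0.5, 0.5} and label 1 has the larger value.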
    • rawPredictionCol

      public final Param<String> rawPredictionCol()
      Description copied from interface: HasRawPredictionCol
      Param for raw prediction (a.k.a. confidence) column name.
      Specified by:
      rawPredictionCol in interface HasRawPredictionCol
    • setRawPredictionCol

      public M setRawPredictionCol(String value)
    • transform

      public Dataset<Row> transform(Dataset<?> dataset)
      Transforms dataset by reading from PredictionModel.featuresCol(), and appending new columns as specified by parameters:
        - predicted labels as PredictionModel.predictionCol() of type Double
        - raw predictions (confidences) as rawPredictionCol() of type Vector

      Overrides:
      transform in class PredictionModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>>
      Parameters:
      dataset - input dataset
      Returns:
      transformed dataset
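
      The column-appending behavior can be pictured with a small stand-alone sketch. It does not use Spark: each row is modeled as a Map, double[] stands in for Vector, and TransformSketch and its parameters are hypothetical names chosen for illustration. The prediction is derived from the raw prediction by taking the index of its largest element.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: rows are Maps and double[] stands in for Vector.
// TransformSketch is hypothetical, not part of Spark.
final class TransformSketch {
    private TransformSketch() {}

    /** Returns copies of the input rows with two appended columns. */
    static List<Map<String, Object>> transform(
            List<Map<String, Object>> rows,
            String featuresCol,
            String predictionCol,
            String rawPredictionCol,
            Function<double[], double[]> predictRaw) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            // Read the features column of this row.
            double[] features = (double[]) row.get(featuresCol);
            double[] raw = predictRaw.apply(features);
            // The predicted label is the index of the largest raw prediction.
            int best = 0;
            for (int i = 1; i < raw.length; i++) {
                if (raw[i] > raw[best]) {
                    best = i;
                }
            }
            // Copy the row so the input is left unchanged, then append columns.
            Map<String, Object> copy = new LinkedHashMap<>(row);
            copy.put(rawPredictionCol, raw);
            copy.put(predictionCol, (double) best);
            out.add(copy);
        }
        return out;
    }
}
```

      The input rows are left untouched, and each output row carries the original columns plus the raw prediction and the predicted label.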
    • transformImpl

      public final Dataset<Row> transformImpl(Dataset<?> dataset)
    • transformSchema

      public StructType transformSchema(StructType schema)
      Description copied from class: PipelineStage
      Check transform validity and derive the output schema from the input schema.

      We check validity for interactions between parameters during transformSchema and raise an exception if any parameter value is invalid. Parameter value checks which do not depend on other parameters are handled by Param.validate().

      A typical implementation should first verify the schema change and the validity of the parameters, including checks on complex parameter interactions.

      Overrides:
      transformSchema in class PredictionModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>>
      Parameters:
      schema - input schema