Class KMeansModel

Object
org.apache.spark.mllib.clustering.KMeansModel
All Implemented Interfaces:
Serializable, PMMLExportable, Saveable, scala.Serializable
Direct Known Subclasses:
StreamingKMeansModel

public class KMeansModel extends Object implements Saveable, scala.Serializable, PMMLExportable
A clustering model for K-means. Each point belongs to the cluster with the closest center.
  • Constructor Details

    • KMeansModel

      public KMeansModel(Vector[] clusterCenters, String distanceMeasure, double trainingCost, int numIter)
    • KMeansModel

      public KMeansModel(Vector[] clusterCenters)
    • KMeansModel

      public KMeansModel(Iterable<Vector> centers)
      A Java-friendly constructor that takes an Iterable of Vectors.
      Parameters:
      centers - the cluster centers, one Vector per cluster
  • Method Details

    • load

      public static KMeansModel load(SparkContext sc, String path)
    • clusterCenters

      public Vector[] clusterCenters()
    • distanceMeasure

      public String distanceMeasure()
    • trainingCost

      public double trainingCost()
    • k

      public int k()
      Total number of clusters.
      Returns:
      the total number of clusters
    • predict

      public int predict(Vector point)
      Returns the cluster index that a given point belongs to.
      Parameters:
      point - the point to assign to a cluster
      Returns:
      the index of the cluster whose center is closest to the point
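      The assignment rule described for predict can be illustrated without Spark. The following is a minimal plain-Java sketch, not Spark's implementation: the class NearestCenterSketch and its double[] inputs are hypothetical stand-ins for KMeansModel and Vector, and it assumes Euclidean distance, whereas the real model uses whatever distanceMeasure() reports.

```java
// Plain-Java sketch of nearest-center assignment; NOT Spark's implementation.
// NearestCenterSketch and its double[] inputs are hypothetical stand-ins
// for KMeansModel and Vector. Euclidean distance is assumed.
public class NearestCenterSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the center closest to the point; ties go to the lowest index.
    static int predict(double[][] centers, double[] point) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centers.length; i++) {
            double dist = squaredDistance(centers[i], point);
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        System.out.println(predict(centers, new double[] {1.0, 2.0})); // prints 0
        System.out.println(predict(centers, new double[] {9.0, 8.0})); // prints 1
    }
}
```

      Comparing squared distances avoids a square root and does not change which center is closest.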
    • predict

      public RDD<Object> predict(RDD<Vector> points)
      Maps given points to their cluster indices.
      Parameters:
      points - the points to assign to clusters
      Returns:
      an RDD containing the cluster index of each point
    • predict

      public JavaRDD<Integer> predict(JavaRDD<Vector> points)
      Maps given points to their cluster indices.
      Parameters:
      points - the points to assign to clusters
      Returns:
      a JavaRDD containing the cluster index of each point
    • computeCost

      public double computeCost(RDD<Vector> data)
      Returns the K-means cost (the sum of squared distances from each point to its nearest center) for this model on the given data.
      Parameters:
      data - the points over which to compute the cost
      Returns:
      the K-means cost of this model on the given data
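      To make the cost definition concrete, here is a minimal plain-Java sketch of the same quantity computed over in-memory arrays. It is an illustration under stated assumptions, not Spark's implementation: KMeansCostSketch and its double[][] inputs are hypothetical stand-ins for the model and an RDD<Vector>, and Euclidean distance is assumed.

```java
// Plain-Java sketch of the K-means cost; NOT Spark's implementation.
// KMeansCostSketch and its double[][] inputs are hypothetical stand-ins
// for KMeansModel and RDD<Vector>. Euclidean distance is assumed.
public class KMeansCostSketch {

    // Squared Euclidean distance between two points of equal dimension.
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Sum, over all points, of the squared distance to the nearest center.
    static double computeCost(double[][] centers, double[][] data) {
        double cost = 0.0;
        for (double[] point : data) {
            double nearest = Double.POSITIVE_INFINITY;
            for (double[] center : centers) {
                nearest = Math.min(nearest, squaredDistance(center, point));
            }
            cost += nearest;
        }
        return cost;
    }

    public static void main(String[] args) {
        double[][] centers = { {0.0, 0.0}, {10.0, 10.0} };
        double[][] data = { {1.0, 2.0}, {9.0, 8.0}, {0.0, 0.0} };
        // Per-point contributions: 5.0 + 5.0 + 0.0
        System.out.println(computeCost(centers, data)); // prints 10.0
    }
}
```

      A lower cost means the points lie closer to their assigned centers.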
    • save

      public void save(SparkContext sc, String path)
      Description copied from interface: Saveable
      Save this model to the given path.

      This saves:
      - human-readable (JSON) model metadata to path/metadata/
      - Parquet formatted data to path/data/

      The model may be loaded using Loader.load.

      Specified by:
      save in interface Saveable
      Parameters:
      sc - Spark context used to save model data.
      path - Path specifying the directory in which to save this model. If the directory already exists, this method throws an exception.