Interface KMeansParams

All Superinterfaces:
HasDistanceMeasure, HasFeaturesCol, HasMaxBlockSizeInMB, HasMaxIter, HasPredictionCol, HasSeed, HasSolver, HasTol, HasWeightCol, Identifiable, Params, Serializable, scala.Serializable
All Known Implementing Classes:
KMeans, KMeansModel

Common params for KMeans and KMeansModel.
  • Method Details

    • getInitMode

      String getInitMode()
    • getInitSteps

      int getInitSteps()
    • getK

      int getK()
    • initMode

      Param<String> initMode()
      Param for the initialization algorithm. This can be either "random" to choose random points as initial cluster centers, or "k-means||" to use a parallel variant of k-means++ (Bahmani et al., Scalable K-Means++, VLDB 2012). Default: k-means||.
      Returns:
      (undocumented)
    • initSteps

      IntParam initSteps()
      Param for the number of steps for the k-means|| initialization mode. This is an advanced setting; the default of 2 is almost always enough. Must be > 0. Default: 2.
      Returns:
      (undocumented)
    • k

      IntParam k()
      The number of clusters to create (k). Must be > 1. Note that fewer than k clusters may be returned, for example, if there are fewer than k distinct points to cluster. Default: 2.
      Returns:
      (undocumented)
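The documented constraints on initMode, initSteps, and k can be summarized in a small stand-alone check. This is only a sketch of the rules stated above, not Spark's validation code; the class KMeansParamCheck and its check method are hypothetical names introduced for illustration.

```java
import java.util.Set;

// Hypothetical helper, not part of Spark: it mirrors only the constraints
// stated in the parameter descriptions above.
final class KMeansParamCheck {
    // The two initialization modes named in the initMode description.
    static final Set<String> INIT_MODES = Set.of("random", "k-means||");

    static void check(int k, String initMode, int initSteps) {
        if (k <= 1) {
            throw new IllegalArgumentException("k must be > 1, got " + k);
        }
        if (initMode == null || !INIT_MODES.contains(initMode)) {
            throw new IllegalArgumentException("unsupported initMode: " + initMode);
        }
        if (initSteps <= 0) {
            throw new IllegalArgumentException("initSteps must be > 0, got " + initSteps);
        }
    }

    public static void main(String[] args) {
        // The documented defaults pass the check.
        check(2, "k-means||", 2);
        try {
            check(1, "random", 2); // k = 1 violates "Must be > 1"
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```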
    • solver

      Param<String> solver()
      Param for the name of the optimization method used in KMeans. Supported options:
        - "auto": automatically select the solver based on the input schema and sparsity. If input instances are arrays or input vectors are dense, use "block"; otherwise, use "row".
        - "row": input instances are processed row by row, and the triangle inequality is applied to accelerate training.
        - "block": input instances are stacked into blocks, and GEMM is applied to compute the distances.
      Default: "auto".

      Specified by:
      solver in interface HasSolver
      Returns:
      (undocumented)
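The two ideas named in the solver description can be illustrated with plain Java. This is a minimal sketch assuming Euclidean distance, not Spark's implementation: the class SolverSketch and its methods are hypothetical. The "block" part uses the identity ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x·c, so all pairwise distances follow from one matrix product (which a GEMM routine would compute). The "row" part uses the triangle-inequality bound that if d(a, b) >= 2 d(p, a), then center b cannot be closer to p than center a.

```java
// Hypothetical illustration, not Spark code. Assumes Euclidean distance.
final class SolverSketch {
    static double sqNorm(double[] v) {
        double s = 0.0;
        for (double e : v) s += e * e;
        return s;
    }

    static double dist(double[] a, double[] b) {
        double s = 0.0;
        for (int t = 0; t < a.length; t++) s += (a[t] - b[t]) * (a[t] - b[t]);
        return Math.sqrt(s);
    }

    // "block" idea: squared distances between every point and every center,
    // built from dot products. The inner loop computes entry (i, j) of
    // X * C^T; a BLAS GEMM call would produce the whole product at once.
    static double[][] blockSquaredDistances(double[][] x, double[][] c) {
        double[][] out = new double[x.length][c.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < c.length; j++) {
                double dot = 0.0;
                for (int t = 0; t < c[j].length; t++) dot += x[i][t] * c[j][t];
                // Clamp at zero to absorb floating-point round-off.
                out[i][j] = Math.max(0.0, sqNorm(x[i]) + sqNorm(c[j]) - 2.0 * dot);
            }
        }
        return out;
    }

    // "row" idea: find the nearest center for one point, skipping centers
    // that the triangle inequality rules out. A real implementation would
    // precompute the center-to-center distances once per iteration.
    static int nearestWithPruning(double[] p, double[][] c) {
        int best = 0;
        double bestDist = dist(p, c[0]);
        for (int j = 1; j < c.length; j++) {
            if (dist(c[best], c[j]) >= 2.0 * bestDist) continue; // c[j] cannot win
            double dj = dist(p, c[j]);
            if (dj < bestDist) {
                best = j;
                bestDist = dj;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {10, 10}};
        double[][] centers = {{0, 1}, {10, 9}};
        double[][] d2 = blockSquaredDistances(points, centers);
        System.out.println(d2[0][0] + " " + d2[0][1]);
        System.out.println(nearestWithPruning(points[0], centers));
        System.out.println(nearestWithPruning(points[1], centers));
    }
}
```

Both routines give the same assignments as a brute-force search; they differ only in how much distance computation they avoid or batch.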
    • validateAndTransformSchema

      StructType validateAndTransformSchema(StructType schema)
      Validates and transforms the input schema.
      Parameters:
      schema - input schema
      Returns:
      output schema