Object org.apache.spark.mllib.clustering.GaussianMixture
public class GaussianMixture
:: Experimental ::
This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). A GMM represents a composite distribution of independent Gaussian distributions with associated "mixing" weights that specify each component's contribution to the composite.
Given a set of sample points, this class will maximize the log-likelihood for a mixture of k Gaussians, iterating until the log-likelihood changes by less than convergenceTol or until it reaches the maximum number of iterations (maxIterations). While this process is generally guaranteed to converge, it is not guaranteed to find a global optimum.
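In symbols, the quantities described above can be written as follows. The notation here ($w_j$, $\mu_j$, $\Sigma_j$, $x_i$, $n$, and the iteration index $t$) is introduced only for illustration and does not appear in the API.

```latex
% Composite density: k Gaussians combined through mixing weights.
p(x) = \sum_{j=1}^{k} w_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j),
\qquad w_j \ge 0, \quad \sum_{j=1}^{k} w_j = 1

% Log-likelihood of n sample points under the mixture.
\ell = \sum_{i=1}^{n} \log p(x_i)

% Stopping rule: stop once the change is below convergenceTol,
% or once t reaches maxIterations.
\left| \ell^{(t)} - \ell^{(t-1)} \right| < \mathrm{convergenceTol}
```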
Note: For high-dimensional data (with many features), this algorithm may perform poorly. High-dimensional data (a) is difficult to cluster at all (based on statistical/theoretical arguments) and (b) causes numerical issues with Gaussian distributions.
param: k The number of independent Gaussians in the mixture model
param: convergenceTol The maximum change in log-likelihood at which convergence is considered to have occurred.
param: maxIterations The maximum number of iterations to perform
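The iteration described above can be sketched in plain Java for the one-dimensional case. This is a simplified illustration under stated assumptions, not Spark's implementation: the class `GmmEmSketch`, its `Fit` holder, the initialization scheme, and the variance floor are all hypothetical; only the roles of `k`, `convergenceTol`, `maxIterations`, and the seed mirror the parameters documented here.

```java
import java.util.Random;

/** Illustrative 1-D EM for a Gaussian mixture. Not Spark's implementation. */
final class GmmEmSketch {

    /** Fitted parameters plus bookkeeping about the run (hypothetical holder). */
    static final class Fit {
        final double[] weights;
        final double[] means;
        final double[] variances;
        final double logLikelihood; // value computed in the last E-step
        final int iterations;

        Fit(double[] weights, double[] means, double[] variances,
            double logLikelihood, int iterations) {
            this.weights = weights;
            this.means = means;
            this.variances = variances;
            this.logLikelihood = logLikelihood;
            this.iterations = iterations;
        }
    }

    /** Density of a 1-D Gaussian with the given mean and variance at x. */
    static double gaussianPdf(double x, double mean, double variance) {
        double d = x - mean;
        return Math.exp(-d * d / (2.0 * variance)) / Math.sqrt(2.0 * Math.PI * variance);
    }

    static Fit fit(double[] x, int k, double convergenceTol, int maxIterations, long seed) {
        int n = x.length;
        Random rng = new Random(seed);
        double[] w = new double[k];
        double[] mu = new double[k];
        double[] var = new double[k];

        // Assumed initialization: equal weights, means drawn from the data,
        // and every component starting at the overall sample variance.
        double sampleMean = 0.0;
        for (double v : x) sampleMean += v / n;
        double sampleVar = 0.0;
        for (double v : x) sampleVar += (v - sampleMean) * (v - sampleMean) / n;
        for (int j = 0; j < k; j++) {
            w[j] = 1.0 / k;
            mu[j] = x[rng.nextInt(n)];
            var[j] = Math.max(sampleVar, 1e-6);
        }

        double[][] r = new double[n][k]; // responsibilities
        double prevLl = Double.NEGATIVE_INFINITY;
        double ll = prevLl;
        int iter = 0;
        while (iter < maxIterations) {
            // E-step: responsibilities and log-likelihood under current parameters.
            ll = 0.0;
            for (int i = 0; i < n; i++) {
                double total = 0.0;
                for (int j = 0; j < k; j++) {
                    r[i][j] = w[j] * gaussianPdf(x[i], mu[j], var[j]);
                    total += r[i][j];
                }
                for (int j = 0; j < k; j++) r[i][j] /= total;
                ll += Math.log(total);
            }
            // M-step: re-estimate weight, mean, and variance of each component.
            for (int j = 0; j < k; j++) {
                double nj = 0.0;
                double sum = 0.0;
                for (int i = 0; i < n; i++) {
                    nj += r[i][j];
                    sum += r[i][j] * x[i];
                }
                mu[j] = sum / nj;
                double s2 = 0.0;
                for (int i = 0; i < n; i++) {
                    double d = x[i] - mu[j];
                    s2 += r[i][j] * d * d;
                }
                var[j] = Math.max(s2 / nj, 1e-6); // floor avoids a zero variance
                w[j] = nj / n;
            }
            iter++;
            // Stop when the log-likelihood changed by less than convergenceTol.
            if (Math.abs(ll - prevLl) < convergenceTol) break;
            prevLl = ll;
        }
        return new Fit(w, mu, var, ll, iter);
    }

    public static void main(String[] args) {
        double[] data = {-5.1, -5.0, -4.9, 4.9, 5.0, 5.1};
        Fit result = fit(data, 2, 1e-6, 100, 42L);
        for (int j = 0; j < 2; j++) {
            System.out.printf("weight=%.3f mean=%.3f variance=%.4f%n",
                    result.weights[j], result.means[j], result.variances[j]);
        }
        System.out.println("iterations=" + result.iterations);
    }
}
```

Because the starting means depend on the seed, different seeds can lead to different local optima, which is one way to see why a global optimum is not guaranteed.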
Constructor Summary | |
---|---|
GaussianMixture() - Constructs a default instance. |
Method Summary | |
---|---|
double | getConvergenceTol() - Return the largest change in log-likelihood at which convergence is considered to have occurred. |
scala.Option<GaussianMixtureModel> | getInitialModel() - Return the user supplied initial GMM, if supplied |
int | getK() - Return the number of Gaussians in the mixture model |
int | getMaxIterations() - Return the maximum number of iterations to run |
long | getSeed() - Return the random seed |
GaussianMixtureModel | run(JavaRDD<Vector> data) - Java-friendly version of run() |
GaussianMixtureModel | run(RDD<Vector> data) - Perform expectation maximization |
GaussianMixture | setConvergenceTol(double convergenceTol) - Set the largest change in log-likelihood at which convergence is considered to have occurred. |
GaussianMixture | setInitialModel(GaussianMixtureModel model) - Set the initial GMM starting point, bypassing the random initialization. |
GaussianMixture | setK(int k) - Set the number of Gaussians in the mixture model. |
GaussianMixture | setMaxIterations(int maxIterations) - Set the maximum number of iterations to run. |
GaussianMixture | setSeed(long seed) - Set the random seed |
Methods inherited from class Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public GaussianMixture()
Method Detail |
---|
public GaussianMixture setInitialModel(GaussianMixtureModel model)
Parameters: model - (undocumented)
public scala.Option<GaussianMixtureModel> getInitialModel()
public GaussianMixture setK(int k)
public int getK()
public GaussianMixture setMaxIterations(int maxIterations)
public int getMaxIterations()
public GaussianMixture setConvergenceTol(double convergenceTol)
Parameters: convergenceTol - (undocumented)
public double getConvergenceTol()
public GaussianMixture setSeed(long seed)
public long getSeed()
public GaussianMixtureModel run(RDD<Vector> data)
public GaussianMixtureModel run(JavaRDD<Vector> data)
Java-friendly version of run()