pyspark.mllib.util.LinearDataGenerator
Utils for generating linear data.
New in version 1.5.0.
Methods
generateLinearInput(intercept, weights, …)
generateLinearRDD(sc, nexamples, nfeatures, eps)
Generate an RDD of LabeledPoints.
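As a usage illustration, the sketch below calls generateLinearRDD with the four arguments listed in the summary above. It assumes a working local Spark installation. The helper name preview_linear_rdd and the argument values are illustrative choices, not part of the API.

```python
def preview_linear_rdd(n_examples=100, n_features=3, eps=0.1):
    """Generate a small synthetic regression RDD and return its first rows.

    Hypothetical helper for illustration; needs a working Spark install.
    """
    # Imported inside the function so the definition loads without Spark.
    from pyspark import SparkContext
    from pyspark.mllib.util import LinearDataGenerator

    # Reuse the active SparkContext if there is one, otherwise start one.
    sc = SparkContext.getOrCreate()

    # Positional arguments follow the documented order:
    # sc, nexamples, nfeatures, eps.
    rdd = LinearDataGenerator.generateLinearRDD(sc, n_examples, n_features, eps)

    # Each element is a LabeledPoint exposing .label and .features.
    return rdd.take(5)
```

Calling preview_linear_rdd() on a machine with Spark configured should return up to five LabeledPoint objects, each carrying a label and a feature vector of length n_features.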
Methods Documentation
Parameters of generateLinearInput:
intercept: bias factor, the term c in X’w + c.
weights (pyspark.mllib.linalg.Vector): feature vector, the term w in X’w + c.
xMean: point around which the data X is centered.
xVariance: variance of the given data.
nPoints: number of points to be generated.
seed: random seed.
eps: scales the noise. The larger eps is, the more Gaussian noise is added.
Returns: a list of pyspark.mllib.regression.LabeledPoint objects, of length nPoints.
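To make the roles of the intercept c, the weights w, and eps concrete, here is a minimal pure-Python sketch of the model described above: each label is the dot product of the features with w, plus c, plus Gaussian noise scaled by eps. This is not Spark's implementation and will not reproduce its random numbers. The function name linear_input_sketch, the (label, features) tuple output, and the choice to draw features from a uniform distribution with the requested mean and variance are all assumptions made for illustration.

```python
import math
import random


def linear_input_sketch(intercept, weights, x_mean, x_variance,
                        n_points, seed, eps):
    """Return n_points (label, features) pairs with label = x.w + c + noise.

    Illustrative only: a hypothetical stand-in, not the Spark generator.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # ASSUMPTION: each feature is uniform, centered on its mean.
        # A uniform distribution of width sqrt(12 * var) has variance var.
        x = [(rng.random() - 0.5) * math.sqrt(12.0 * var) + mean
             for mean, var in zip(x_mean, x_variance)]
        # Standard Gaussian noise, scaled by eps.
        noise = eps * rng.gauss(0.0, 1.0)
        label = sum(w * xi for w, xi in zip(weights, x)) + intercept + noise
        points.append((label, x))
    return points


# With eps = 0 the labels lie exactly on the plane x.w + c.
sample = linear_input_sketch(1.0, [2.0, -1.0], [0.0, 0.0], [1.0, 1.0],
                             n_points=5, seed=42, eps=0.0)
for label, features in sample:
    print(label, features)
```

Raising eps spreads the labels further from the plane defined by the weights and intercept, while the features themselves are unaffected.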