Object org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm<LinearRegressionModel> org.apache.spark.mllib.regression.LinearRegressionWithSGD
public class LinearRegressionWithSGD
Train a linear regression model with no regularization using Stochastic Gradient Descent. This solves the least squares regression formulation f(weights) = 1/n ||A weights - y||^2 (which is the mean squared error). Here the data matrix A has n rows, and the input RDD holds the set of rows of A, each with its corresponding right hand side label y. See also the documentation for the precise formulation.
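To make the objective concrete, here is a minimal plain-Java sketch that evaluates f(weights) = 1/n ||A weights - y||^2 on in-memory arrays. It does not use Spark, and the names `LeastSquaresObjective` and `meanSquaredError` are hypothetical, chosen only for illustration.

```java
final class LeastSquaresObjective {
    /**
     * Returns (1/n) * sum over rows i of (a[i] . weights - y[i])^2,
     * where n is the number of rows of the data matrix A.
     */
    static double meanSquaredError(double[][] a, double[] y, double[] weights) {
        if (a.length == 0 || a.length != y.length) {
            throw new IllegalArgumentException("A and y must have the same non-zero number of rows");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            // Prediction for row i is the dot product of the row with the weights.
            double prediction = 0.0;
            for (int j = 0; j < weights.length; j++) {
                prediction += a[i][j] * weights[j];
            }
            double residual = prediction - y[i];
            sumOfSquares += residual * residual;
        }
        return sumOfSquares / a.length;
    }
}
```

A weight vector that reproduces every label exactly gives an objective of 0; any other weight vector gives a positive value.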
Constructor Summary | |
---|---|
LinearRegressionWithSGD() | Construct a LinearRegression object with default parameters: {stepSize: 1.0, numIterations: 100, miniBatchFraction: 1.0}. |
Method Summary | | |
---|---|---|
GradientDescent | optimizer() | The optimizer to solve the problem. |
static LinearRegressionModel | train(RDD<LabeledPoint> input, int numIterations) | Train a LinearRegression model given an RDD of (label, features) pairs. |
static LinearRegressionModel | train(RDD<LabeledPoint> input, int numIterations, double stepSize) | Train a LinearRegression model given an RDD of (label, features) pairs. |
static LinearRegressionModel | train(RDD<LabeledPoint> input, int numIterations, double stepSize, double miniBatchFraction) | Train a LinearRegression model given an RDD of (label, features) pairs. |
static LinearRegressionModel | train(RDD<LabeledPoint> input, int numIterations, double stepSize, double miniBatchFraction, Vector initialWeights) | Train a Linear Regression model given an RDD of (label, features) pairs. |
Methods inherited from class org.apache.spark.mllib.regression.GeneralizedLinearAlgorithm |
---|
getNumFeatures, isAddIntercept, run, run, setIntercept, setValidateData |
Methods inherited from class Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface org.apache.spark.Logging |
---|
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning |
Constructor Detail |
---|
public LinearRegressionWithSGD()
Method Detail |
---|
public static LinearRegressionModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double miniBatchFraction, Vector initialWeights)
Train a Linear Regression model given an RDD of (label, features) pairs. Each iteration uses a miniBatchFraction fraction of the data to calculate a stochastic gradient. The weights used in gradient descent are initialized using the initial weights provided.
input - RDD of (label, array of features) pairs. Each pair describes a row of the data matrix A as well as the corresponding right hand side label y.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
miniBatchFraction - Fraction of data to be used per iteration.
initialWeights - Initial set of weights to be used. The vector should be equal in size to the number of features in the data.
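The interaction of numIterations, stepSize, miniBatchFraction, and initialWeights can be sketched in plain Java without Spark. This is an illustration under stated assumptions, not Spark's implementation: it samples each row independently with probability miniBatchFraction, uses a constant step size, and takes a hypothetical `seed` argument for reproducibility. Spark's own optimizer may sample and scale steps differently. The class name `MiniBatchSgdSketch` is likewise hypothetical.

```java
import java.util.Random;

final class MiniBatchSgdSketch {
    /** Mini-batch SGD on the mean squared error; returns the final weights. */
    static double[] train(double[][] features, double[] labels, int numIterations,
                          double stepSize, double miniBatchFraction,
                          double[] initialWeights, long seed) {
        if (miniBatchFraction <= 0.0 || miniBatchFraction > 1.0) {
            throw new IllegalArgumentException("miniBatchFraction must be in (0, 1]");
        }
        int numFeatures = initialWeights.length;
        // Start from a copy so the caller's initial weights are left untouched.
        double[] weights = initialWeights.clone();
        Random random = new Random(seed);

        for (int iter = 0; iter < numIterations; iter++) {
            double[] gradient = new double[numFeatures];
            int sampled = 0;
            for (int i = 0; i < features.length; i++) {
                // Assumption: each row joins the mini-batch with probability miniBatchFraction.
                if (random.nextDouble() >= miniBatchFraction) {
                    continue;
                }
                double residual = -labels[i];
                for (int j = 0; j < numFeatures; j++) {
                    residual += features[i][j] * weights[j];
                }
                // Gradient of (a_i . w - y_i)^2 with respect to w is 2 * residual * a_i.
                for (int j = 0; j < numFeatures; j++) {
                    gradient[j] += 2.0 * residual * features[i][j];
                }
                sampled++;
            }
            if (sampled == 0) {
                continue; // empty mini-batch: skip this iteration
            }
            for (int j = 0; j < numFeatures; j++) {
                weights[j] -= stepSize * gradient[j] / sampled;
            }
        }
        return weights;
    }
}
```

With miniBatchFraction set to 1.0 every row is used in every iteration, so the procedure reduces to ordinary (full-batch) gradient descent.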
public static LinearRegressionModel train(RDD<LabeledPoint> input, int numIterations, double stepSize, double miniBatchFraction)
Train a LinearRegression model given an RDD of (label, features) pairs. Each iteration uses a miniBatchFraction fraction of the data to calculate a stochastic gradient.
input - RDD of (label, array of features) pairs. Each pair describes a row of the data matrix A as well as the corresponding right hand side label y.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
miniBatchFraction - Fraction of data to be used per iteration.
public static LinearRegressionModel train(RDD<LabeledPoint> input, int numIterations, double stepSize)
Train a LinearRegression model given an RDD of (label, features) pairs.
input - RDD of (label, array of features) pairs. Each pair describes a row of the data matrix A as well as the corresponding right hand side label y.
numIterations - Number of iterations of gradient descent to run.
stepSize - Step size to be used for each iteration of gradient descent.
public static LinearRegressionModel train(RDD<LabeledPoint> input, int numIterations)
Train a LinearRegression model given an RDD of (label, features) pairs.
input - RDD of (label, array of features) pairs. Each pair describes a row of the data matrix A as well as the corresponding right hand side label y.
numIterations - Number of iterations of gradient descent to run.
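The train overloads differ only in how many tuning parameters the caller supplies. One plausible reading, assuming the shorter overloads fall back to the values listed in the Constructor Summary (stepSize 1.0, miniBatchFraction 1.0), is the delegation pattern sketched below. This is an assumption for illustration, not a statement about Spark's source, and `TrainDefaultsSketch`, `Params`, and `resolve` are hypothetical names.

```java
final class TrainDefaultsSketch {
    // Assumed defaults, taken from the constructor's documented default parameters.
    static final double DEFAULT_STEP_SIZE = 1.0;
    static final double DEFAULT_MINI_BATCH_FRACTION = 1.0;

    /** Immutable holder for the fully resolved training parameters. */
    static final class Params {
        final int numIterations;
        final double stepSize;
        final double miniBatchFraction;

        Params(int numIterations, double stepSize, double miniBatchFraction) {
            this.numIterations = numIterations;
            this.stepSize = stepSize;
            this.miniBatchFraction = miniBatchFraction;
        }
    }

    /** Shortest form: only the iteration count is given. */
    static Params resolve(int numIterations) {
        return resolve(numIterations, DEFAULT_STEP_SIZE);
    }

    /** Step size given; the whole data set is used in each iteration. */
    static Params resolve(int numIterations, double stepSize) {
        return resolve(numIterations, stepSize, DEFAULT_MINI_BATCH_FRACTION);
    }

    /** Fullest form: every tuning parameter is explicit. */
    static Params resolve(int numIterations, double stepSize, double miniBatchFraction) {
        return new Params(numIterations, stepSize, miniBatchFraction);
    }
}
```

Each shorter form forwards to the next longer one, so the defaults live in a single place.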
public GradientDescent optimizer()
The optimizer to solve the problem.
Specified by: optimizer in class GeneralizedLinearAlgorithm<LinearRegressionModel>