glm {SparkR} | R Documentation |
Fits a generalized linear model, similarly to R's glm(). Also see the glmnet package.
glm(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(...), model = TRUE, method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)

## S4 method for signature 'formula,ANY,DataFrame'
glm(formula, family = c("gaussian", "binomial"), data, lambda = 0, alpha = 0, standardize = TRUE, solver = "auto")
formula |
A symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'. |
family |
Error distribution. "gaussian" -> linear regression, "binomial" -> logistic regression. |
data |
DataFrame for training |
lambda |
Regularization parameter |
alpha |
Elastic-net mixing parameter (see glmnet's documentation for details) |
standardize |
Whether to standardize features before training |
solver |
The solver algorithm used for optimization. Supported values are "l-bfgs", "normal", and "auto". "l-bfgs" denotes limited-memory BFGS, a limited-memory quasi-Newton optimization method. "normal" denotes using the normal equation as an analytical solution to the linear regression problem. The default value is "auto", which means that the solver algorithm is selected automatically. |
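To make the lambda, alpha, and solver arguments more concrete, here is a minimal base-R sketch of the ideas behind them. It is not SparkR or MLlib code and does not show how MLlib implements anything. The names `elastic_net_penalty`, `beta_normal`, and `beta_lm` are hypothetical, and the penalty is written in glmnet's parameterization, which is assumed here because the alpha entry defers to glmnet's documentation.

```r
# Elastic-net penalty in glmnet's parameterization (assumed, see lead-in):
#   lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
# alpha = 0 gives a pure ridge (L2) penalty, alpha = 1 a pure lasso (L1) penalty.
elastic_net_penalty <- function(beta, lambda = 0, alpha = 0) {
  l1 <- sum(abs(beta))
  l2 <- sum(beta^2)
  lambda * (alpha * l1 + (1 - alpha) / 2 * l2)
}

# Normal equation for unregularized least squares: solve (X'X) beta = X'y.
# This uses the local iris data frame, so the dotted column names apply.
X <- cbind(Intercept = 1, Sepal.Width = iris$Sepal.Width)
y <- iris$Sepal.Length
beta_normal <- drop(solve(crossprod(X), crossprod(X, y)))

# Reference fit from base R for comparison.
beta_lm <- coef(lm(Sepal.Length ~ Sepal.Width, data = iris))
```

With the default lambda = 0 the penalty is zero, so the fit is unregularized. In that case the analytical solution should agree with an ordinary least-squares fit up to numerical tolerance.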
a fitted MLlib model
## Not run:
##D sc <- sparkR.init()
##D sqlContext <- sparkRSQL.init(sc)
##D data(iris)
##D df <- createDataFrame(sqlContext, iris)
##D model <- glm(Sepal_Length ~ Sepal_Width, df, family="gaussian")
##D summary(model)
## End(Not run)
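A further sketch, in the same spirit as the example above, shows the "binomial" family together with the regularization arguments. It assumes a working Spark installation and the same `sc`/`sqlContext` setup. The values chosen for `lambda` and `alpha` and the variable names `training` and `predictions` are illustrative assumptions, not recommendations.

```r
library(SparkR)

# Assumes Spark is available locally; mirrors the setup in the example above.
sc <- sparkR.init()
sqlContext <- sparkRSQL.init(sc)

# createDataFrame replaces the dots in iris column names with underscores.
df <- createDataFrame(sqlContext, iris)

# Logistic regression needs a two-class label, so drop one of the three species.
training <- filter(df, df$Species != "setosa")

# Illustrative (assumed) settings: lambda = 0.1, alpha = 0.5.
model <- glm(Species ~ Sepal_Length + Sepal_Width, data = training,
             family = "binomial", lambda = 0.1, alpha = 0.5)

# Score the training data; the result is a DataFrame with a "prediction" column.
predictions <- predict(model, training)
head(select(predictions, "Species", "prediction"))
```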