IsotonicRegression
class pyspark.mllib.regression.IsotonicRegression
- Isotonic regression, currently implemented with a parallelized pool adjacent violators (PAV) algorithm. Only the univariate (single-feature) case is supported.
- New in version 1.4.0.
- Notes
- The sequential PAV implementation is based on Tibshirani, Ryan J., Holger Hoefling, and Robert Tibshirani (2011) [1].
- The parallelization of sequential PAV is based on Kearsley, Anthony J., Richard A. Tapia, and Michael W. Trosset (1996) [2].
- See also Isotonic regression (Wikipedia).
- [1] Tibshirani, Ryan J., Holger Hoefling, and Robert Tibshirani. “Nearly-isotonic regression.” Technometrics 53.1 (2011): 54-61. Available from http://www.stat.cmu.edu/~ryantibs/papers/neariso.pdf
- [2] Kearsley, Anthony J., Richard A. Tapia, and Michael W. Trosset. “An approach to parallelizing isotonic regression.” Applied Mathematics and Parallel Computing. Physica-Verlag HD, 1996. 141-147. Available from http://softlib.rice.edu/pub/CRPC-TRs/reports/CRPC-TR96640.pdf
- Methods
- train(data[, isotonic]): Train an isotonic regression model on the given data.
- Methods Documentation
classmethod train(data: pyspark.rdd.RDD[VectorLike], isotonic: bool = True) → pyspark.mllib.regression.IsotonicRegressionModel
- Train an isotonic regression model on the given data.
- New in version 1.4.0.
- Parameters
- data : pyspark.RDD
- RDD of (label, feature, weight) tuples.
- isotonic : bool, optional
- Whether to fit an isotonic (non-decreasing) model, which is the default, or an antitonic (non-increasing) one. (default: True)
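- Example (illustrative sketch)
- To make the input format and the isotonic flag concrete, the following is a minimal pure-Python sketch of a sequential, weighted PAV pass over (label, feature, weight) tuples. It is not Spark's implementation and does not parallelize anything. The helper name pav_fit, the requirement that weights be positive, and the choice to handle the antitonic case by negating labels are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# (label, feature, weight), the tuple layout described for `data`.
Row = Tuple[float, float, float]


def pav_fit(rows: Sequence[Row], isotonic: bool = True) -> List[Tuple[float, float]]:
    """Return (feature, fitted_label) pairs from a sequential, weighted PAV pass.

    Hypothetical helper for illustration; it is not part of pyspark.
    Assumes every weight is strictly positive.
    """
    # Negating labels turns an antitonic fit into an isotonic one.
    sign = 1.0 if isotonic else -1.0
    ordered = sorted(rows, key=lambda r: r[1])  # order points by feature

    # Each block stores [weighted label sum, weight sum, number of points].
    blocks: List[list] = []
    for label, _feature, weight in ordered:
        blocks.append([sign * label * weight, float(weight), 1])
        # Pool adjacent blocks while the earlier mean exceeds the later mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, w, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w
            blocks[-1][2] += n

    # Expand each block's weighted mean back to one value per input point.
    fitted: List[float] = []
    for s, w, n in blocks:
        fitted.extend([sign * s / w] * n)
    return [(row[1], y) for row, y in zip(ordered, fitted)]


rows = [(1, 0, 1), (2, 1, 1), (3, 2, 1), (1, 3, 1), (6, 4, 1), (17, 5, 1), (16, 6, 1)]
print([y for _, y in pav_fit(rows)])  # [1.0, 2.0, 2.0, 2.0, 6.0, 16.5, 16.5]
```

- In this sketch, the labels 3 and 1 (features 2 and 3) violate monotonicity and are pooled to their mean of 2.0, as are 17 and 16, which become 16.5. Passing isotonic=False on the same rows pools everything into a single block, because the labels trend upward.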
 