public class BinaryClassificationMetrics
extends Object
implements org.apache.spark.internal.Logging
 param:  scoreAndLabels an RDD of (score, label) or (score, label, weight) tuples.
 param:  numBins if greater than 0, then the curves (ROC curve, PR curve) computed internally
                will be down-sampled to this many "bins". If 0, no down-sampling will occur.
                This is useful because the curve contains a point for each distinct score
                in the input, and this could be as large as the input itself -- millions of
                points or more, when thousands may be entirely sufficient to summarize
                the curve. After down-sampling, the curves will be made of approximately
                numBins points instead. Points are made from bins of equal numbers of
                consecutive points. The size of each bin is
                floor(scoreAndLabels.count() / numBins), which means the resulting number
                of bins may not exactly equal numBins. The last bin in each partition may
                be smaller as a result, meaning there may be an extra sample at
                partition boundaries.
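
The bin-size arithmetic described for numBins can be illustrated with a small sketch. This is not Spark's code: the class and method names are hypothetical, it assumes a single partition, and it treats a bin size of 0 or 1 as "no down-sampling", which the text above does not specify.

```java
// Hypothetical helper, not part of Spark: illustrates the numBins arithmetic only.
final class BinSizingSketch {

    /** Bin size as described above: floor(count / numBins). Requires numBins > 0. */
    static long binSize(long count, int numBins) {
        return count / numBins; // integer division floors for non-negative operands
    }

    /** Approximate number of curve points left in one partition after down-sampling. */
    static long pointsAfterDownSampling(long count, int numBins) {
        if (numBins <= 0) {
            return count; // 0 means no down-sampling
        }
        long size = binSize(count, numBins);
        if (size <= 1) {
            return count; // assumption of this sketch: such bins cannot shrink the curve
        }
        // Ceiling division: a smaller trailing bin still contributes one point.
        return (count + size - 1) / size;
    }

    public static void main(String[] args) {
        System.out.println(binSize(1000, 300));                 // prints 3
        System.out.println(pointsAfterDownSampling(1000, 300)); // prints 334
    }
}
```

With 1000 points and numBins = 300, each bin holds 3 points, so the curve ends up with 334 points rather than exactly 300. With several partitions, each partition may add its own smaller trailing bin.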
| Constructor and Description | 
|---|
| BinaryClassificationMetrics(RDD<? extends scala.Product> scoreAndLabels, int numBins) |
| BinaryClassificationMetrics(RDD<scala.Tuple2<Object,Object>> scoreAndLabels) Defaults numBins to 0. |
| Modifier and Type | Method and Description | 
|---|---|
| double | areaUnderPR() Computes the area under the precision-recall curve. |
| double | areaUnderROC() Computes the area under the receiver operating characteristic (ROC) curve. |
| RDD<scala.Tuple2<Object,Object>> | fMeasureByThreshold() Returns the (threshold, F-Measure) curve with beta = 1.0. |
| RDD<scala.Tuple2<Object,Object>> | fMeasureByThreshold(double beta) Returns the (threshold, F-Measure) curve. |
| int | numBins() |
| RDD<scala.Tuple2<Object,Object>> | pr() Returns the precision-recall curve, which is an RDD of (recall, precision), NOT (precision, recall), with (0.0, p) prepended to it, where p is the precision associated with the lowest recall on the curve. |
| RDD<scala.Tuple2<Object,Object>> | precisionByThreshold() Returns the (threshold, precision) curve. |
| RDD<scala.Tuple2<Object,Object>> | recallByThreshold() Returns the (threshold, recall) curve. |
| RDD<scala.Tuple2<Object,Object>> | roc() Returns the receiver operating characteristic (ROC) curve, which is an RDD of (false positive rate, true positive rate) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it. |
| RDD<? extends scala.Product> | scoreAndLabels() |
| RDD<scala.Tuple2<Object,scala.Tuple2<Object,Object>>> | scoreLabelsWeight() Deprecated. The variable `scoreLabelsWeight` should be private and will be removed in 4.0.0. Since 3.4.0. |
| RDD<Object> | thresholds() Returns thresholds in descending order. |
| void | unpersist() Unpersist intermediate RDDs used in the computation. |
Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize
public BinaryClassificationMetrics(RDD<? extends scala.Product> scoreAndLabels, int numBins)
public BinaryClassificationMetrics(RDD<scala.Tuple2<Object,Object>> scoreAndLabels)
Defaults numBins to 0.
Parameters: scoreAndLabels - (undocumented)
public RDD<? extends scala.Product> scoreAndLabels()
public int numBins()
public RDD<scala.Tuple2<Object,scala.Tuple2<Object,Object>>> scoreLabelsWeight()
public void unpersist()
public RDD<Object> thresholds()
public RDD<scala.Tuple2<Object,Object>> roc()
public double areaUnderROC()
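
To make the relationship between roc() and areaUnderROC() concrete, here is a minimal plain-Java sketch that adds the (0.0, 0.0) and (1.0, 1.0) end points and integrates the curve. The integration rule is an assumption: this page does not say how the area is computed, so the sketch uses the trapezoidal rule. All names are hypothetical and no Spark types are involved.

```java
// Hypothetical illustration, not Spark's implementation.
final class RocAreaSketch {

    /** Adds the (0.0, 0.0) and (1.0, 1.0) end points described for roc(). */
    static double[][] withEndPoints(double[][] curve) {
        double[][] out = new double[curve.length + 2][];
        out[0] = new double[] {0.0, 0.0};
        System.arraycopy(curve, 0, out, 1, curve.length);
        out[out.length - 1] = new double[] {1.0, 1.0};
        return out;
    }

    /** Trapezoidal area under {x, y} points ordered by non-decreasing x. */
    static double trapezoidArea(double[][] points) {
        double area = 0.0;
        for (int i = 1; i < points.length; i++) {
            double width = points[i][0] - points[i - 1][0];
            double meanHeight = (points[i][1] + points[i - 1][1]) / 2.0;
            area += width * meanHeight;
        }
        return area;
    }

    public static void main(String[] args) {
        // Each point is {false positive rate, true positive rate}.
        double[][] interior = { {0.0, 0.5}, {0.5, 1.0} };
        System.out.println(trapezoidArea(withEndPoints(interior))); // prints 0.875
    }
}
```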
public RDD<scala.Tuple2<Object,Object>> pr()
public double areaUnderPR()
public RDD<scala.Tuple2<Object,Object>> fMeasureByThreshold(double beta)
Parameters: beta - the beta factor in F-Measure computation.
public RDD<scala.Tuple2<Object,Object>> fMeasureByThreshold()
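
The role of beta can be shown with the standard F-measure formula, (1 + beta^2) * precision * recall / (beta^2 * precision + recall). The sketch below applies that textbook definition to a single (precision, recall) pair. The class name is hypothetical, and returning 0.0 when the denominator is zero is a convention chosen for this sketch, not something this page specifies.

```java
// Hypothetical illustration of the F-measure formula; not Spark's code.
final class FMeasureSketch {

    /** Standard F-beta score for one precision/recall pair. */
    static double fMeasure(double precision, double recall, double beta) {
        double betaSquared = beta * beta;
        double denominator = betaSquared * precision + recall;
        if (denominator == 0.0) {
            return 0.0; // convention of this sketch when precision and recall are both 0
        }
        return (1.0 + betaSquared) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        // beta = 1.0 weighs precision and recall equally (the no-argument overload's default).
        System.out.println(fMeasure(0.5, 1.0, 1.0));
        // beta = 2.0 gives recall more weight, so the score rises for this pair.
        System.out.println(fMeasure(0.5, 1.0, 2.0));
    }
}
```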
public RDD<scala.Tuple2<Object,Object>> precisionByThreshold()
public RDD<scala.Tuple2<Object,Object>> recallByThreshold()
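
As a rough illustration of what the (threshold, precision) and (threshold, recall) curves contain, the sketch below builds both from plain arrays, using each distinct score as a threshold in descending order. It makes several assumptions not stated on this page: an example counts as predicted positive when its score is greater than or equal to the threshold, labels above 0.5 are positive, weights are ignored, and no down-sampling is applied. All names are hypothetical.

```java
// Hypothetical illustration in plain Java; not Spark's distributed implementation.
final class ThresholdCurveSketch {

    /** Returns rows of {threshold, precision, recall}, thresholds in descending order. */
    static double[][] curves(double[] scores, double[] labels) {
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort example indices by score, highest first.
        java.util.Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));

        double totalPositives = 0.0;
        for (double label : labels) {
            if (label > 0.5) {
                totalPositives++;
            }
        }

        java.util.List<double[]> rows = new java.util.ArrayList<>();
        double truePositives = 0.0;
        double falsePositives = 0.0;
        for (int k = 0; k < order.length; k++) {
            int i = order[k];
            if (labels[i] > 0.5) {
                truePositives++;
            } else {
                falsePositives++;
            }
            // Emit one row per distinct score, after all examples sharing it are counted.
            boolean lastOfScore = k + 1 == order.length || scores[order[k + 1]] != scores[i];
            if (lastOfScore) {
                double precision = truePositives / (truePositives + falsePositives);
                double recall = totalPositives == 0.0 ? 0.0 : truePositives / totalPositives;
                rows.add(new double[] {scores[i], precision, recall});
            }
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.8, 0.3};
        double[] labels = {1.0, 1.0, 0.0, 0.0};
        for (double[] row : curves(scores, labels)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Lowering the threshold can only add predicted positives, so recall never decreases down the rows, while precision may move in either direction.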