Class RankingMetrics<T>

Object
  org.apache.spark.mllib.evaluation.RankingMetrics<T>
All Implemented Interfaces:
Serializable, org.apache.spark.internal.Logging, scala.Serializable

public class RankingMetrics<T> extends Object implements org.apache.spark.internal.Logging, scala.Serializable
Evaluator for ranking algorithms.

Java users should use RankingMetrics$.of to create a RankingMetrics instance.

param: predictionAndLabels an RDD of (predicted ranking, ground truth set) pairs or (predicted ranking, ground truth set, relevance values of the ground truth set) triples. Since 3.4.0, NDCG evaluation with relevance values is supported.
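As a minimal sketch of the relevance-valued form (Spark 3.4.0 and later), written in Java and assuming an existing JavaSparkContext named jsc and Integer item IDs (both illustrative assumptions), each record is a Tuple3 of predicted ranking, ground truth set, and the relevance of each ground truth item:

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.mllib.evaluation.RankingMetrics;
    import scala.Tuple3;

    // One query: predicted ranking, ground truth set, relevance of each ground truth item.
    JavaRDD<Tuple3<List<Integer>, List<Integer>, List<Double>>> predictionAndLabels =
        jsc.parallelize(Arrays.asList(
            new Tuple3<>(Arrays.asList(1, 6, 2), Arrays.asList(1, 2, 3),
                Arrays.asList(3.0, 2.0, 1.0))));
    RankingMetrics<Integer> metrics = RankingMetrics.of(predictionAndLabels);
    double ndcg = metrics.ndcgAt(3); // graded NDCG using the supplied relevance values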

  • Nested Class Summary

    Nested classes/interfaces inherited from interface org.apache.spark.internal.Logging

    org.apache.spark.internal.Logging.SparkShellLoggingFilter
  • Constructor Summary

    Constructors
    Constructor and Description
    RankingMetrics(RDD<? extends scala.Product> predictionAndLabels, scala.reflect.ClassTag<T> evidence$1)
  • Method Summary

    Modifier and Type    Method and Description
    double    meanAveragePrecision()
     Returns the mean average precision (MAP) of all the queries.
    double    meanAveragePrecisionAt(int k)
     Returns the mean average precision (MAP) at ranking position k of all the queries.
    double    ndcgAt(int k)
     Compute the average NDCG value of all the queries, truncated at ranking position k.
    static <E, T extends Iterable<E>, A extends Iterable<Object>> RankingMetrics<E>    of(JavaRDD<? extends scala.Product> predictionAndLabels)
     Creates a RankingMetrics instance (for Java users).
    double    precisionAt(int k)
     Compute the average precision of all the queries, truncated at ranking position k.
    double    recallAt(int k)
     Compute the average recall of all the queries, truncated at ranking position k.

    Methods inherited from class java.lang.Object

    equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.apache.spark.internal.Logging

    initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log_, org$apache$spark$internal$Logging$$log__$eq
  • Constructor Details

    • RankingMetrics

      public RankingMetrics(RDD<? extends scala.Product> predictionAndLabels, scala.reflect.ClassTag<T> evidence$1)
  • Method Details

    • of

      public static <E, T extends Iterable<E>, A extends Iterable<Object>> RankingMetrics<E> of(JavaRDD<? extends scala.Product> predictionAndLabels)
      Creates a RankingMetrics instance (for Java users).
      Parameters:
      predictionAndLabels - a JavaRDD of (predicted ranking, ground truth set) pairs or (predicted ranking, ground truth set, relevance values of the ground truth set) triples. Since 3.4.0, NDCG evaluation with relevance values is supported.
      Returns:
      a new RankingMetrics instance
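      A minimal end-to-end sketch for a Java caller of this method; the sample data and the JavaSparkContext variable jsc are illustrative assumptions:

        import java.util.Arrays;
        import java.util.List;
        import org.apache.spark.api.java.JavaRDD;
        import org.apache.spark.mllib.evaluation.RankingMetrics;
        import scala.Tuple2;

        // Each record pairs one query's predicted ranking with its ground truth set.
        JavaRDD<Tuple2<List<Integer>, List<Integer>>> predictionAndLabels =
            jsc.parallelize(Arrays.asList(
                new Tuple2<>(Arrays.asList(1, 6, 2, 7, 8), Arrays.asList(1, 2, 3, 4, 5)),
                new Tuple2<>(Arrays.asList(4, 1, 5, 6, 2), Arrays.asList(1, 2, 3))));

        RankingMetrics<Integer> metrics = RankingMetrics.of(predictionAndLabels);
        double precision = metrics.precisionAt(5);
        double recall = metrics.recallAt(5);
        double map = metrics.meanAveragePrecision();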
    • precisionAt

      public double precisionAt(int k)
      Compute the average precision of all the queries, truncated at ranking position k.

      If, for a query, the ranking algorithm returns n (n < k) results, the precision value will be computed as #(relevant items retrieved) / k. This formula also applies when the size of the ground truth set is less than k.

      If a query has an empty ground truth set, zero will be used as precision together with a log warning.

      See the following paper for detail:

      IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen

      Parameters:
      k - the position to compute the truncated precision, must be positive
      Returns:
      the average precision at the first k ranking positions
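      For example (an illustrative query, not from the Spark documentation): with ground truth set {a, b, c} and only n = 2 returned results [a, d] for k = 5, #(relevant items retrieved) = 1 and the precision at 5 is 1 / 5 = 0.2; the denominator stays at k even though fewer than k results were returned.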
    • meanAveragePrecision

      public double meanAveragePrecision()
      Returns the mean average precision (MAP) of all the queries. If a query has an empty ground truth set, the average precision will be zero and a log warning is generated.
    • meanAveragePrecisionAt

      public double meanAveragePrecisionAt(int k)
      Returns the mean average precision (MAP) at ranking position k of all the queries. If a query has an empty ground truth set, the average precision will be zero and a log warning is generated.
      Parameters:
      k - the position to compute the truncated precision, must be positive
      Returns:
      the mean average precision at the first k ranking positions
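      As an illustrative sketch of the truncated average precision for a single query (this uses the standard truncated-AP definition; the exact normalization used by the implementation is an assumption here): for ground truth {a, b} and predicted ranking [a, c, b] with k = 3, relevant hits occur at positions 1 and 3 with precisions 1/1 and 2/3, giving AP@3 = (1 + 2/3) / min(2, 3) = 5/6 ≈ 0.83. meanAveragePrecisionAt(k) averages this quantity over all queries.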
    • ndcgAt

      public double ndcgAt(int k)
      Compute the average NDCG value of all the queries, truncated at ranking position k. The discounted cumulative gain (DCG) at position k is computed as sum_{i=1}^{k} (2^{relevance of the i-th item} - 1) / log(i + 1), and the NDCG is obtained by dividing the DCG value by the ideal DCG computed on the ground truth set. In the current implementation, binary relevance (1 for items in the ground truth set, 0 otherwise) is assumed when no relevance values are supplied.

      If the relevance value is not empty but its size doesn't match the ground truth set size, a log warning is generated.

      If a query has an empty ground truth set, zero will be used as NDCG together with a log warning.

      See the following paper for detail:

      IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen

      Parameters:
      k - the position to compute the truncated NDCG, must be positive
      Returns:
      the average NDCG at the first k ranking positions
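      For example (an illustrative query with binary relevance, not from the Spark documentation): with ground truth set {a, c} and predicted ranking [a, b, c], the relevant items at positions 1 and 3 contribute (2^1 - 1) / log(2) and (2^1 - 1) / log(4), so with base-2 logarithms DCG@3 = 1 + 0.5 = 1.5; the ideal ranking places both relevant items first, giving an ideal DCG of 1 + 1/log2(3) ≈ 1.63, hence NDCG@3 ≈ 1.5 / 1.63 ≈ 0.92 (the logarithm base cancels in the ratio).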
    • recallAt

      public double recallAt(int k)
      Compute the average recall of all the queries, truncated at ranking position k.

      If, for a query, the ranking algorithm returns n results, the recall value will be computed as #(relevant items retrieved) / #(ground truth set). This formula also applies when the size of the ground truth set is less than k.

      If a query has an empty ground truth set, zero will be used as recall together with a log warning.

      See the following paper for detail:

      IR evaluation methods for retrieving highly relevant documents. K. Jarvelin and J. Kekalainen

      Parameters:
      k - the position to compute the truncated recall, must be positive
      Returns:
      the average recall at the first k ranking positions
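      For example (an illustrative query, not from the Spark documentation): with ground truth set {a, b, c, d} and returned results [a, e, c] for k = 3, #(relevant items retrieved) = 2 and the recall at 3 is 2 / 4 = 0.5; the denominator is always the ground truth set size, regardless of k.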