pyspark.RDD.stats

RDD.stats() → pyspark.statcounter.StatCounter

Return a StatCounter object that captures the mean, variance and count of the RDD’s elements in a single pass over the data.

New in version 0.9.1.

Returns
StatCounter

a StatCounter capturing the mean, variance and count of all elements
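
Examples

The sketch below illustrates how mean, variance and count can be gathered in one pass and then combined across partitions. It is a pure-Python illustration, not PySpark's implementation: the `RunningStats` class, the `partition_stats` helper and the sample data are hypothetical names introduced here, and the update rules are the standard Welford and pairwise-merge formulas, assumed to be representative of this kind of accumulator.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class RunningStats:
    """Hypothetical stand-in for a StatCounter-like accumulator."""
    n: int = 0        # number of elements seen
    mu: float = 0.0   # running mean
    m2: float = 0.0   # running sum of squared deviations from the mean

    def add(self, x):
        # Welford's update: fold one value in without storing the data.
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)
        return self

    def merge(self, other):
        # Combine two partial results (pairwise merge of count, mean, m2).
        if other.n == 0:
            return self
        if self.n == 0:
            self.n, self.mu, self.m2 = other.n, other.mu, other.m2
            return self
        delta = other.mu - self.mu
        total = self.n + other.n
        self.mu += delta * other.n / total
        self.m2 += other.m2 + delta * delta * self.n * other.n / total
        self.n = total
        return self

    def variance(self):
        # Population variance (divides by n, not n - 1).
        return self.m2 / self.n if self.n else float("nan")


def partition_stats(values):
    # Summarize one partition's values in a single pass.
    acc = RunningStats()
    for v in values:
        acc.add(v)
    return acc


# Two hypothetical partitions of the same dataset.
partitions = [[1.0, 2.0], [3.0, 4.0, 5.0]]
combined = reduce(
    lambda a, b: a.merge(b),
    (partition_stats(p) for p in partitions),
    RunningStats(),
)
print(combined.n, combined.mu, combined.variance())
```

On a real RDD, the corresponding values are read from the returned object with `rdd.stats().count()`, `rdd.stats().mean()` and `rdd.stats().variance()`; call `stats()` once and reuse the result to avoid recomputing.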