pyspark.RDD.sampleStdev

RDD.sampleStdev() → float

Compute the sample standard deviation of this RDD’s elements. Unlike RDD.stdev(), which divides the sum of squared deviations by N, this method divides by N-1, which corrects for bias in the underlying variance estimate.

New in version 0.9.1.

Returns

float: the sample standard deviation of all elements

See also

RDD.stats()
RDD.stdev()
RDD.variance()
RDD.sampleVariance()

Examples

>>> sc.parallelize([1, 2, 3]).sampleStdev()
1.0
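
The effect of dividing by N-1 instead of N can be illustrated without a Spark context. The following is a minimal plain-Python sketch of the two formulas, not Spark's actual implementation; the helper names `sample_stdev` and `population_stdev` are hypothetical and chosen only for this illustration.

```python
import math


def sample_stdev(values):
    """Sample standard deviation: divide squared deviations by N-1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


def population_stdev(values):
    """Population standard deviation: divide squared deviations by N."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / n)


data = [1, 2, 3]
print(sample_stdev(data))      # 1.0, matching the sampleStdev() example above
print(population_stdev(data))  # approximately 0.8165
```

For the same input, the sample value is always at least as large as the population value, and the gap shrinks as N grows. Note that the N-1 formula is undefined for a single element, since it would divide by zero.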
