RDD.countByKey()
Count the number of elements for each key, and return the result to the master as a dictionary.
New in version 0.7.0.
Returns
a dictionary of (key, count) pairs
See also
RDD.collectAsMap()
RDD.countByValue()
Examples
>>> rdd = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> sorted(rdd.countByKey().items())
[('a', 2), ('b', 1)]
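
The counting semantics can also be illustrated without a Spark context. The following is a minimal plain-Python sketch, not PySpark's implementation: `count_by_key_local` is a hypothetical helper introduced here only for illustration, and it assumes the input is an ordinary iterable of (key, value) pairs. It shows that only the keys are counted and the values are ignored.

```python
from collections import Counter


def count_by_key_local(pairs):
    """Hypothetical local stand-in that mimics what countByKey returns."""
    # Only the first element of each pair (the key) is counted;
    # the second element (the value) plays no role in the result.
    return dict(Counter(key for key, _ in pairs))


pairs = [("a", 1), ("b", 1), ("a", 1)]
counts = count_by_key_local(pairs)
print(sorted(counts.items()))  # [('a', 2), ('b', 1)]
```

Because the real method returns the whole dictionary to a single process, it is presumably best suited to data with a modest number of distinct keys.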