pyspark.streaming.DStream.partitionBy¶

DStream.partitionBy(numPartitions: int, partitionFunc: Callable[[K], int] = <function portable_hash>) → pyspark.streaming.dstream.DStream[Tuple[K, V]][source]¶: Return a copy of the DStream in which each RDD are partitioned using the specified partitioner.

pyspark.streaming.DStream.mapValues

pyspark.streaming.DStream.persist