pyspark.RDD.union

RDD.union(other: pyspark.rdd.RDD[U]) → pyspark.rdd.RDD[Union[T, U]]

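Return the union of this RDD and another one. Duplicate elements are kept, so the result is a concatenation rather than a set union; call distinct() on the result to remove duplicates.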

New in version 0.7.0.

Parameters
other : RDD

another RDD

Returns
RDD

a new RDD containing the elements of this RDD followed by the elements of the other

Examples

>>> rdd = sc.parallelize([1, 1, 2, 3])
>>> rdd.union(rdd).collect()
[1, 1, 2, 3, 1, 1, 2, 3]