pyspark.SparkContext.range

SparkContext.range(start: int, end: Optional[int] = None, step: int = 1, numSlices: Optional[int] = None) → pyspark.rdd.RDD[int]

Create a new RDD of int containing elements from start to end (exclusive), incremented by step. It can be called in the same way as Python’s built-in range() function. If called with a single argument, the argument is interpreted as end, and start is set to 0.

New in version 1.5.0.

Parameters
start : int

the start value

end : int, optional

the end value (exclusive)

step : int, optional, default 1

the incremental step

numSlices : int, optional

the number of partitions of the new RDD

Returns
RDD

An RDD of int

Examples

>>> sc.range(5).collect()
[0, 1, 2, 3, 4]
>>> sc.range(2, 4).collect()
[2, 3]
>>> sc.range(1, 7, 2).collect()
[1, 3, 5]
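
Because the semantics mirror Python’s built-in range(), the collected elements match the equivalent list for the same arguments:

>>> sc.range(3).collect() == list(range(3))
True
>>> sc.range(0, 10, 3).collect() == list(range(0, 10, 3))
True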

Generate an RDD with a negative step

>>> sc.range(5, 0, -1).collect()
[5, 4, 3, 2, 1]
>>> sc.range(0, 5, -1).collect()
[]
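
The returned RDD supports the usual RDD actions, for example summing the elements:

>>> sc.range(1, 101).sum()
5050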

Control the number of partitions

>>> sc.range(5, numSlices=1).getNumPartitions()
1
>>> sc.range(5, numSlices=10).getNumPartitions()
10
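
If numSlices is omitted, the number of partitions typically follows the context’s default parallelism (a behavior inherited from parallelize; shown here as an assumption, since it can vary with the deployment):

>>> sc.range(5).getNumPartitions() == sc.defaultParallelism
True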