pyspark.sql.DataFrame.limit

DataFrame.limit(num: int) → pyspark.sql.dataframe.DataFrame[source]

Limits the result count to the number specified.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
numint

Number of records to return. Will return this number of records or all records if the DataFrame contains less than this number of records.

Returns
DataFrame

Subset of the records

Examples

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df.limit(1).show()
+---+----+
|age|name|
+---+----+
| 14| Tom|
+---+----+
>>> df.limit(0).show()
+---+----+
|age|name|
+---+----+
+---+----+