pyspark.sql.DataFrame.take

DataFrame.take(num: int) → List[pyspark.sql.types.Row]

Returns the first num rows as a list of Row.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
num : int

Number of records to return. If the DataFrame contains fewer than this number of records, all records are returned.

Returns
list

List of rows

Examples

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])

Return the first 2 rows of the DataFrame.

>>> df.take(2)
[Row(age=14, name='Tom'), Row(age=23, name='Alice')]