pyspark.sql.functions.array_agg

pyspark.sql.functions.array_agg(col: ColumnOrName) → pyspark.sql.column.Column

Aggregate function: returns a list of the values in the column, with duplicates preserved.

New in version 3.5.0.

Parameters
col : Column or str

target column to compute on.

Returns
Column

list of the aggregated values, with duplicates preserved.

Examples

>>> from pyspark.sql.functions import array_agg
>>> df = spark.createDataFrame([[1],[1],[2]], ["c"])
>>> df.agg(array_agg('c').alias('r')).collect()
[Row(r=[1, 1, 2])]