DataFrame.filter
Filters rows using the given condition.
where() is an alias for filter().
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
condition : Column or str
    A Column of types.BooleanType, or a string containing a SQL expression.
Returns
DataFrame
    Filtered DataFrame.
Examples
>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
Filter by Column instances.
>>> df.filter(df.age > 3).show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
>>> df.where(df.age == 2).show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+
Filter by SQL expression in a string.
>>> df.filter("age > 3").show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
>>> df.where("age = 2").show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+