pyspark.sql.functions.regexp_count

pyspark.sql.functions.regexp_count(str: ColumnOrName, regexp: ColumnOrName) → pyspark.sql.column.Column

Returns the number of times that the Java regex pattern regexp matches in the string str.

New in version 3.5.0.

Parameters
str : Column or str

target column to work on.

regexp : Column or str

regex pattern to apply.

Returns
Column

the number of times that a Java regex pattern is matched in the string.

Examples

>>> from pyspark.sql.functions import col, lit, regexp_count
>>> df = spark.createDataFrame([("1a 2b 14m", r"\d+")], ["str", "regexp"])
>>> df.select(regexp_count('str', lit(r'\d+')).alias('d')).collect()
[Row(d=3)]
>>> df.select(regexp_count('str', lit(r'mmm')).alias('d')).collect()
[Row(d=0)]
>>> df.select(regexp_count("str", col("regexp")).alias('d')).collect()
[Row(d=3)]
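
The counting behaviour in the examples above can be approximated in plain Python, without a Spark session. This is only a rough sketch: the helper name regexp_count_sketch is hypothetical, it assumes matches are found left to right without overlapping (as a loop over Java's Matcher.find() would), and it uses Python's re module, whose syntax is not identical to Java regex but agrees for simple patterns such as these.

```python
import re


def regexp_count_sketch(text, pattern):
    """Hypothetical helper (not part of PySpark): count matches of pattern in text.

    Assumption: matches are non-overlapping and scanned left to right.
    Python's re dialect stands in for Java regex here.
    """
    # finditer yields one match object per non-overlapping match,
    # so counting the iterator gives the number of matches.
    return sum(1 for _ in re.finditer(pattern, text))


# Same inputs as the doctest examples above.
digits = regexp_count_sketch("1a 2b 14m", r"\d+")
no_match = regexp_count_sketch("1a 2b 14m", r"mmm")

# Non-overlapping scan: "aa" is found at positions 0 and 2 only.
overlap = regexp_count_sketch("aaaa", r"aa")

print(digits, no_match, overlap)
```

Counting with finditer rather than len(re.findall(...)) avoids materialising a list of every match. For patterns that rely on Java-specific syntax, the sketch may not agree with regexp_count.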