pyspark.sql.Column.substr

Column.substr(startPos: Union[int, Column], length: Union[int, Column]) → pyspark.sql.column.Column

Return a Column which is a substring of the column.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
startPos : Column or int

start position of the substring, counted from 1

length : Column or int

length of the substring

Returns
Column

Column containing, for each element of the original Column, the requested substring.

Examples

>>> df = spark.createDataFrame(
...      [(2, "Alice"), (5, "Bob")], ["age", "name"])
>>> df.select(df.name.substr(1, 3).alias("col")).collect()
[Row(col='Ali'), Row(col='Bob')]