pyspark.sql.functions.regexp_instr
pyspark.sql.functions.regexp_instr(str: ColumnOrName, regexp: ColumnOrName, idx: Union[int, pyspark.sql.column.Column, None] = None) → pyspark.sql.column.Column
Searches str for the Java regex regexp and returns the position of the first matching substring, with idx as the regex group index. Positions are 1-based; if no match is found, the result is 0.
New in version 3.5.0.
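Parameters
str : Column or str
    the string column to search.
regexp : Column or str
    the Java regex pattern to match.
idx : int or Column, optional
    the regex group index.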
Returns
Column
the 1-based position of the first substring in str that matches the Java regex, or 0 if there is no match.
Examples
>>> from pyspark.sql.functions import col, lit, regexp_instr
>>> df = spark.createDataFrame([("1a 2b 14m", r"\d+(a|b|m)")], ["str", "regexp"])
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)')).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)'), 1).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', lit(r'\d+(a|b|m)'), 2).alias('d')).collect()
[Row(d=1)]
>>> df.select(regexp_instr('str', col("regexp")).alias('d')).collect()
[Row(d=1)]
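Every example above returns 1 because the first match ("1a") starts at the first character. The following plain-Python sketch may help illustrate the return value for other inputs. It is not part of PySpark: `regexp_instr_py` is a hypothetical helper built on Python's `re` module, whose syntax differs from Java regex in places, and it assumes null inputs yield null (`None`). It ignores the group index entirely.

```python
import re
from typing import Optional


def regexp_instr_py(s: Optional[str], pattern: Optional[str]) -> Optional[int]:
    """Hypothetical stand-in for regexp_instr on a single value.

    Returns the 1-based start position of the first match,
    0 when nothing matches, and None when either input is None
    (an assumption mirroring usual SQL null propagation).
    """
    if s is None or pattern is None:
        return None
    match = re.search(pattern, s)
    # match.start() is 0-based, so shift by one for a 1-based position.
    return match.start() + 1 if match else 0


print(regexp_instr_py("1a 2b 14m", r"\d+(a|b|m)"))  # 1: match starts at the first character
print(regexp_instr_py("xx 2b", r"\d+(a|b|m)"))      # 4: "2b" starts at the fourth character
print(regexp_instr_py("no digits", r"\d+(a|b|m)"))  # 0: no match
```

A result of 0 therefore signals "no match" rather than a position, which is worth keeping in mind when filtering on the output.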