pyspark.sql.functions.parse_url

pyspark.sql.functions.parse_url(url: ColumnOrName, partToExtract: ColumnOrName, key: Optional[ColumnOrName] = None) → pyspark.sql.column.Column

Extracts a part from a URL.

New in version 3.5.0.

Parameters
url : Column or str

A string column containing the URL to parse.

partToExtract : Column or str

A string column naming the part of the URL to extract, for example HOST, PATH, or QUERY.

key : Column or str, optional

A string column naming the query parameter whose value is returned. It applies when the part to extract is QUERY.

Examples

>>> from pyspark.sql.functions import parse_url
>>> df = spark.createDataFrame(
...     [("http://spark.apache.org/path?query=1", "QUERY", "query",)],
...     ["a", "b", "c"]
... )
>>> df.select(parse_url(df.a, df.b, df.c).alias('r')).collect()
[Row(r='1')]
>>> df.select(parse_url(df.a, df.b).alias('r')).collect()
[Row(r='query=1')]
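
To make the part names more concrete, the following is a minimal pure-Python sketch of how a few of them map onto the pieces of a URL. It uses the standard library's urllib.parse rather than Spark, and the helper name parse_url_sketch is hypothetical. It is an illustration under those assumptions, not Spark's implementation, and edge cases such as percent-encoded values or malformed URLs may behave differently.

```python
from urllib.parse import urlsplit, parse_qs


def parse_url_sketch(url, part, key=None):
    """Hypothetical stand-in for parse_url, for illustration only."""
    pieces = urlsplit(url)

    # With a key, QUERY returns the value of that single query parameter.
    if part == "QUERY" and key is not None:
        values = parse_qs(pieces.query).get(key)
        return values[0] if values else None

    # Assumed mapping from part names to URL components.
    lookup = {
        "PROTOCOL": pieces.scheme,
        "HOST": pieces.hostname,
        "PATH": pieces.path,
        "QUERY": pieces.query,
        "REF": pieces.fragment,
    }
    value = lookup.get(part)
    # Treat a missing or empty component as None in this sketch.
    return value if value else None


url = "http://spark.apache.org/path?query=1"
print(parse_url_sketch(url, "QUERY", "query"))  # 1
print(parse_url_sketch(url, "QUERY"))           # query=1
print(parse_url_sketch(url, "HOST"))            # spark.apache.org
```

The first two calls mirror the two doctest examples above: passing a key narrows the QUERY part to one parameter's value, while omitting it returns the whole query string.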