pyspark.sql.functions.
zip_with
Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
New in version 3.1.0.
Column
name of the first column or expression
name of the second column or expression
a binary function (x1: Column, x2: Column) -> Column... Can use methods of Column, functions defined in pyspark.sql.functions and Scala UserDefinedFunctions. Python UserDefinedFunctions are not supported (SPARK-27052).
(x1: Column, x2: Column) -> Column...
pyspark.sql.functions
UserDefinedFunctions
Examples
>>> df = spark.createDataFrame([(1, [1, 3, 5, 8], [0, 2, 4, 6])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: x ** y).alias("powers")).show(truncate=False) +---------------------------+ |powers | +---------------------------+ |[1.0, 9.0, 625.0, 262144.0]| +---------------------------+
>>> df = spark.createDataFrame([(1, ["foo", "bar"], [1, 2, 3])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: concat_ws("_", x, y)).alias("xs_ys")).show() +-----------------+ | xs_ys| +-----------------+ |[foo_1, bar_2, 3]| +-----------------+