pyspark.pandas.DataFrame.applymap
DataFrame.applymap(func: Callable[[Any], Any]) → pyspark.pandas.frame.DataFrame
Apply a function to a DataFrame elementwise.
This method applies a function that accepts and returns a scalar to every element of a DataFrame.
Note
This API executes the function once to infer the return type, which can be expensive, for instance when the dataset is created after aggregations or sorting.
To avoid this, specify the return type in func, for instance:

>>> def square(x) -> np.int32:
...     return x ** 2

pandas-on-Spark then uses the return type hint and does not try to infer the type.
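As a rough illustration of why a hint is cheap to use, Python exposes a function's return annotation without ever calling the function. The sketch below relies only on the standard library's `typing.get_type_hints`; the helper name `declared_return_type` is hypothetical, and this is not necessarily how pandas-on-Spark reads the hint internally.

```python
from typing import Any, Callable, Optional, get_type_hints


def declared_return_type(func: Callable[..., Any]) -> Optional[Any]:
    """Return the annotated return type of func, or None if it has none.

    Hypothetical helper for illustration; not part of pandas-on-Spark.
    """
    try:
        hints = get_type_hints(func)
    except TypeError:
        # Some callables cannot be introspected; treat them as unannotated.
        return None
    return hints.get("return")


def power(x) -> float:
    return x ** 2


def untyped(x):
    return x ** 2


print(declared_return_type(power))    # <class 'float'>
print(declared_return_type(untyped))  # None
```

A function such as `untyped` gives no declared type, which corresponds to the case where the return type has to be inferred by running the function.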
Parameters
func : callable
    Python function that takes a single value and returns a single value.
Returns
DataFrame
    Transformed DataFrame.
Examples
>>> df = ps.DataFrame([[1, 2.12], [3.356, 4.567]])
>>> df
       0      1
0  1.000  2.120
1  3.356  4.567

>>> def str_len(x) -> int:
...     return len(str(x))
>>> df.applymap(str_len)
   0  1
0  3  4
1  5  5

>>> def power(x) -> float:
...     return x ** 2
>>> df.applymap(power)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
You can omit the type hint and let pandas-on-Spark infer its type.
>>> df.applymap(lambda x: x ** 2)
           0          1
0   1.000000   4.494400
1  11.262736  20.857489
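As a mental model only, elementwise application can be pictured as calling the function on every cell while keeping the table's shape. The sketch below uses plain Python lists rather than Spark; `applymap_rows` is a hypothetical helper, not a pandas-on-Spark function, and it says nothing about how the work is actually distributed.

```python
from typing import Any, Callable, List


def applymap_rows(rows: List[List[Any]], func: Callable[[Any], Any]) -> List[List[Any]]:
    """Apply func to every cell of a row-major table, preserving its shape.

    Hypothetical helper for illustration; not part of pandas-on-Spark.
    """
    return [[func(cell) for cell in row] for row in rows]


def str_len(x) -> int:
    return len(str(x))


# Same values as the DataFrame in the examples, stored as floats.
table = [[1.0, 2.12], [3.356, 4.567]]
print(applymap_rows(table, str_len))  # [[3, 4], [5, 5]]
```

The result matches the `str_len` example above because each cell is handled independently of its row and column.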