pyspark.pandas.DataFrame.combine_first

DataFrame.combine_first(other: pyspark.pandas.frame.DataFrame) → pyspark.pandas.frame.DataFrame

Update null elements with value in the same location in other.
Combine two DataFrame objects by filling null values in one DataFrame with non-null values from the other DataFrame. The row and column indexes of the resulting DataFrame will be the union of the two.
New in version 3.3.0.
- Parameters
  - other : DataFrame
    Provided DataFrame to use to fill null values.
- Returns
  - DataFrame
Examples
>>> import pyspark.pandas as ps
>>> ps.set_option("compute.ops_on_diff_frames", True)
>>> df1 = ps.DataFrame({'A': [None, 0], 'B': [None, 4]})
>>> df2 = ps.DataFrame({'A': [1, 1], 'B': [3, 3]})
>>> df1.combine_first(df2).sort_index()
     A    B
0  1.0  3.0
1  0.0  4.0
Null values persist if the location of that null value does not exist in other:
>>> df1 = ps.DataFrame({'A': [None, 0], 'B': [4, None]})
>>> df2 = ps.DataFrame({'B': [3, 3], 'C': [1, 1]}, index=[1, 2])
>>> df1.combine_first(df2).sort_index()
     A    B    C
0  NaN  4.0  NaN
1  0.0  3.0  1.0
2  NaN  3.0  1.0
>>> ps.reset_option("compute.ops_on_diff_frames")
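
The fill rule described above (take the union of row and column labels, keep the caller's value where it is non-null, otherwise fall back to other) can be sketched in plain Python without Spark. This is only an illustration of the semantics, not how pandas-on-Spark implements it: `combine_first_sketch` and `is_null` are hypothetical helpers, frames are modelled as `{column: {row_label: value}}` dicts, and treating both `None` and float NaN as null is an assumption of the sketch.

```python
import math


def is_null(value):
    """Treat None and float NaN as null (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def combine_first_sketch(left, right):
    """Hypothetical helper; frames are {column: {row_label: value}} dicts."""
    # The result covers the union of column labels and of row labels.
    columns = sorted(set(left) | set(right))
    rows = sorted(
        {row for frame in (left, right) for col in frame.values() for row in col}
    )
    result = {}
    for col in columns:
        result[col] = {}
        for row in rows:
            # Prefer the left value; fall back to the right one only if null.
            value = left.get(col, {}).get(row)
            if is_null(value):
                value = right.get(col, {}).get(row)
            result[col][row] = value
    return result


# Same data as the second example: df2 has rows 1 and 2, and no column 'A'.
df1 = {"A": {0: None, 1: 0}, "B": {0: 4, 1: None}}
df2 = {"B": {1: 3, 2: 3}, "C": {1: 1, 2: 1}}
combined = combine_first_sketch(df1, df2)
```

Positions that are null on both sides, or that exist in neither frame, stay null in `combined`, which mirrors the NaN cells in the second example.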