pyspark.sql.DataFrame.to_pandas_on_spark
DataFrame.to_pandas_on_spark(index_col=None)
Converts the existing DataFrame into a pandas-on-Spark DataFrame.
If a pandas-on-Spark DataFrame is converted to a Spark DataFrame and then back to pandas-on-Spark, it will lose the index information and the original index will be turned into a normal column.
This is only available if pandas is installed.
Parameters
index_col : str or list of str, optional, default: None
    Index column of table in Spark.
See also
pyspark.pandas.frame.DataFrame.to_spark
Examples
>>> df.show()
+----+----+
|Col1|Col2|
+----+----+
|   a|   1|
|   b|   2|
|   c|   3|
+----+----+
>>> df.to_pandas_on_spark()
  Col1  Col2
0    a     1
1    b     2
2    c     3
We can specify the index columns.
>>> df.to_pandas_on_spark(index_col="Col1")
      Col2
Col1
a        1
b        2
c        3