pyspark.sql.streaming.DataStreamWriter

class pyspark.sql.streaming.DataStreamWriter(df: DataFrame)

Interface used to write a streaming DataFrame to external storage systems (e.g. file systems, key-value stores). Use DataFrame.writeStream to access this.

New in version 2.0.0.

Notes

This API is evolving.
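
As a rough sketch of how the writer is typically reached and configured, the snippet below chains several setters on `DataFrame.writeStream` and leaves `start()` to the caller. The helper names `configure_writer` and `run_demo`, the query name, the `rate` source, the `console` sink, and the checkpoint path are illustrative assumptions, not part of this class. PySpark is imported lazily inside `run_demo`, so a local Spark installation is only needed if that function is called.

```python
def configure_writer(df, checkpoint_dir):
    """Chain DataStreamWriter setters on a streaming DataFrame.

    `df` is assumed to be a streaming DataFrame; the sink, query name,
    and trigger interval below are illustrative choices only.
    """
    return (
        df.writeStream                                  # access the DataStreamWriter
        .queryName("rate_to_console")                   # hypothetical query name
        .format("console")                              # built-in debugging sink
        .outputMode("append")                           # emit only new rows
        .option("checkpointLocation", checkpoint_dir)   # where progress is tracked
        .trigger(processingTime="5 seconds")            # micro-batch interval
    )


def run_demo(checkpoint_dir="/tmp/writer-demo-checkpoint"):
    """Run the sketch against a local Spark session (requires PySpark)."""
    from pyspark.sql import SparkSession  # imported lazily; assumed installed

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("writer-demo")
        .getOrCreate()
    )
    # The built-in "rate" source generates rows continuously for testing.
    df = spark.readStream.format("rate").option("rowsPerSecond", 1).load()

    query = configure_writer(df, checkpoint_dir).start()  # returns a StreamingQuery
    query.awaitTermination(15)  # wait up to 15 seconds
    query.stop()
    spark.stop()
```

Each setter returns the writer itself, which is why the calls can be chained; nothing runs until `start()` (or `toTable()`) is called.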

Methods

`foreach`(f)	Sets the output of the streaming query to be processed using the provided writer `f`.
`foreachBatch`(func)	Sets the output of the streaming query to be processed using the provided function.
`format`(source)	Specifies the underlying output data source.
`option`(key, value)	Adds an output option for the underlying data source.
`options`(**options)	Adds output options for the underlying data source.
`outputMode`(outputMode)	Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
`partitionBy`(*cols)	Partitions the output by the given columns on the file system.
`queryName`(queryName)	Specifies the name of the `StreamingQuery` that can be started with `start()`.
`start`([path, format, outputMode, …])	Streams the contents of the `DataFrame` to a data source.
`toTable`(tableName[, format, outputMode, …])	Starts the execution of the streaming query, which will continually output results to the given table as new data arrives.
`trigger`(*[, processingTime, once, …])	Sets the trigger for the streaming query.
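
To illustrate `foreachBatch` from the table above, here is a minimal sketch in which each micro-batch is handed to a plain Python function that receives the batch `DataFrame` and its batch id. The helper names `make_batch_handler` and `start_with_foreach_batch` are hypothetical, and collecting results in a Python list assumes the callback runs in the driver process of a classic (non-Connect) session.

```python
def make_batch_handler(seen):
    """Return a foreachBatch callback that records (batch_id, row_count) in `seen`."""

    def handle_batch(batch_df, batch_id):
        # batch_df is an ordinary (non-streaming) DataFrame for this micro-batch,
        # so regular batch operations such as count() are available.
        seen.append((batch_id, batch_df.count()))

    return handle_batch


def start_with_foreach_batch(df, handler):
    """Start a streaming query that passes every micro-batch to `handler`.

    `df` is assumed to be a streaming DataFrame; returns the StreamingQuery.
    """
    return df.writeStream.foreachBatch(handler).outputMode("append").start()
```

By contrast, `foreach` processes the output row by row with the provided writer `f`, so `foreachBatch` tends to be the simpler choice when existing batch logic should be reused on each micro-batch.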
