pyspark.pandas.DataFrame.between_time

DataFrame.between_time(start_time: Union[datetime.time, str], end_time: Union[datetime.time, str], include_start: bool = True, include_end: bool = True, axis: Union[int, str] = 0) → pyspark.pandas.frame.DataFrame

Select values between particular times of the day (example: 9:00-9:30 AM).

If start_time is later than end_time, the selection is inverted: the result contains the times that are not between the two times.

Parameters

start_time (datetime.time or str): Initial time as a time filter limit.
end_time (datetime.time or str): End time as a time filter limit.
include_start (bool, default True): Whether the start time is included in the result.
include_end (bool, default True): Whether the end time is included in the result.
axis ({0 or ‘index’, 1 or ‘columns’}, default 0): Whether to apply the time range to the index or to the column labels.

Returns

DataFrame: Data from the original object filtered to the specified time range.

Raises

TypeError: If the index is not a DatetimeIndex.

See also

at_time: Select values at a particular time of the day.
first: Select initial periods of time series based on a date offset.
last: Select final periods of time series based on a date offset.
DatetimeIndex.indexer_between_time: Get just the index locations for values between particular times of the day.

Examples

>>> import pandas as pd
>>> import pyspark.pandas as ps
>>> idx = pd.date_range('2018-04-09', periods=4, freq='1D20min')
>>> psdf = ps.DataFrame({'A': [1, 2, 3, 4]}, index=idx)
>>> psdf
                     A
2018-04-09 00:00:00  1
2018-04-10 00:20:00  2
2018-04-11 00:40:00  3
2018-04-12 01:00:00  4

>>> psdf.between_time('0:15', '0:45')
                     A
2018-04-10 00:20:00  2
2018-04-11 00:40:00  3

You get the times that are not between two times by setting start_time later than end_time:

>>> psdf.between_time('0:45', '0:15')
                     A
2018-04-09 00:00:00  1
2018-04-12 01:00:00  4
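
The selection rule behind the two calls above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the pyspark.pandas implementation: the helper name in_window is hypothetical, and the sketch assumes that include_start and include_end simply switch each bound between an inclusive and an exclusive comparison.

```python
from datetime import time
import operator


def in_window(t, start, end, include_start=True, include_end=True):
    """Hypothetical helper: True if time-of-day `t` would be selected."""
    # Assumption: inclusive bounds compare with <=, exclusive bounds with <.
    lower = operator.le if include_start else operator.lt
    upper = operator.le if include_end else operator.lt
    if start <= end:
        # Ordinary window: t must satisfy both bounds.
        return lower(start, t) and upper(t, end)
    # start later than end: keep times outside the gap between end and start.
    return lower(start, t) or upper(t, end)


# Times of day taken from the index of psdf in the examples above.
times = [time(0, 0), time(0, 20), time(0, 40), time(1, 0)]

print([t for t in times if in_window(t, time(0, 15), time(0, 45))])
# [datetime.time(0, 20), datetime.time(0, 40)]
print([t for t in times if in_window(t, time(0, 45), time(0, 15))])
# [datetime.time(0, 0), datetime.time(1, 0)]

# Under the stated assumption, an exclusive end bound drops a time that
# falls exactly on end_time.
print(in_window(time(0, 40), time(0, 15), time(0, 40), include_end=False))
# False
```

The two printed lists match the rows selected by between_time('0:15', '0:45') and between_time('0:45', '0:15') in the examples.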
