pyspark.RDD.reduce
RDD.reduce(f)
Reduces the elements of this RDD using the specified commutative and associative binary operator. Currently, each partition is reduced locally before the per-partition results are combined.
Examples
>>> from operator import add
>>> sc.parallelize([1, 2, 3, 4, 5]).reduce(add)
15
>>> sc.parallelize((2 for _ in range(10))).map(lambda x: 1).cache().reduce(add)
10
>>> sc.parallelize([]).reduce(add)
Traceback (most recent call last):
    ...
ValueError: Can not reduce() empty RDD
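The requirement that the operator be commutative and associative follows from the partition-local reduction described above. The following is a minimal sketch in plain Python, not Spark's actual code: `reduce_partitioned` is a hypothetical helper, and the nested lists stand in for RDD partitions. It reduces each partition on its own and then combines the partial results, which shows that a non-associative operator such as subtraction gives an answer that depends on how the data happens to be split.

```python
from functools import reduce
from operator import add, sub


def reduce_partitioned(partitions, f):
    """Hypothetical helper: reduce each partition locally, then merge the partial results."""
    # Reduce every non-empty partition on its own, as a worker would.
    partials = [reduce(f, part) for part in partitions if part]
    if not partials:
        # Mirrors the error shown in the doctest above for an empty RDD.
        raise ValueError("Can not reduce() empty RDD")
    # Combine the per-partition results into a single value.
    return reduce(f, partials)


# Two different (assumed) partitionings of the same data [1, 2, 3, 4, 5].
split_a = [[1, 2], [3, 4, 5]]
split_b = [[1], [2, 3], [4, 5]]

# Addition is commutative and associative: the split does not matter.
sum_a = reduce_partitioned(split_a, add)   # 15
sum_b = reduce_partitioned(split_b, add)   # 15

# Subtraction is neither: the result changes with the split.
diff_a = reduce_partitioned(split_a, sub)  # (1-2) - (3-4-5) = 5
diff_b = reduce_partitioned(split_b, sub)  # (1) - (2-3) - (4-5) = 3
```

Because the number and contents of partitions are generally not under the caller's control, an operator like `sub` would make the result of `reduce` unpredictable, whereas `add` is safe.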