pyspark.mllib.stat.KernelDensity
Estimate probability density at required points given an RDD of samples from the population.
Examples
>>> kd = KernelDensity()
>>> sample = sc.parallelize([0.0, 1.0])
>>> kd.setSample(sample)
>>> kd.estimate([0.0, 1.0])
array([ 0.12938758, 0.12938758])
Methods
estimate(points)
Estimate the probability density at the given points.
setBandwidth(bandwidth)
Set the bandwidth of each sample.
setSample(sample)
Set the sample points from the population.
Methods Documentation
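estimate(points)
Estimate the probability density at the given points.
setBandwidth(bandwidth)
Set the bandwidth of each sample. Defaults to 1.0.
setSample(sample)
Set the sample points from the population. Should be an RDD.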
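To illustrate how the sample, the bandwidth, and the evaluation points interact, here is a minimal local sketch of a kernel density estimate. It assumes a Gaussian kernel whose standard deviation is the bandwidth, and it works on plain Python lists instead of an RDD. The helper name gaussian_kde_estimate is hypothetical and is not part of PySpark; the sketch is not the library's implementation.

```python
import math


def gaussian_kde_estimate(samples, points, bandwidth=1.0):
    """Estimate the density at each point from a list of samples.

    Assumption: a Gaussian kernel with standard deviation `bandwidth`
    is centered on every sample, and the kernels are averaged.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    # Normalizing constant of a Gaussian with standard deviation `bandwidth`.
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))

    densities = []
    for x in points:
        # Sum the unnormalized kernel contributions of every sample at x.
        total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        # Average over the samples so the estimate integrates to 1.
        densities.append(norm * total / len(samples))
    return densities


# Hypothetical local analogue of setSample, setBandwidth, and estimate.
densities = gaussian_kde_estimate([0.0, 1.0], [0.0, 1.0], bandwidth=1.0)
print(densities)
```

Because the two samples are symmetric about 0.5, both evaluation points receive the same density in this sketch. A larger bandwidth spreads each kernel wider and flattens the estimate; a smaller one concentrates the density around the samples.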