pyspark.ml.classification.BinaryRandomForestClassificationTrainingSummary
BinaryRandomForestClassification training results for a given model.
New in version 3.1.0.
Methods
fMeasureByLabel([beta])
Returns f-measure for each label (category).
weightedFMeasure([beta])
Returns weighted averaged f-measure.
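The optional beta argument of both methods controls how strongly recall is favored over precision. A minimal pure-Python sketch of the standard F-beta formula follows; the helper names f_measure and weighted_f_measure are hypothetical and are not part of the PySpark API, and returning 0.0 when precision and recall are both zero is a convention assumed here.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score for one label, from that label's precision and recall."""
    if precision == 0.0 and recall == 0.0:
        # Assumed convention for this sketch: avoid dividing by zero.
        return 0.0
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def weighted_f_measure(precisions, recalls, label_fractions, beta=1.0):
    """Average per-label F-beta scores, weighted by each label's share of instances."""
    return sum(
        fraction * f_measure(p, r, beta)
        for p, r, fraction in zip(precisions, recalls, label_fractions)
    )
```

With beta = 1.0 this reduces to the harmonic mean of precision and recall; a beta above 1.0 gives recall more weight.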
Attributes
accuracy
Returns accuracy.
areaUnderROC
Computes the area under the receiver operating characteristic (ROC) curve.
fMeasureByThreshold
Returns the F-measure curve, computed with beta = 1.0, as a dataframe with two fields (threshold, F-Measure).
falsePositiveRateByLabel
Returns false positive rate for each label (category).
labelCol
Field in “predictions” which gives the true label of each instance.
labels
Returns the sequence of labels in ascending order.
objectiveHistory
Objective function (scaled loss + regularization) at each iteration.
pr
Returns the precision-recall curve, which is a dataframe containing two fields (recall, precision) with (0.0, 1.0) prepended to it.
precisionByLabel
Returns precision for each label (category).
precisionByThreshold
Returns the precision curve as a dataframe with two fields (threshold, precision).
predictionCol
Field in “predictions” which gives the predicted class of each instance.
predictions
Dataframe output by the model’s transform method.
recallByLabel
Returns recall for each label (category).
recallByThreshold
Returns the recall curve as a dataframe with two fields (threshold, recall).
roc
Returns the receiver operating characteristic (ROC) curve, which is a Dataframe having two fields (FPR, TPR) with (0.0, 0.0) prepended and (1.0, 1.0) appended to it.
scoreCol
Field in “predictions” which gives the probability or raw prediction of each class as a vector.
totalIterations
Number of training iterations until termination.
truePositiveRateByLabel
Returns true positive rate for each label (category).
weightCol
Field in “predictions” which gives the weight of each instance.
weightedFalsePositiveRate
Returns weighted false positive rate.
weightedPrecision
Returns weighted averaged precision.
weightedRecall
Returns weighted averaged recall.
weightedTruePositiveRate
Returns weighted true positive rate.
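The roc and areaUnderROC entries above can be illustrated without Spark. The following is a rough pure-Python sketch, not the library's implementation: it assumes labels are 0 or 1, that both classes are present, and that every distinct score is used as a threshold. The function names roc_points and area_under_curve are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points with (0.0, 0.0) prepended and (1.0, 1.0) appended."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # prepended start point
    # Lower the threshold step by step; an instance is predicted positive
    # when its score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    points.append((1.0, 1.0))  # appended end point
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points ordered by x."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


curve = roc_points([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0])
auc = area_under_curve(curve)
```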
Methods Documentation
Attributes Documentation
accuracy
Returns accuracy, which equals the number of correctly classified instances divided by the total number of instances.
labels
Returns the sequence of labels in ascending order. This order matches the order used in metrics which are specified as arrays over labels, e.g., truePositiveRateByLabel.
Notes
In most cases, the labels will be the values {0.0, 1.0, …, numClasses-1}. However, if the training set is missing a label, then all of the arrays over labels (e.g., from truePositiveRateByLabel) will be of length numClasses-1 instead of the expected numClasses.
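The effect of a missing label can be sketched in plain Python. This is an illustration only: it assumes the label sequence is derived from the distinct labels observed in the data, and recall_by_label is a hypothetical helper rather than a PySpark call.

```python
def recall_by_label(y_true, y_pred):
    """Return (labels, recalls), both ordered by ascending label value."""
    labels = sorted(set(y_true))  # only labels that actually occur
    recalls = []
    for label in labels:
        total = sum(1 for y in y_true if y == label)
        hits = sum(1 for y, p in zip(y_true, y_pred) if y == label and p == label)
        recalls.append(hits / total)
    return labels, recalls


# A nominally three-class problem (0.0, 1.0, 2.0) in which label 1.0 never occurs.
labels, recalls = recall_by_label([0.0, 2.0, 2.0, 0.0], [0.0, 2.0, 0.0, 0.0])
# Pair each metric with its label instead of indexing the array by class number.
recall_of = dict(zip(labels, recalls))
```

Looking values up through the label sequence, rather than assuming position i belongs to class i, keeps the code correct when a label is absent.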
objectiveHistory
Objective function (scaled loss + regularization) at each iteration. It contains one more element than the number of iterations, because it also includes the initial state.
precisionByThreshold
Returns the precision curve as a dataframe with two fields (threshold, precision). Every possible probability obtained in transforming the dataset is used as a threshold when calculating the precision.
recallByThreshold
Returns the recall curve as a dataframe with two fields (threshold, recall). Every possible probability obtained in transforming the dataset is used as a threshold when calculating the recall.
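How the threshold curves are built can be sketched in plain Python. This is a simplified illustration, not the Spark implementation: it assumes labels are 0 or 1, that at least one positive instance exists, and that an instance counts as predicted positive when its probability is at or above the threshold. The function name is hypothetical.

```python
def precision_recall_by_threshold(probabilities, labels):
    """Return (threshold, precision, recall) rows, one per distinct probability."""
    total_positives = sum(labels)
    rows = []
    for threshold in sorted(set(probabilities), reverse=True):
        # True labels of the instances predicted positive at this threshold.
        selected = [y for p, y in zip(probabilities, labels) if p >= threshold]
        true_positives = sum(selected)
        precision = true_positives / len(selected)
        recall = true_positives / total_positives
        rows.append((threshold, precision, recall))
    return rows


rows = precision_recall_by_threshold([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0])
```

Lowering the threshold can only keep recall the same or raise it, while precision may move in either direction.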
weightedRecall
Returns weighted averaged recall, where each label's recall is weighted by that label's share of the instances. (Identical to the weighted true positive rate.)
weightedTruePositiveRate
Returns weighted true positive rate. (Identical to the weighted averaged recall.)
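The relationship between the weighted metrics and accuracy can be checked by hand. The sketch below assumes the usual definition, in which each label's recall is weighted by that label's fraction of the instances; under that assumption the weighted recall works out to the same number as accuracy. The helper names are hypothetical and no Spark calls are involved.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose prediction matches the true label."""
    return sum(1 for y, p in zip(y_true, y_pred) if y == p) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-label recall, averaged with weights equal to each label's frequency."""
    total = len(y_true)
    result = 0.0
    for label in sorted(set(y_true)):
        count = sum(1 for y in y_true if y == label)
        hits = sum(1 for y, p in zip(y_true, y_pred) if y == label and p == label)
        result += (count / total) * (hits / count)
    return result


y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
acc = accuracy(y_true, y_pred)
w_recall = weighted_recall(y_true, y_pred)
```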