SVMWithSGD

class pyspark.mllib.classification.SVMWithSGD

Train a Support Vector Machine (SVM) using Stochastic Gradient Descent.

New in version 0.9.0.
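
To make "SVM using Stochastic Gradient Descent" concrete, here is a minimal pure-Python sketch of hinge-loss SGD with L2 regularization. It is not Spark's implementation. The function names (`train_svm_sgd`, `predict`), the `step / sqrt(t)` step-size decay, the form of the convergence check, and the fixed seed are assumptions chosen for illustration. The keyword arguments loosely mirror `iterations`, `step`, `regParam`, `miniBatchFraction`, and `convergenceTol` from `train`.

```python
import math
import random


def train_svm_sgd(points, iterations=100, step=1.0, reg_param=0.01,
                  mini_batch_fraction=1.0, convergence_tol=0.001, seed=42):
    """Illustrative hinge-loss SGD with L2 regularization (not Spark's code).

    points: list of (label, features) pairs with label in {0, 1}.
    Returns the learned weights as a list (no intercept term).
    """
    rng = random.Random(seed)
    dim = len(points[0][1])
    weights = [0.0] * dim

    for t in range(1, iterations + 1):
        # Sample a mini-batch; a fraction of 1.0 keeps every point.
        batch = [p for p in points if rng.random() < mini_batch_fraction]
        if not batch:
            continue

        # Average hinge-loss subgradient over the batch.
        grad = [0.0] * dim
        for label, x in batch:
            y = 2.0 * label - 1.0  # map {0, 1} to {-1, +1}
            margin = y * sum(w * xi for w, xi in zip(weights, x))
            if margin < 1.0:
                for j, xi in enumerate(x):
                    grad[j] -= y * xi / len(batch)

        # Assumed schedule: the step size decays as step / sqrt(t).
        eta = step / math.sqrt(t)
        new_weights = [w * (1.0 - eta * reg_param) - eta * g
                       for w, g in zip(weights, grad)]

        # Stop once the weights barely move relative to their norm.
        diff = math.sqrt(sum((a - b) ** 2
                             for a, b in zip(new_weights, weights)))
        norm = math.sqrt(sum(a * a for a in new_weights))
        weights = new_weights
        if diff < convergence_tol * max(norm, 1.0):
            break

    return weights


def predict(weights, x, threshold=0.0):
    """Return 1 if the margin w.x exceeds the threshold, else 0."""
    margin = sum(w * xi for w, xi in zip(weights, x))
    return 1 if margin > threshold else 0


# Toy data: one negative point at the origin, three positive points.
data = [(0.0, [0.0]), (1.0, [1.0]), (1.0, [2.0]), (1.0, [3.0])]
w = train_svm_sgd(data, iterations=10)
print(w, predict(w, [1.0]), predict(w, [0.0]))
```

Without an intercept, a point at the origin always has margin 0, so in this sketch it is classified as 0 regardless of the learned weights.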

Methods

train(data[, iterations, step, regParam, …])

Train a support vector machine on the given data.

Methods Documentation

classmethod train(data: pyspark.rdd.RDD[pyspark.mllib.regression.LabeledPoint], iterations: int = 100, step: float = 1.0, regParam: float = 0.01, miniBatchFraction: float = 1.0, initialWeights: Optional[VectorLike] = None, regType: str = 'l2', intercept: bool = False, validateData: bool = True, convergenceTol: float = 0.001) → pyspark.mllib.classification.SVMModel

Train a support vector machine on the given data.

New in version 0.9.0.

Parameters

data : pyspark.RDD

The training data, an RDD of pyspark.mllib.regression.LabeledPoint.

iterations : int, optional

The number of iterations. (default: 100)

step : float, optional

The step parameter used in SGD. (default: 1.0)

regParam : float, optional

The regularizer parameter. (default: 0.01)

miniBatchFraction : float, optional

Fraction of data to be used for each SGD iteration. (default: 1.0)

initialWeights : pyspark.mllib.linalg.Vector or convertible, optional

The initial weights. (default: None)

regType : str, optional

The type of regularizer used to train the model. Allowed values:

“l1” for using L1 regularization
“l2” for using L2 regularization (default)
None for no regularization

intercept : bool, optional

Whether to use the augmented representation for the training data (i.e., whether bias features are activated). (default: False)

validateData : bool, optional

Whether the algorithm should validate the data before training. (default: True)

convergenceTol : float, optional

The convergence tolerance that decides when iteration terminates. (default: 0.001)
