
$

$colon$bslash(B, Function2<A, B, B>) - Static method in class org.apache.spark.sql.types.StructType
 
$colon$plus(B, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
$div$colon(B, Function2<B, A, B>) - Static method in class org.apache.spark.sql.types.StructType
 
$greater(A) - Static method in class org.apache.spark.sql.types.Decimal
 
$greater(A) - Static method in class org.apache.spark.storage.RDDInfo
 
$greater$eq(A) - Static method in class org.apache.spark.sql.types.Decimal
 
$greater$eq(A) - Static method in class org.apache.spark.storage.RDDInfo
 
$less(A) - Static method in class org.apache.spark.sql.types.Decimal
 
$less(A) - Static method in class org.apache.spark.storage.RDDInfo
 
$less$eq(A) - Static method in class org.apache.spark.sql.types.Decimal
 
$less$eq(A) - Static method in class org.apache.spark.storage.RDDInfo
 
$minus$greater(T) - Static method in class org.apache.spark.ml.param.DoubleParam
 
$minus$greater(T) - Static method in class org.apache.spark.ml.param.FloatParam
 
$plus$colon(B, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
$plus$eq(T) - Static method in class org.apache.spark.Accumulator
Deprecated.
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.api.r.RRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.graphx.VertexRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
$plus$plus(RDD<T>) - Static method in class org.apache.spark.rdd.UnionRDD
 
$plus$plus(GenTraversableOnce<B>, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
$plus$plus$colon(TraversableOnce<B>, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
$plus$plus$colon(Traversable<B>, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
$plus$plus$eq(R) - Static method in class org.apache.spark.Accumulator
Deprecated.
 

A

abortJob(JobContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
Aborts a job after the writes fail.
abortJob(JobContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
 
abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
Aborts a task after the writes have failed.
abortTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
 
abs(Column) - Static method in class org.apache.spark.sql.functions
Computes the absolute value.
abs() - Method in class org.apache.spark.sql.types.Decimal
 
absent() - Static method in class org.apache.spark.api.java.Optional
 
AbsoluteError - Class in org.apache.spark.mllib.tree.loss
:: DeveloperApi :: Class for absolute error loss calculation (for regression).
AbsoluteError() - Constructor for class org.apache.spark.mllib.tree.loss.AbsoluteError
 
accept(Parsers) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
accept(ES, Function1<ES, List<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
accept(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
acceptIf(Function1<Object, Object>, Function1<Object, String>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
acceptMatch(String, PartialFunction<Object, U>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
acceptSeq(ES, Function1<ES, Iterable<Object>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
acceptsType(DataType) - Method in class org.apache.spark.sql.types.ObjectType
 
accId() - Method in class org.apache.spark.CleanAccum
 
Accumulable<R,T> - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
Accumulable(R, AccumulableParam<R, T>) - Constructor for class org.apache.spark.Accumulable
Deprecated.
 
accumulable(T, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulable(T, String, AccumulableParam<T, R>) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulable(R, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulable(R, String, AccumulableParam<R, T>) - Method in class org.apache.spark.SparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulableCollection(R, Function1<R, Growable<T>>, ClassTag<R>) - Method in class org.apache.spark.SparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulableInfo - Class in org.apache.spark.scheduler
:: DeveloperApi :: Information about an Accumulable modified during a task or stage.
AccumulableInfo - Class in org.apache.spark.status.api.v1
 
accumulableInfoFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
accumulableInfoToJson(AccumulableInfo) - Static method in class org.apache.spark.util.JsonProtocol
 
AccumulableParam<R,T> - Interface in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulables() - Method in class org.apache.spark.scheduler.StageInfo
Terminal values of accumulables updated during this stage, including all the user-defined accumulators.
accumulables() - Method in class org.apache.spark.scheduler.TaskInfo
Intermediate updates to accumulables during this task.
accumulables() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
accumulablesToJson(Traversable<AccumulableInfo>) - Static method in class org.apache.spark.util.JsonProtocol
 
Accumulator<T> - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulator(int) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use sc().longAccumulator(). Since 2.0.0.
accumulator(int, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use sc().longAccumulator(String). Since 2.0.0.
accumulator(double) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use sc().doubleAccumulator(). Since 2.0.0.
accumulator(double, String) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use sc().doubleAccumulator(String). Since 2.0.0.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.api.java.JavaSparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulator(T, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulator(T, String, AccumulatorParam<T>) - Method in class org.apache.spark.SparkContext
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorContext - Class in org.apache.spark.util
An internal class used to track accumulators by Spark itself.
AccumulatorContext() - Constructor for class org.apache.spark.util.AccumulatorContext
 
AccumulatorParam<T> - Interface in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorParam.DoubleAccumulatorParam$ - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorParam.FloatAccumulatorParam$ - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorParam.IntAccumulatorParam$ - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorParam.LongAccumulatorParam$ - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
AccumulatorParam.StringAccumulatorParam$ - Class in org.apache.spark
Deprecated.
use AccumulatorV2. Since 2.0.0.
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.StageData
 
accumulatorUpdates() - Method in class org.apache.spark.status.api.v1.TaskData
 
AccumulatorV2<IN,OUT> - Class in org.apache.spark.util
The base class for accumulators that can accumulate inputs of type IN and produce output of type OUT.
AccumulatorV2() - Constructor for class org.apache.spark.util.AccumulatorV2
 
accumUpdates() - Method in class org.apache.spark.ExceptionFailure
 
accumUpdates() - Method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
accuracy() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns accuracy (equal to the number of correctly classified instances divided by the total number of instances).
accuracy() - Method in class org.apache.spark.mllib.evaluation.MultilabelMetrics
Returns accuracy
acos(Column) - Static method in class org.apache.spark.sql.functions
Computes the cosine inverse of the given value; the returned angle is in the range 0.0 through pi.
acos(String) - Static method in class org.apache.spark.sql.functions
Computes the cosine inverse of the given column; the returned angle is in the range 0.0 through pi.
active() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
Returns a list of active queries associated with this SQLContext.
active() - Method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
ACTIVE() - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
 
activeJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
Deprecated.
 
activeStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
Deprecated.
 
activeStorageStatusList() - Method in class org.apache.spark.ui.exec.ExecutorsListener
Deprecated.
 
activeStorageStatusList() - Method in class org.apache.spark.ui.storage.StorageListener
Deprecated.
 
activeTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
 
add(T) - Method in class org.apache.spark.Accumulable
Deprecated.
Add more data to this accumulator / accumulable
add(T) - Static method in class org.apache.spark.Accumulator
Deprecated.
 
add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.classification.LinearSVCAggregator
Add a new training instance to this LinearSVCAggregator, and update the loss and gradient of the objective function.
add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.classification.LogisticAggregator
Add a new training instance to this LogisticAggregator, and update the loss and gradient of the objective function.
add(Vector) - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
Add a new training instance to this ExpectationAggregator, update the weights, means and covariances for each distribution, and update the log likelihood.
add(AFTPoint) - Method in class org.apache.spark.ml.regression.AFTAggregator
Add a new training data point to this AFTAggregator, and update the loss and gradient of the objective function.
add(org.apache.spark.ml.feature.Instance) - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
Add a new training instance to this LeastSquaresAggregator, and update the loss and gradient of the objective function.
add(double[], MultivariateGaussian[], ExpectationSum, Vector<Object>) - Static method in class org.apache.spark.mllib.clustering.ExpectationSum
 
add(Vector) - Method in class org.apache.spark.mllib.feature.IDF.DocumentFrequencyAggregator
Adds a new document.
add(BlockMatrix) - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Adds the given block matrix other to this block matrix: this + other.
add(Vector) - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Add a new sample to this summarizer, and update the statistical summary.
add(StructField) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field.
add(String, DataType) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new nullable field with no metadata.
add(String, DataType, boolean) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field with no metadata.
add(String, DataType, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field and specifying metadata.
add(String, DataType, boolean, String) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field and specifying metadata.
add(String, String) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new nullable field with no metadata where the dataType is specified as a String.
add(String, String, boolean) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field with no metadata where the dataType is specified as a String.
add(String, String, boolean, Metadata) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field and specifying metadata where the dataType is specified as a String.
add(String, String, boolean, String) - Method in class org.apache.spark.sql.types.StructType
Creates a new StructType by adding a new field and specifying metadata where the dataType is specified as a String.
add(long, long) - Static method in class org.apache.spark.streaming.util.RawTextHelper
 
add(IN) - Method in class org.apache.spark.util.AccumulatorV2
Takes the inputs and accumulates.
add(T) - Method in class org.apache.spark.util.CollectionAccumulator
 
add(Double) - Method in class org.apache.spark.util.DoubleAccumulator
Adds v to the accumulator.
add(double) - Method in class org.apache.spark.util.DoubleAccumulator
Adds v to the accumulator.
add(T) - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
 
add(Long) - Method in class org.apache.spark.util.LongAccumulator
Adds v to the accumulator.
add(long) - Method in class org.apache.spark.util.LongAccumulator
Adds v to the accumulator.
add(Object) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by one.
add(Object, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by count.
add_months(Column, int) - Static method in class org.apache.spark.sql.functions
Returns the date that is numMonths after startDate.
addAccumulator(R, T) - Method in interface org.apache.spark.AccumulableParam
Deprecated.
Add additional data to the accumulator value.
addAccumulator(T, T) - Method in interface org.apache.spark.AccumulatorParam
Deprecated.
 
addAppArgs(String...) - Method in class org.apache.spark.launcher.SparkLauncher
Adds command line arguments for the application.
addBinary(byte[]) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by one.
addBinary(byte[], long) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by count.
addFile(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.api.java.JavaSparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
Adds a file to be submitted with the application.
addFile(String) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
addFile(String, boolean) - Method in class org.apache.spark.SparkContext
Add a file to be downloaded with this Spark job on every node.
addFilters(Seq<ServletContextHandler>, SparkConf) - Static method in class org.apache.spark.ui.JettyUtils
Add filters, if any, to the given list of ServletContextHandlers.
addGrid(Param<T>, Iterable<T>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a param with multiple values (overwrites if the input param exists).
addGrid(DoubleParam, double[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a double param with multiple values.
addGrid(IntParam, int[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds an int param with multiple values.
addGrid(FloatParam, float[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a float param with multiple values.
addGrid(LongParam, long[]) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a long param with multiple values.
addGrid(BooleanParam) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Adds a boolean param with true and false.
addInPlace(R, R) - Method in interface org.apache.spark.AccumulableParam
Deprecated.
Merge two accumulated values together.
addInPlace(double, double) - Method in class org.apache.spark.AccumulatorParam.DoubleAccumulatorParam$
Deprecated.
 
addInPlace(float, float) - Method in class org.apache.spark.AccumulatorParam.FloatAccumulatorParam$
Deprecated.
 
addInPlace(int, int) - Method in class org.apache.spark.AccumulatorParam.IntAccumulatorParam$
Deprecated.
 
addInPlace(long, long) - Method in class org.apache.spark.AccumulatorParam.LongAccumulatorParam$
Deprecated.
 
addInPlace(String, String) - Method in class org.apache.spark.AccumulatorParam.StringAccumulatorParam$
Deprecated.
 
addJar(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.launcher.SparkLauncher
Adds a jar file to be submitted with the application.
addJar(String) - Method in class org.apache.spark.SparkContext
Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
addJar(String) - Method in class org.apache.spark.sql.hive.HiveSessionResourceLoader
 
addListener(SparkAppHandle.Listener) - Method in interface org.apache.spark.launcher.SparkAppHandle
Adds a listener to be notified of changes to the handle's information.
addListener(StreamingQueryListener) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
Register a StreamingQueryListener to receive up-calls for life cycle events of StreamingQuery.
addLocalConfiguration(String, int, int, int, JobConf) - Static method in class org.apache.spark.rdd.HadoopRDD
Add Hadoop configuration specific to a single partition and attempt.
addLong(long) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by one.
addLong(long, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by count.
addPartToPGroup(Partition, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
 
addPyFile(String) - Method in class org.apache.spark.launcher.SparkLauncher
Adds a python file / zip / egg to be submitted with the application.
address() - Method in class org.apache.spark.status.api.v1.RDDDataDistribution
 
addShutdownHook(Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager
Adds a shutdown hook with default priority.
addShutdownHook(int, Function0<BoxedUnit>) - Static method in class org.apache.spark.util.ShutdownHookManager
Adds a shutdown hook with the given priority.
addSparkArg(String) - Method in class org.apache.spark.launcher.SparkLauncher
Adds a no-value argument to the Spark invocation.
addSparkArg(String, String) - Method in class org.apache.spark.launcher.SparkLauncher
Adds an argument with a value to the Spark invocation.
addSparkListener(SparkListenerInterface) - Method in class org.apache.spark.SparkContext
:: DeveloperApi :: Register a listener to receive up-calls from events that happen during execution.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addStreamingListener(StreamingListener) - Method in class org.apache.spark.streaming.StreamingContext
Add a StreamingListener object for receiving system events related to streaming.
addString(StringBuilder, String, String, String) - Static method in class org.apache.spark.sql.types.StructType
 
addString(StringBuilder, String) - Static method in class org.apache.spark.sql.types.StructType
 
addString(StringBuilder) - Static method in class org.apache.spark.sql.types.StructType
 
addString(String) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by one.
addString(String, long) - Method in class org.apache.spark.util.sketch.CountMinSketch
Increments item's count by count.
addSuppressed(Throwable) - Static method in exception org.apache.spark.sql.AnalysisException
 
addTaskCompletionListener(TaskCompletionListener) - Method in class org.apache.spark.TaskContext
Adds a (Java friendly) listener to be executed on task completion.
addTaskCompletionListener(Function1<TaskContext, BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a listener in the form of a Scala closure to be executed on task completion.
addTaskFailureListener(TaskFailureListener) - Method in class org.apache.spark.TaskContext
Adds a listener to be executed on task failure.
addTaskFailureListener(Function2<TaskContext, Throwable, BoxedUnit>) - Method in class org.apache.spark.TaskContext
Adds a listener to be executed on task failure.
AddWebUIFilter(String, Map<String, String>, String) - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter
 
AddWebUIFilter$() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.AddWebUIFilter$
 
AFTAggregator - Class in org.apache.spark.ml.regression
AFTAggregator computes the gradient and loss for an AFT loss function, as used in AFT survival regression, for samples in sparse or dense vectors, in an online fashion.
AFTAggregator(Broadcast<DenseVector<Object>>, boolean, Broadcast<double[]>) - Constructor for class org.apache.spark.ml.regression.AFTAggregator
 
AFTCostFun - Class in org.apache.spark.ml.regression
AFTCostFun implements Breeze's DiffFunction[T] for AFT cost.
AFTCostFun(RDD<AFTPoint>, boolean, Broadcast<double[]>, int) - Constructor for class org.apache.spark.ml.regression.AFTCostFun
 
AFTSurvivalRegression - Class in org.apache.spark.ml.regression
:: Experimental :: Fits a parametric survival regression model, the accelerated failure time (AFT) model (see Accelerated failure time model (Wikipedia)), based on the Weibull distribution of the survival time.
AFTSurvivalRegression(String) - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
 
AFTSurvivalRegression() - Constructor for class org.apache.spark.ml.regression.AFTSurvivalRegression
 
AFTSurvivalRegressionModel - Class in org.apache.spark.ml.regression
:: Experimental :: Model produced by AFTSurvivalRegression.
agg(Column, Column...) - Method in class org.apache.spark.sql.Dataset
Aggregates on the entire Dataset without groups.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.Dataset
(Scala-specific) Aggregates on the entire Dataset without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset
(Scala-specific) Aggregates on the entire Dataset without groups.
agg(Map<String, String>) - Method in class org.apache.spark.sql.Dataset
(Java-specific) Aggregates on the entire Dataset without groups.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.Dataset
Aggregates on the entire Dataset without groups.
agg(TypedColumn<V, U1>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
Computes the given aggregation, returning a Dataset of tuples for each unique key and the result of computing this aggregation over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(TypedColumn<V, U1>, TypedColumn<V, U2>, TypedColumn<V, U3>, TypedColumn<V, U4>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
Computes the given aggregations, returning a Dataset of tuples for each unique key and the result of computing these aggregations over all elements in the group.
agg(Column, Column...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
Compute aggregates by specifying a series of aggregate columns.
agg(Tuple2<String, String>, Seq<Tuple2<String, String>>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
(Scala-specific) Compute aggregates by specifying the column names and aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
(Scala-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Map<String, String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
(Java-specific) Compute aggregates by specifying a map from column name to aggregate methods.
agg(Column, Seq<Column>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
Compute aggregates by specifying a series of aggregate columns.
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Static method in class org.apache.spark.api.java.JavaRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.api.r.RRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.VertexRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Aggregate the elements of each partition, and then the results for all the partitions, using given combine functions and a neutral "zero value".
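The aggregate contract described above (fold each partition starting from the zero value, then merge the per-partition results) can be illustrated without Spark. The following is a plain-Java sketch under stated assumptions, not Spark code: the class AggregateSketch, its aggregate helper, and the list-of-lists stand-in for partitions are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical, Spark-free model of the aggregate contract.
final class AggregateSketch {

    // Each inner list plays the role of one partition (an assumption of this sketch).
    static <T, U> U aggregate(List<List<T>> partitions, U zero,
                              BiFunction<U, T, U> seqOp, BinaryOperator<U> combOp) {
        U result = zero; // the final merge also starts from the zero value
        for (List<T> partition : partitions) {
            U partial = zero; // every partition starts from the zero value
            for (T element : partition) {
                partial = seqOp.apply(partial, element); // fold within the partition
            }
            result = combOp.apply(result, partial); // merge across partitions
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5));
        Integer sum = aggregate(partitions, 0, (acc, x) -> acc + x, (a, b) -> a + b);
        System.out.println(sum); // prints 15
    }
}
```

Because the zero value enters once per partition and once more in the final merge, a zero value that is not neutral for the combine function would be counted several times in this sketch, which is why the description asks for a neutral one.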
aggregate(U, Function2<U, T, U>, Function2<U, U, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.UnionRDD
 
aggregate(Function0<B>, Function2<B, A, B>, Function2<B, B, B>) - Static method in class org.apache.spark.sql.types.StructType
 
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>) - Method in class org.apache.spark.api.java.JavaPairRDD
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Partitioner, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, int, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
aggregateByKey(U, Function2<U, V, U>, Function2<U, U, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Aggregate the values of each key, using given combine functions and a neutral "zero value".
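The per-key variant described above can be sketched the same way. This is a plain-Java illustration rather than Spark code, and it assumes that the zero value seeds each key once per partition and that per-partition results for the same key are then merged with the combine function. The names AggregateByKeySketch, aggregateByKey, and pair are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

// Hypothetical, Spark-free model of aggregating values per key.
final class AggregateByKeySketch {

    // Convenience constructor for a key/value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleEntry<>(key, value);
    }

    // Each inner list plays the role of one partition of key/value pairs.
    static <K, V, U> Map<K, U> aggregateByKey(List<List<Map.Entry<K, V>>> partitions, U zero,
                                              BiFunction<U, V, U> seqOp,
                                              BinaryOperator<U> combOp) {
        Map<K, U> merged = new LinkedHashMap<>();
        for (List<Map.Entry<K, V>> partition : partitions) {
            Map<K, U> partial = new LinkedHashMap<>();
            for (Map.Entry<K, V> kv : partition) {
                // The first value seen for a key in this partition is combined with zero.
                U current = partial.containsKey(kv.getKey()) ? partial.get(kv.getKey()) : zero;
                partial.put(kv.getKey(), seqOp.apply(current, kv.getValue()));
            }
            for (Map.Entry<K, U> e : partial.entrySet()) {
                merged.merge(e.getKey(), e.getValue(), combOp); // merge across partitions
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> p1 = Arrays.asList(pair("a", 1), pair("b", 5), pair("a", 3));
        List<Map.Entry<String, Integer>> p2 = Arrays.asList(pair("a", 2), pair("b", 4));
        Map<String, Integer> maxPerKey = aggregateByKey(
                Arrays.asList(p1, p2), Integer.MIN_VALUE,
                (acc, v) -> Math.max(acc, v), (x, y) -> Math.max(x, y));
        System.out.println(maxPerKey); // prints {a=3, b=5}
    }
}
```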
AggregatedDialect - Class in org.apache.spark.sql.jdbc
AggregatedDialect can unify multiple dialects into one virtual Dialect.
AggregatedDialect(List<JdbcDialect>) - Constructor for class org.apache.spark.sql.jdbc.AggregatedDialect
 
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Method in class org.apache.spark.graphx.Graph
Aggregates values from the neighboring edges and vertices of each vertex.
aggregateMessages(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, ClassTag<A>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateMessages$default$3() - Static method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateMessagesWithActiveSet(Function1<EdgeContext<VD, ED, A>, BoxedUnit>, Function2<A, A, A>, TripletFields, Option<Tuple2<VertexRDD<?>, EdgeDirection>>, ClassTag<A>) - Method in class org.apache.spark.graphx.impl.GraphImpl
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
aggregateUsingIndex(RDD<Tuple2<Object, VD2>>, Function2<VD2, VD2, VD2>, ClassTag<VD2>) - Method in class org.apache.spark.graphx.VertexRDD
Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this.
AggregatingEdgeContext<VD,ED,A> - Class in org.apache.spark.graphx.impl
 
AggregatingEdgeContext(Function2<A, A, A>, Object, BitSet) - Constructor for class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
aggregationDepth() - Static method in class org.apache.spark.ml.classification.LinearSVC
 
aggregationDepth() - Static method in class org.apache.spark.ml.classification.LinearSVCModel
 
aggregationDepth() - Static method in class org.apache.spark.ml.classification.LogisticRegression
 
aggregationDepth() - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
aggregationDepth() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
 
aggregationDepth() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
 
aggregationDepth() - Static method in class org.apache.spark.ml.regression.LinearRegression
 
aggregationDepth() - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
 
Aggregator<K,V,C> - Class in org.apache.spark
:: DeveloperApi :: A set of functions used to aggregate data.
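The constructor listed next takes three functions with the shapes V to C, (C, V) to C, and (C, C) to C. As a rough illustration only, the plain-Java sketch below (no Spark dependency; the class CombinerSketch and its method names are hypothetical) shows how functions of those three shapes could cooperate: the first two fold the values of one partition into a combiner, and the third merges combiners across partitions. It is not Spark's implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Hypothetical illustration; not part of Spark.
final class CombinerSketch {
    // Folds the values of one partition into a single combiner of type C.
    // Returns null when the partition is empty.
    static <V, C> C combinePartition(List<V> values,
                                     Function<V, C> createCombiner,
                                     BiFunction<C, V, C> mergeValue) {
        C acc = null;
        boolean first = true;
        for (V v : values) {
            if (first) {
                acc = createCombiner.apply(v);   // V -> C
                first = false;
            } else {
                acc = mergeValue.apply(acc, v);  // (C, V) -> C
            }
        }
        return acc;
    }

    // Sums integers spread over several partitions, using Long as the combiner type.
    static long sumAcrossPartitions(List<List<Integer>> partitions) {
        Function<Integer, Long> createCombiner = v -> v.longValue();
        BiFunction<Long, Integer, Long> mergeValue = (c, v) -> c + v;
        BinaryOperator<Long> mergeCombiners = (a, b) -> a + b;  // (C, C) -> C

        Long total = null;
        for (List<Integer> partition : partitions) {
            Long partial = combinePartition(partition, createCombiner, mergeValue);
            if (partial == null) {
                continue;  // empty partition contributes nothing
            }
            total = (total == null) ? partial : mergeCombiners.apply(total, partial);
        }
        return total == null ? 0L : total;
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions =
            Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3));
        System.out.println(sumAcrossPartitions(partitions));
    }
}
```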
Aggregator(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Constructor for class org.apache.spark.Aggregator
 
aggregator() - Method in class org.apache.spark.ShuffleDependency
 
Aggregator<IN,BUF,OUT> - Class in org.apache.spark.sql.expressions
:: Experimental :: A base class for user-defined aggregations, which can be used in Dataset operations to take all of the elements of a group and reduce them to a single value.
Aggregator() - Constructor for class org.apache.spark.sql.expressions.Aggregator
 
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
 
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gamma$
 
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Gaussian$
 
aic(RDD<Tuple3<Object, Object, Object>>, double, double, double) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.Poisson$
 
aic() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionSummary
Akaike Information Criterion (AIC) for the fitted model.
Algo - Class in org.apache.spark.mllib.tree.configuration
Enum to select the algorithm for the decision tree
Algo() - Constructor for class org.apache.spark.mllib.tree.configuration.Algo
 
algo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
algo() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
 
algo() - Method in class org.apache.spark.mllib.tree.model.RandomForestModel
 
algorithm() - Method in class org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD
 
alias(String) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
alias(String) - Method in class org.apache.spark.sql.Dataset
Returns a new Dataset with an alias set.
alias(Symbol) - Method in class org.apache.spark.sql.Dataset
(Scala-specific) Returns a new Dataset with an alias set.
All - Static variable in class org.apache.spark.graphx.TripletFields
Expose all the fields (source, edge, and destination).
allAttributes() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
allAttributes() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
allAttributes() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
AllJobsCancelled - Class in org.apache.spark.scheduler
 
AllJobsCancelled() - Constructor for class org.apache.spark.scheduler.AllJobsCancelled
 
AllReceiverIds - Class in org.apache.spark.streaming.scheduler
A message used by ReceiverTracker to ask for the ids of all receivers still stored in ReceiverTrackerEndpoint.
AllReceiverIds() - Constructor for class org.apache.spark.streaming.scheduler.AllReceiverIds
 
allSources() - Static method in class org.apache.spark.metrics.source.StaticSources
The set of all static sources.
alpha() - Static method in class org.apache.spark.ml.recommendation.ALS
 
alpha() - Method in class org.apache.spark.mllib.random.WeibullGenerator
 
ALS - Class in org.apache.spark.ml.recommendation
Alternating Least Squares (ALS) matrix factorization.
ALS(String) - Constructor for class org.apache.spark.ml.recommendation.ALS
 
ALS() - Constructor for class org.apache.spark.ml.recommendation.ALS
 
ALS - Class in org.apache.spark.mllib.recommendation
Alternating Least Squares matrix factorization.
ALS() - Constructor for class org.apache.spark.mllib.recommendation.ALS
Constructs an ALS instance with default parameters: {numBlocks: -1, rank: 10, iterations: 10, lambda: 0.01, implicitPrefs: false, alpha: 1.0}.
ALS.InBlock$ - Class in org.apache.spark.ml.recommendation
 
ALS.Rating<ID> - Class in org.apache.spark.ml.recommendation
:: DeveloperApi :: Rating class for better code readability.
ALS.Rating$ - Class in org.apache.spark.ml.recommendation
 
ALS.RatingBlock$ - Class in org.apache.spark.ml.recommendation
 
ALSModel - Class in org.apache.spark.ml.recommendation
Model fitted by ALS.
am() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterClusterManager
 
AnalysisException - Exception in org.apache.spark.sql
Thrown when a query fails to analyze, usually because the query itself is invalid.
analyzed() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
analyzed() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
and(Column) - Method in class org.apache.spark.sql.Column
Boolean AND.
And - Class in org.apache.spark.sql.sources
A filter that evaluates to true iff both left and right evaluate to true.
And(Filter, Filter) - Constructor for class org.apache.spark.sql.sources.And
 
andThen(Function1<B, C>) - Static method in class org.apache.spark.sql.types.StructType
 
antecedent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
 
ANY() - Static method in class org.apache.spark.scheduler.TaskLocality
 
AnyDataType - Class in org.apache.spark.sql.types
An AbstractDataType that matches any concrete data types.
AnyDataType() - Constructor for class org.apache.spark.sql.types.AnyDataType
 
anyNull() - Method in interface org.apache.spark.sql.Row
Returns true if there are any NULL values in this row.
appAttemptId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
Append() - Static method in class org.apache.spark.sql.streaming.OutputMode
OutputMode in which only the new rows in the streaming DataFrame/Dataset will be written to the sink.
appendBias(Vector) - Static method in class org.apache.spark.mllib.util.MLUtils
Returns a new vector with 1.0 (bias) appended to the input vector.
appendColumn(StructType, String, DataType, boolean) - Static method in class org.apache.spark.ml.util.SchemaUtils
Appends a new column to the input schema.
appendColumn(StructType, StructField) - Static method in class org.apache.spark.ml.util.SchemaUtils
Appends a new column to the input schema.
appendReadColumns(Configuration, Seq<Integer>, Seq<String>) - Static method in class org.apache.spark.sql.hive.HiveShim
 
appHistoryInfoToPublicAppInfo(ApplicationHistoryInfo) - Static method in class org.apache.spark.status.api.v1.ApplicationsListResource
 
appId() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
APPLICATION_EXECUTOR_LIMIT() - Static method in class org.apache.spark.ui.ToolTips
 
applicationAttemptId() - Method in class org.apache.spark.SparkContext
 
ApplicationAttemptInfo - Class in org.apache.spark.status.api.v1
 
applicationEndFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationEndToJson(SparkListenerApplicationEnd) - Static method in class org.apache.spark.util.JsonProtocol
 
ApplicationEnvironmentInfo - Class in org.apache.spark.status.api.v1
 
applicationId() - Method in class org.apache.spark.SparkContext
A unique identifier for the Spark application.
ApplicationInfo - Class in org.apache.spark.status.api.v1
 
ApplicationsListResource - Class in org.apache.spark.status.api.v1
 
ApplicationsListResource() - Constructor for class org.apache.spark.status.api.v1.ApplicationsListResource
 
applicationStartFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
applicationStartToJson(SparkListenerApplicationStart) - Static method in class org.apache.spark.util.JsonProtocol
 
ApplicationStatus - Enum in org.apache.spark.status.api.v1
 
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.Graph
Construct a graph from a collection of vertices and edges with attributes.
apply(RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from edges, setting referenced vertices to defaultVertexAttr.
apply(RDD<Tuple2<Object, VD>>, RDD<Edge<ED>>, VD, StorageLevel, StorageLevel, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from vertices and edges, setting missing vertices to defaultVertexAttr.
apply(VertexRDD<VD>, EdgeRDD<ED>, ClassTag<VD>, ClassTag<ED>) - Static method in class org.apache.spark.graphx.impl.GraphImpl
Create a graph from a VertexRDD and an EdgeRDD with arbitrary replicated vertices.
apply(Graph<VD, ED>, A, int, EdgeDirection, Function3<Object, VD, A, VD>, Function1<EdgeTriplet<VD, ED>, Iterator<Tuple2<Object, A>>>, Function2<A, A, A>, ClassTag<VD>, ClassTag<ED>, ClassTag<A>) - Static method in class org.apache.spark.graphx.Pregel
Execute a Pregel-like iterative vertex-parallel abstraction.
apply(RDD<Tuple2<Object, VD>>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(RDD<Tuple2<Object, VD>>, EdgeRDD<?>, VD, Function2<VD, VD, VD>, ClassTag<VD>) - Static method in class org.apache.spark.graphx.VertexRDD
Constructs a VertexRDD from an RDD of vertex-attribute pairs.
apply(DenseMatrix<Object>, DenseMatrix<Object>, Function1<Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
 
apply(DenseMatrix<Object>, DenseMatrix<Object>, DenseMatrix<Object>, Function2<Object, Object, Object>) - Static method in class org.apache.spark.ml.ann.ApplyInPlace
 
apply(String) - Method in class org.apache.spark.ml.attribute.AttributeGroup
Gets an attribute by its name.
apply(int) - Method in class org.apache.spark.ml.attribute.AttributeGroup
Gets an attribute by its index.
apply(int, int) - Method in class org.apache.spark.ml.linalg.DenseMatrix
 
apply(int) - Method in class org.apache.spark.ml.linalg.DenseVector
 
apply(int, int) - Method in interface org.apache.spark.ml.linalg.Matrix
Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.ml.linalg.SparseMatrix
 
apply(int) - Static method in class org.apache.spark.ml.linalg.SparseVector
 
apply(int) - Method in interface org.apache.spark.ml.linalg.Vector
Gets the value of the ith element.
apply(Param<T>) - Method in class org.apache.spark.ml.param.ParamMap
Gets the value of the input param or its default value if it does not exist.
apply(GeneralizedLinearRegressionBase) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression.FamilyAndLink$
Constructs the FamilyAndLink object from a parameter map
apply(Split) - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.SplitData$
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.FalsePositiveRate
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Precision
 
apply(BinaryConfusionMatrix) - Static method in class org.apache.spark.mllib.evaluation.binary.Recall
 
apply(int, int) - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
apply(int) - Method in class org.apache.spark.mllib.linalg.DenseVector
 
apply(int, int) - Method in interface org.apache.spark.mllib.linalg.Matrix
Gets the (i, j)-th element.
apply(int, int) - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
apply(int) - Static method in class org.apache.spark.mllib.linalg.SparseVector
 
apply(int) - Method in interface org.apache.spark.mllib.linalg.Vector
Gets the value of the ith element.
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
apply(int) - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
apply(int, Predict, double, boolean) - Static method in class org.apache.spark.mllib.tree.model.Node
Construct a node with nodeIndex, predict, impurity and isLeaf parameters.
apply(int) - Static method in class org.apache.spark.rdd.CheckpointState
 
apply(long, String, Option<String>, String, boolean) - Static method in class org.apache.spark.scheduler.AccumulableInfo
Deprecated.
Do not create AccumulableInfo. Since 2.0.0.
apply(long, String, Option<String>, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
Deprecated.
Do not create AccumulableInfo. Since 2.0.0.
apply(long, String, String) - Static method in class org.apache.spark.scheduler.AccumulableInfo
Deprecated.
Do not create AccumulableInfo. Since 2.0.0.
apply(String, long, Enumeration.Value, ByteBuffer) - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.StatusUpdate$
Alternate factory method that takes a ByteBuffer directly for the data field
apply(long, TaskMetrics) - Static method in class org.apache.spark.scheduler.RuntimePercentage
 
apply(int) - Static method in class org.apache.spark.scheduler.SchedulingMode
 
apply(int) - Static method in class org.apache.spark.scheduler.TaskLocality
 
apply(Object) - Method in class org.apache.spark.sql.Column
Extracts a value or values from a complex type.
apply(String) - Method in class org.apache.spark.sql.Dataset
Selects a column based on the column name and returns it as a Column.
apply(Column...) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
Creates a Column for this UDAF using given Columns as input arguments.
apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
Creates a Column for this UDAF using given Columns as input arguments.
apply(Seq<Column>) - Method in class org.apache.spark.sql.expressions.UserDefinedFunction
Returns an expression that invokes the UDF, using the given arguments.
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.DetermineTableStats
 
apply(int) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
apply(ScriptInputOutputSchema) - Static method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
apply(int) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
apply(int) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
apply(LogicalPlan) - Static method in class org.apache.spark.sql.hive.HiveAnalysis
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.RelationConversions
 
apply(LogicalPlan) - Method in class org.apache.spark.sql.hive.ResolveHiveSerdeTable
 
apply(Dataset<Row>, Seq<Expression>, RelationalGroupedDataset.GroupType) - Static method in class org.apache.spark.sql.RelationalGroupedDataset
 
apply(int) - Method in interface org.apache.spark.sql.Row
Returns the value at position i.
apply(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
Deprecated.
Use Trigger.ProcessingTime(interval).
apply(Duration) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
Deprecated.
Use Trigger.ProcessingTime(interval).
apply(DataType) - Static method in class org.apache.spark.sql.types.ArrayType
Construct an ArrayType object with the given element type.
apply(double) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(long) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigInteger) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigInt) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(BigDecimal, int, int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(String) - Static method in class org.apache.spark.sql.types.Decimal
 
apply(DataType, DataType) - Static method in class org.apache.spark.sql.types.MapType
Construct a MapType object with the given key type and value type.
apply(String) - Method in class org.apache.spark.sql.types.StructType
Extracts the StructField with the given name.
apply(Set<String>) - Method in class org.apache.spark.sql.types.StructType
Returns a StructType containing StructFields of the given names, preserving the original order of fields.
apply(int) - Method in class org.apache.spark.sql.types.StructType
 
apply(String) - Static method in class org.apache.spark.storage.BlockId
 
apply(String, String, int, Option<String>) - Static method in class org.apache.spark.storage.BlockManagerId
Returns a BlockManagerId for the given configuration.
apply(ObjectInput) - Static method in class org.apache.spark.storage.BlockManagerId
 
apply(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object.
apply(boolean, boolean, boolean, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object without setting useOffHeap.
apply(int, int) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Create a new StorageLevel object from its integer representation.
apply(ObjectInput) - Static method in class org.apache.spark.storage.StorageLevel
:: DeveloperApi :: Read StorageLevel object from ObjectInput stream.
apply(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
apply(Map<String, String>) - Method in class org.apache.spark.streaming.kafka.KafkaCluster.SimpleConsumerConfig$
Make a consumer config without requiring group.id or zookeeper.connect, since communicating with brokers also needs common settings such as timeout
apply(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
apply(long) - Static method in class org.apache.spark.streaming.Milliseconds
 
apply(long) - Static method in class org.apache.spark.streaming.Minutes
 
apply(int) - Static method in class org.apache.spark.streaming.scheduler.ReceiverState
 
apply(long) - Static method in class org.apache.spark.streaming.Seconds
 
apply(int) - Static method in class org.apache.spark.TaskState
 
apply(InputMetrics) - Method in class org.apache.spark.ui.jobs.UIData.InputMetricsUIData$
 
apply(OutputMetrics) - Method in class org.apache.spark.ui.jobs.UIData.OutputMetricsUIData$
 
apply(ShuffleReadMetrics) - Method in class org.apache.spark.ui.jobs.UIData.ShuffleReadMetricsUIData$
 
apply(ShuffleWriteMetrics) - Method in class org.apache.spark.ui.jobs.UIData.ShuffleWriteMetricsUIData$
 
apply(TaskInfo) - Method in class org.apache.spark.ui.jobs.UIData.TaskUIData$
 
apply(TraversableOnce<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values.
apply(Seq<Object>) - Static method in class org.apache.spark.util.StatCounter
Build a StatCounter from a list of values passed as variable-length arguments.
ApplyInPlace - Class in org.apache.spark.ml.ann
Implements in-place application of functions in the arrays
ApplyInPlace() - Constructor for class org.apache.spark.ml.ann.ApplyInPlace
 
applyOrElse(A1, Function1<A1, B1>) - Static method in class org.apache.spark.sql.types.StructType
 
applySchema(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
Use createDataFrame instead. Since 1.3.0.
applySchema(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
Use createDataFrame instead. Since 1.3.0.
applySchema(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
Use createDataFrame instead. Since 1.3.0.
applySchema(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
Use createDataFrame instead. Since 1.3.0.
appName() - Method in class org.apache.spark.api.java.JavaSparkContext
 
appName() - Method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
appName() - Method in class org.apache.spark.SparkContext
 
appName(String) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a name for the application, which will be shown in the Spark web UI.
approx_count_distinct(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(Column, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approx_count_distinct(String, double) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the approximate number of distinct items in a group.
approxCountDistinct(Column) - Static method in class org.apache.spark.sql.functions
Deprecated.
Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(String) - Static method in class org.apache.spark.sql.functions
Deprecated.
Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(Column, double) - Static method in class org.apache.spark.sql.functions
Deprecated.
Use approx_count_distinct. Since 2.1.0.
approxCountDistinct(String, double) - Static method in class org.apache.spark.sql.functions
Deprecated.
Use approx_count_distinct. Since 2.1.0.
ApproxHist() - Static method in class org.apache.spark.mllib.tree.configuration.QuantileStrategy
 
approxNearestNeighbors(Dataset<?>, Vector, int, String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
approxNearestNeighbors(Dataset<?>, Vector, int) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
approxNearestNeighbors(Dataset<?>, Vector, int, String) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
 
approxNearestNeighbors(Dataset<?>, Vector, int) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
 
approxQuantile(String, double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Calculates the approximate quantiles of a numerical column of a DataFrame.
approxQuantile(String[], double[], double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Calculates the approximate quantiles of numerical columns of a DataFrame.
approxSimilarityJoin(Dataset<?>, Dataset<?>, double, String) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
approxSimilarityJoin(Dataset<?>, Dataset<?>, double) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
approxSimilarityJoin(Dataset<?>, Dataset<?>, double, String) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
 
approxSimilarityJoin(Dataset<?>, Dataset<?>, double) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
 
AreaUnderCurve - Class in org.apache.spark.mllib.evaluation
Computes the area under the curve (AUC) using the trapezoidal rule.
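For readers unfamiliar with the trapezoidal rule, a minimal plain-Java sketch follows. It sums the areas of the trapezoids formed by consecutive curve points. The class TrapezoidSketch and its method are hypothetical and written for illustration; they assume the x values are already sorted and do not reflect how AreaUnderCurve is implemented.

```java
// Hypothetical illustration; not part of Spark.
final class TrapezoidSketch {
    // Area under the polyline through the points (x[i], y[i]).
    // Assumes x is sorted in nondecreasing order.
    static double areaUnderCurve(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        double area = 0.0;
        for (int i = 1; i < x.length; i++) {
            // Trapezoid between point i-1 and point i: width times mean height.
            area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0;
        }
        return area;
    }

    public static void main(String[] args) {
        // Diagonal from (0, 0) to (1, 1), as for a ROC curve of a random classifier.
        System.out.println(areaUnderCurve(new double[] {0.0, 1.0}, new double[] {0.0, 1.0}));
    }
}
```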
AreaUnderCurve() - Constructor for class org.apache.spark.mllib.evaluation.AreaUnderCurve
 
areaUnderPR() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the precision-recall curve.
areaUnderROC() - Method in class org.apache.spark.ml.classification.BinaryLogisticRegressionSummary
Computes the area under the receiver operating characteristic (ROC) curve.
areaUnderROC() - Method in class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Computes the area under the receiver operating characteristic (ROC) curve.
argmax() - Method in class org.apache.spark.ml.linalg.DenseVector
 
argmax() - Method in class org.apache.spark.ml.linalg.SparseVector
 
argmax() - Method in interface org.apache.spark.ml.linalg.Vector
Find the index of a maximal element.
argmax() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
argmax() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
argmax() - Method in interface org.apache.spark.mllib.linalg.Vector
Find the index of a maximal element.
argString() - Method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
argString() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
argString() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
array(DataType) - Method in class org.apache.spark.sql.ColumnName
Creates a new StructField of type array.
array(Column...) - Static method in class org.apache.spark.sql.functions
Creates a new array column.
array(String, String...) - Static method in class org.apache.spark.sql.functions
Creates a new array column.
array(Seq<Column>) - Static method in class org.apache.spark.sql.functions
Creates a new array column.
array(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
Creates a new array column.
array_contains(Column, Object) - Static method in class org.apache.spark.sql.functions
Returns null if the array is null, true if the array contains value, and false otherwise.
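The three possible results described above (null, true, false) can be mirrored in plain Java with a nullable Boolean. This is only a sketch of that contract: the class ArrayContainsSketch is hypothetical, and it does not model how Spark treats null elements or a null search value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of Spark.
final class ArrayContainsSketch {
    // Returns null for a null array; otherwise true or false depending on membership.
    static Boolean arrayContains(List<?> array, Object value) {
        if (array == null) {
            return null;
        }
        return array.contains(value);
    }

    public static void main(String[] args) {
        System.out.println(arrayContains(null, 1));
        System.out.println(arrayContains(Arrays.asList(1, 2), 2));
        System.out.println(arrayContains(Arrays.asList(1, 2), 3));
    }
}
```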
arrayLengthGt(double) - Static method in class org.apache.spark.ml.param.ParamValidators
Check that the array length is greater than lowerBound.
ArrayType - Class in org.apache.spark.sql.types
 
ArrayType(DataType, boolean) - Constructor for class org.apache.spark.sql.types.ArrayType
 
as(Encoder<U>) - Method in class org.apache.spark.sql.Column
Provides a type hint about the expected return value of this column.
as(String) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(Seq<String>) - Method in class org.apache.spark.sql.Column
(Scala-specific) Assigns the given aliases to the results of a table generating function.
as(String[]) - Method in class org.apache.spark.sql.Column
Assigns the given aliases to the results of a table generating function.
as(Symbol) - Method in class org.apache.spark.sql.Column
Gives the column an alias.
as(String, Metadata) - Method in class org.apache.spark.sql.Column
Gives the column an alias with metadata.
as(Encoder<U>) - Method in class org.apache.spark.sql.Dataset
:: Experimental :: Returns a new Dataset where each record has been mapped on to the specified type.
as(String) - Method in class org.apache.spark.sql.Dataset
Returns a new Dataset with an alias set.
as(Symbol) - Method in class org.apache.spark.sql.Dataset
(Scala-specific) Returns a new Dataset with an alias set.
asBreeze() - Method in interface org.apache.spark.ml.linalg.Matrix
Converts to a breeze matrix.
asBreeze() - Method in interface org.apache.spark.ml.linalg.Vector
Converts the instance to a breeze vector.
asBreeze() - Method in interface org.apache.spark.mllib.linalg.Matrix
Converts to a breeze matrix.
asBreeze() - Method in interface org.apache.spark.mllib.linalg.Vector
Converts the instance to a breeze vector.
asc() - Method in class org.apache.spark.sql.Column
Returns an ascending ordering used in sorting.
asc(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on ascending order of the column.
asc_nulls_first() - Method in class org.apache.spark.sql.Column
Returns an ascending ordering used in sorting, where null values appear before non-null values.
asc_nulls_first(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on ascending order of the column, and null values appear before non-null values.
asc_nulls_last() - Method in class org.apache.spark.sql.Column
Returns an ascending ordering used in sorting, where null values appear after non-null values.
asc_nulls_last(String) - Static method in class org.apache.spark.sql.functions
Returns a sort expression based on ascending order of the column, and null values appear after non-null values.
ascii(Column) - Static method in class org.apache.spark.sql.functions
Computes the numeric value of the first character of the string column, and returns the result as an int column.
asCode() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
asCode() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
asCode() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
asin(Column) - Static method in class org.apache.spark.sql.functions
Computes the sine inverse of the given value; the returned angle is in the range -pi/2 through pi/2.
asin(String) - Static method in class org.apache.spark.sql.functions
Computes the sine inverse of the given column; the returned angle is in the range -pi/2 through pi/2.
asIterator() - Method in class org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator.
asJavaPairRDD() - Method in class org.apache.spark.api.r.PairwiseRRDD
 
asJavaRDD() - Method in class org.apache.spark.api.r.RRDD
 
asJavaRDD() - Method in class org.apache.spark.api.r.StringRRDD
 
asKeyValueIterator() - Method in class org.apache.spark.serializer.DeserializationStream
Read the elements of this stream through an iterator over key-value pairs.
AskPermissionToCommitOutput - Class in org.apache.spark.scheduler
 
AskPermissionToCommitOutput(int, int, int) - Constructor for class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
askRpcTimeout(SparkConf) - Static method in class org.apache.spark.util.RpcUtils
Returns the default Spark timeout to use for RPC ask operations.
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
askSlaves() - Method in class org.apache.spark.storage.BlockManagerMessages.GetMatchingBlockIds
 
asML() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
asML() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
asML() - Method in interface org.apache.spark.mllib.linalg.Matrix
Convert this matrix to the new mllib-local representation.
asML() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
asML() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
asML() - Method in interface org.apache.spark.mllib.linalg.Vector
Convert this vector to the new mllib-local representation.
asNullable() - Method in class org.apache.spark.sql.types.ObjectType
 
asRDDId() - Method in class org.apache.spark.storage.BlockId
 
asRDDId() - Static method in class org.apache.spark.storage.BroadcastBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.RDDBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.ShuffleBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.ShuffleDataBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.StreamBlockId
 
asRDDId() - Static method in class org.apache.spark.storage.TaskResultBlockId
 
assertNotSpilled(SparkContext, String, Function0<T>) - Static method in class org.apache.spark.TestUtils
Run some code involving jobs submitted to the given context and assert that the jobs did not spill.
assertSpilled(SparkContext, String, Function0<T>) - Static method in class org.apache.spark.TestUtils
Run some code involving jobs submitted to the given context and assert that the jobs spilled.
Assignment(long, int) - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
Assignment$() - Constructor for class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment$
 
assignments() - Method in class org.apache.spark.mllib.clustering.PowerIterationClusteringModel
 
AssociationRules - Class in org.apache.spark.ml.fpm
 
AssociationRules() - Constructor for class org.apache.spark.ml.fpm.AssociationRules
 
associationRules() - Method in class org.apache.spark.ml.fpm.FPGrowthModel
Get association rules fitted using the minConfidence.
AssociationRules - Class in org.apache.spark.mllib.fpm
Generates association rules from an RDD[FreqItemset[Item]].
AssociationRules() - Constructor for class org.apache.spark.mllib.fpm.AssociationRules
Constructs a default instance with default parameters {minConfidence = 0.8}.
AssociationRules.Rule<Item> - Class in org.apache.spark.mllib.fpm
An association rule between sets of items.
AsyncRDDActions<T> - Class in org.apache.spark.rdd
A set of asynchronous RDD actions available through an implicit conversion.
AsyncRDDActions(RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.rdd.AsyncRDDActions
 
atan(Column) - Static method in class org.apache.spark.sql.functions
Computes the tangent inverse of the given value.
atan(String) - Static method in class org.apache.spark.sql.functions
Computes the tangent inverse of the given column.
atan2(Column, Column) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(Column, String) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, Column) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, String) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(Column, double) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(String, double) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(double, Column) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
atan2(double, String) - Static method in class org.apache.spark.sql.functions
Returns the angle theta from the conversion of rectangular coordinates (x, y) to polar coordinates (r, theta).
attempt() - Method in class org.apache.spark.status.api.v1.TaskData
 
attemptId() - Method in class org.apache.spark.scheduler.StageInfo
 
attemptId() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
 
attemptId() - Method in class org.apache.spark.status.api.v1.StageData
 
attemptNumber() - Method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
attemptNumber() - Method in class org.apache.spark.scheduler.TaskInfo
 
attemptNumber() - Method in class org.apache.spark.TaskCommitDenied
 
attemptNumber() - Method in class org.apache.spark.TaskContext
How many times this task has been attempted.
attempts() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
 
attr() - Method in class org.apache.spark.graphx.Edge
 
attr() - Method in class org.apache.spark.graphx.EdgeContext
The attribute associated with the edge.
attr() - Method in class org.apache.spark.graphx.impl.AggregatingEdgeContext
 
Attribute - Class in org.apache.spark.ml.attribute
:: DeveloperApi :: Abstract class for ML attributes.
Attribute() - Constructor for class org.apache.spark.ml.attribute.Attribute
 
attribute() - Method in class org.apache.spark.sql.sources.EqualNullSafe
 
attribute() - Method in class org.apache.spark.sql.sources.EqualTo
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThan
 
attribute() - Method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.In
 
attribute() - Method in class org.apache.spark.sql.sources.IsNotNull
 
attribute() - Method in class org.apache.spark.sql.sources.IsNull
 
attribute() - Method in class org.apache.spark.sql.sources.LessThan
 
attribute() - Method in class org.apache.spark.sql.sources.LessThanOrEqual
 
attribute() - Method in class org.apache.spark.sql.sources.StringContains
 
attribute() - Method in class org.apache.spark.sql.sources.StringEndsWith
 
attribute() - Method in class org.apache.spark.sql.sources.StringStartsWith
 
AttributeGroup - Class in org.apache.spark.ml.attribute
:: DeveloperApi :: Attributes that describe a vector ML column.
AttributeGroup(String) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
Creates an attribute group without attribute info.
AttributeGroup(String, int) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
Creates an attribute group knowing only the number of attributes.
AttributeGroup(String, Attribute[]) - Constructor for class org.apache.spark.ml.attribute.AttributeGroup
Creates an attribute group with attributes.
AttributeKeys - Class in org.apache.spark.ml.attribute
Keys used to store attributes.
AttributeKeys() - Constructor for class org.apache.spark.ml.attribute.AttributeKeys
 
attributes() - Method in class org.apache.spark.ml.attribute.AttributeGroup
Optional array of attributes.
ATTRIBUTES() - Static method in class org.apache.spark.ml.attribute.AttributeKeys
 
AttributeType - Class in org.apache.spark.ml.attribute
:: DeveloperApi :: An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal, and AttributeType$.Binary.
AttributeType(String) - Constructor for class org.apache.spark.ml.attribute.AttributeType
 
attrType() - Method in class org.apache.spark.ml.attribute.Attribute
Attribute type.
attrType() - Method in class org.apache.spark.ml.attribute.BinaryAttribute
 
attrType() - Method in class org.apache.spark.ml.attribute.NominalAttribute
 
attrType() - Method in class org.apache.spark.ml.attribute.NumericAttribute
 
attrType() - Static method in class org.apache.spark.ml.attribute.UnresolvedAttribute
 
available() - Method in class org.apache.spark.io.LZ4BlockInputStream
 
available() - Method in class org.apache.spark.io.NioBufferedFileInputStream
 
available() - Method in class org.apache.spark.storage.BufferReleasingInputStream
 
Average() - Static method in class org.apache.spark.mllib.tree.configuration.EnsembleCombiningStrategy
 
avg(MapFunction<T, Double>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
Average aggregate function.
avg(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
Average aggregate function.
avg(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the average of the values in a group.
avg(String...) - Method in class org.apache.spark.sql.RelationalGroupedDataset
Compute the mean value of each numeric column for each group.
avg(Seq<String>) - Method in class org.apache.spark.sql.RelationalGroupedDataset
Compute the mean value of each numeric column for each group.
avg() - Method in class org.apache.spark.util.DoubleAccumulator
Returns the average of elements added to the accumulator.
avg() - Method in class org.apache.spark.util.LongAccumulator
Returns the average of elements added to the accumulator.
avgEventRate() - Method in class org.apache.spark.status.api.v1.streaming.ReceiverInfo
 
avgInputRate() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
 
avgMetrics() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
avgProcessingTime() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
 
avgSchedulingDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
 
avgTotalDelay() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
 
awaitAnyTermination() - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
Wait until any of the queries on the associated SQLContext has terminated since the creation of the context, or since resetTerminated() was called.
awaitAnyTermination(long) - Method in class org.apache.spark.sql.streaming.StreamingQueryManager
Wait until any of the queries on the associated SQLContext has terminated since the creation of the context, or since resetTerminated() was called.
awaitReady(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils
Preferred alternative to Await.ready().
awaitResult(Awaitable<T>, Duration) - Static method in class org.apache.spark.util.ThreadUtils
Preferred alternative to Await.result().
awaitTermination() - Method in interface org.apache.spark.sql.streaming.StreamingQuery
Waits for the termination of this query, either by query.stop() or by an exception.
awaitTermination(long) - Method in interface org.apache.spark.sql.streaming.StreamingQuery
Waits for the termination of this query, either by query.stop() or by an exception.
awaitTermination() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTermination() - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Wait for the execution to stop.
awaitTerminationOrTimeout(long) - Method in class org.apache.spark.streaming.StreamingContext
Wait for the execution to stop.
axpy(double, Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
y += a * x
axpy(double, Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y += a * x
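
The update "y += a * x" listed for axpy can be illustrated with a minimal sketch on plain double arrays. This is not the Spark implementation: the class name AxpySketch and its array-based signature are assumptions made for illustration only, whereas the indexed methods operate on Vector arguments.

```java
import java.util.Arrays;

public class AxpySketch {
    // Hypothetical helper (not part of Spark): updates y in place so that
    // y[i] becomes y[i] + a * x[i] for every index i.
    static void axpy(double a, double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        for (int i = 0; i < x.length; i++) {
            y[i] += a * x[i];
        }
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {10.0, 20.0, 30.0};
        axpy(2.0, x, y);
        // x is left unchanged; only y is modified.
        System.out.println(Arrays.toString(y)); // prints [12.0, 24.0, 36.0]
    }
}
```

Note that the operation mutates its last argument rather than returning a new value, which mirrors the "y += ..." form of the description.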

B

BACKUP_STANDALONE_MASTER_PREFIX() - Static method in class org.apache.spark.util.Utils
An identifier that backup masters use in their responses.
balanceSlack() - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
 
base64(Column) - Static method in class org.apache.spark.sql.functions
Computes the BASE64 encoding of a binary column and returns it as a string column.
baseOn(ParamPair<?>...) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(ParamMap) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
baseOn(Seq<ParamPair<?>>) - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Sets the given parameters in this grid to fixed values.
BaseRelation - Class in org.apache.spark.sql.sources
Represents a collection of tuples with a known schema.
BaseRelation() - Constructor for class org.apache.spark.sql.sources.BaseRelation
 
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SparkSession
Convert a BaseRelation created for external data sources into a DataFrame.
baseRelationToDataFrame(BaseRelation) - Method in class org.apache.spark.sql.SQLContext
 
BaseRRDD<T,U> - Class in org.apache.spark.api.r
 
BaseRRDD(RDD<T>, int, byte[], String, String, byte[], Broadcast<Object>[], ClassTag<T>, ClassTag<U>) - Constructor for class org.apache.spark.api.r.BaseRRDD
 
BasicBlockReplicationPolicy - Class in org.apache.spark.storage
 
BasicBlockReplicationPolicy() - Constructor for class org.apache.spark.storage.BasicBlockReplicationPolicy
 
basicSparkPage(Function0<Seq<Node>>, String, boolean) - Static method in class org.apache.spark.ui.UIUtils
Returns a page with the spark css/js and a simple format.
batchDuration() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
 
batchDuration() - Method in class org.apache.spark.status.api.v1.streaming.StreamingStatistics
 
BATCHES() - Static method in class org.apache.spark.mllib.clustering.StreamingKMeans
 
batchId() - Method in class org.apache.spark.sql.streaming.StreamingQueryProgress
 
batchId() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
 
BatchInfo - Class in org.apache.spark.status.api.v1.streaming
 
BatchInfo - Class in org.apache.spark.streaming.scheduler
:: DeveloperApi :: Class having information on completed batches.
BatchInfo(Time, Map<Object, StreamInputInfo>, long, Option<Object>, Option<Object>, Map<Object, OutputOperationInfo>) - Constructor for class org.apache.spark.streaming.scheduler.BatchInfo
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
batchInfo() - Method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
batchInfos() - Method in class org.apache.spark.streaming.scheduler.StatsReportListener
 
BatchStatus - Enum in org.apache.spark.status.api.v1.streaming
 
batchTime() - Method in class org.apache.spark.status.api.v1.streaming.BatchInfo
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.BatchInfo
 
batchTime() - Method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
 
bean(Class<T>) - Static method in class org.apache.spark.sql.Encoders
Creates an encoder for Java Bean of type T.
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
 
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
 
beforeFetch(Connection, Map<String, String>) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
Override connection specific properties to run before a select is made.
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
 
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
 
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
 
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
 
beforeFetch(Connection, Map<String, String>) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
 
BernoulliCellSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials for partitioning a data sequence.
BernoulliCellSampler(double, double, boolean) - Constructor for class org.apache.spark.util.random.BernoulliCellSampler
 
BernoulliSampler<T> - Class in org.apache.spark.util.random
:: DeveloperApi :: A sampler based on Bernoulli trials.
BernoulliSampler(double, ClassTag<T>) - Constructor for class org.apache.spark.util.random.BernoulliSampler
 
bestModel() - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
bestModel() - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
 
beta() - Method in class org.apache.spark.mllib.random.WeibullGenerator
 
between(Object, Object) - Method in class org.apache.spark.sql.Column
True if the current column is between the lower bound and upper bound, inclusive.
bin(Column) - Static method in class org.apache.spark.sql.functions
An expression that returns the string representation of the binary value of the given long column.
bin(String) - Static method in class org.apache.spark.sql.functions
An expression that returns the string representation of the binary value of the given long column.
Binarizer - Class in org.apache.spark.ml.feature
Binarize a column of continuous features given a threshold.
Binarizer(String) - Constructor for class org.apache.spark.ml.feature.Binarizer
 
Binarizer() - Constructor for class org.apache.spark.ml.feature.Binarizer
 
Binary() - Static method in class org.apache.spark.ml.attribute.AttributeType
Binary type.
binary() - Static method in class org.apache.spark.ml.feature.CountVectorizer
 
binary() - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
 
binary() - Method in class org.apache.spark.ml.feature.HashingTF
Binary toggle to control term frequency counts.
binary() - Method in class org.apache.spark.sql.ColumnName
Creates a new StructField of type binary.
BINARY() - Static method in class org.apache.spark.sql.Encoders
An encoder for arrays of bytes.
BinaryAttribute - Class in org.apache.spark.ml.attribute
:: DeveloperApi :: A binary attribute.
BinaryClassificationEvaluator - Class in org.apache.spark.ml.evaluation
:: Experimental :: Evaluator for binary classification, which expects two input columns: rawPrediction and label.
BinaryClassificationEvaluator(String) - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
BinaryClassificationEvaluator() - Constructor for class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
BinaryClassificationMetrics - Class in org.apache.spark.mllib.evaluation
Evaluator for binary classification.
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>, int) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
 
BinaryClassificationMetrics(RDD<Tuple2<Object, Object>>) - Constructor for class org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
Defaults numBins to 0.
binaryFiles(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Read a directory of binary files from HDFS, a local file system (available on all nodes), or any Hadoop-supported file system URI as a byte array.
binaryFiles(String, int) - Method in class org.apache.spark.SparkContext
Get an RDD for a Hadoop-readable dataset as PortableDataStream for each file (useful for binary data).
binaryLabelValidator() - Static method in class org.apache.spark.mllib.util.DataValidators
Function to check if labels used for classification are either zero or one.
BinaryLogisticRegressionSummary - Class in org.apache.spark.ml.classification
:: Experimental :: Binary Logistic regression results for a given model.
BinaryLogisticRegressionTrainingSummary - Class in org.apache.spark.ml.classification
:: Experimental :: Logistic regression training results.
binaryRecords(String, int) - Method in class org.apache.spark.api.java.JavaSparkContext
Load data from a flat binary file, assuming the length of each record is constant.
binaryRecords(String, int, Configuration) - Method in class org.apache.spark.SparkContext
Load data from a flat binary file, assuming the length of each record is constant.
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files with fixed record lengths, yielding byte arrays.
binaryRecordsStream(String, int) - Method in class org.apache.spark.streaming.StreamingContext
Create an input stream that monitors a Hadoop-compatible filesystem for new files and reads them as flat binary files, assuming a fixed length per record, generating one byte array per record.
BinarySample - Class in org.apache.spark.mllib.stat.test
Class that represents the group and value of a sample.
BinarySample(boolean, double) - Constructor for class org.apache.spark.mllib.stat.test.BinarySample
 
BinaryType - Class in org.apache.spark.sql.types
The data type representing Array[Byte] values.
BinaryType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the BinaryType object.
Binomial$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.Binomial$
 
BinomialBounds - Class in org.apache.spark.util.random
Utility functions that help us determine bounds on adjusted sampling rate to guarantee exact sample size with high confidence when sampling without replacement.
BinomialBounds() - Constructor for class org.apache.spark.util.random.BinomialBounds
 
BisectingKMeans - Class in org.apache.spark.ml.clustering
A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
BisectingKMeans(String) - Constructor for class org.apache.spark.ml.clustering.BisectingKMeans
 
BisectingKMeans() - Constructor for class org.apache.spark.ml.clustering.BisectingKMeans
 
BisectingKMeans - Class in org.apache.spark.mllib.clustering
A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.
BisectingKMeans() - Constructor for class org.apache.spark.mllib.clustering.BisectingKMeans
Constructs an instance with the default configuration.
BisectingKMeansModel - Class in org.apache.spark.ml.clustering
Model fitted by BisectingKMeans.
BisectingKMeansModel - Class in org.apache.spark.mllib.clustering
Clustering model produced by BisectingKMeans.
BisectingKMeansModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.clustering
 
BisectingKMeansSummary - Class in org.apache.spark.ml.clustering
:: Experimental :: Summary of BisectingKMeans.
bitSize() - Method in class org.apache.spark.util.sketch.BloomFilter
Returns the number of bits in the underlying bit array.
bitwiseAND(Object) - Method in class org.apache.spark.sql.Column
Compute bitwise AND of this expression with another expression.
bitwiseNOT(Column) - Static method in class org.apache.spark.sql.functions
Computes bitwise NOT.
bitwiseOR(Object) - Method in class org.apache.spark.sql.Column
Compute bitwise OR of this expression with another expression.
bitwiseXOR(Object) - Method in class org.apache.spark.sql.Column
Compute bitwise XOR of this expression with another expression.
BLACKLISTED() - Static method in class org.apache.spark.ui.ToolTips
 
BlacklistedExecutor - Class in org.apache.spark.scheduler
 
BlacklistedExecutor(String, long) - Constructor for class org.apache.spark.scheduler.BlacklistedExecutor
 
BLAS - Class in org.apache.spark.ml.linalg
BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.ml.linalg.BLAS
 
BLAS - Class in org.apache.spark.mllib.linalg
BLAS routines for MLlib's vectors and matrices.
BLAS() - Constructor for class org.apache.spark.mllib.linalg.BLAS
 
BlockId - Class in org.apache.spark.storage
:: DeveloperApi :: Identifies a particular Block of data, usually associated with a single file.
BlockId() - Constructor for class org.apache.spark.storage.BlockId
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetBlockStatus
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocations
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBlock
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.ReplicateBlock
 
blockId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
 
blockIds() - Method in class org.apache.spark.storage.BlockManagerMessages.GetLocationsMultipleBlockIds
 
blockManager() - Method in class org.apache.spark.SparkEnv
 
blockManagerAddedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerAddedToJson(SparkListenerBlockManagerAdded) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerHeartbeat(BlockManagerId) - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
BlockManagerHeartbeat$() - Constructor for class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat$
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
blockManagerId() - Method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
BlockManagerId - Class in org.apache.spark.storage
:: DeveloperApi :: This class represents a unique identifier for a BlockManager.
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.GetPeers
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.RegisterBlockManager
 
blockManagerId() - Method in class org.apache.spark.storage.BlockManagerMessages.UpdateBlockInfo
 
blockManagerId() - Method in class org.apache.spark.storage.BlockUpdatedInfo
 
blockManagerId() - Method in class org.apache.spark.storage.StorageStatus
Deprecated.
 
blockManagerIdCache() - Static method in class org.apache.spark.storage.BlockManagerId
 
blockManagerIdFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerIds() - Method in class org.apache.spark.ui.jobs.JobProgressListener
Deprecated.
 
blockManagerIdToJson(BlockManagerId) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockManagerMessages - Class in org.apache.spark.storage
 
BlockManagerMessages() - Constructor for class org.apache.spark.storage.BlockManagerMessages
 
BlockManagerMessages.BlockManagerHeartbeat - Class in org.apache.spark.storage
 
BlockManagerMessages.BlockManagerHeartbeat$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus - Class in org.apache.spark.storage
 
BlockManagerMessages.GetBlockStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetExecutorEndpointRef - Class in org.apache.spark.storage
 
BlockManagerMessages.GetExecutorEndpointRef$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocations$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetLocationsMultipleBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMatchingBlockIds$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetMemoryStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers - Class in org.apache.spark.storage
 
BlockManagerMessages.GetPeers$ - Class in org.apache.spark.storage
 
BlockManagerMessages.GetStorageStatus$ - Class in org.apache.spark.storage
 
BlockManagerMessages.HasCachedBlocks - Class in org.apache.spark.storage
 
BlockManagerMessages.HasCachedBlocks$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager - Class in org.apache.spark.storage
 
BlockManagerMessages.RegisterBlockManager$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBlock$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveBroadcast$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveExecutor$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveRdd$ - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle - Class in org.apache.spark.storage
 
BlockManagerMessages.RemoveShuffle$ - Class in org.apache.spark.storage
 
BlockManagerMessages.ReplicateBlock - Class in org.apache.spark.storage
 
BlockManagerMessages.ReplicateBlock$ - Class in org.apache.spark.storage
 
BlockManagerMessages.StopBlockManagerMaster$ - Class in org.apache.spark.storage
 
BlockManagerMessages.ToBlockManagerMaster - Interface in org.apache.spark.storage
 
BlockManagerMessages.ToBlockManagerSlave - Interface in org.apache.spark.storage
 
BlockManagerMessages.TriggerThreadDump$ - Class in org.apache.spark.storage
Driver to Executor message to trigger a thread dump.
BlockManagerMessages.UpdateBlockInfo - Class in org.apache.spark.storage
 
BlockManagerMessages.UpdateBlockInfo$ - Class in org.apache.spark.storage
 
blockManagerRemovedFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockManagerRemovedToJson(SparkListenerBlockManagerRemoved) - Static method in class org.apache.spark.util.JsonProtocol
 
BlockMatrix - Class in org.apache.spark.mllib.linalg.distributed
Represents a distributed matrix in blocks of local matrices.
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
BlockMatrix(RDD<Tuple2<Tuple2<Object, Object>, Matrix>>, int, int) - Constructor for class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Alternate constructor for BlockMatrix without the input of the number of rows and columns.
blockName() - Method in class org.apache.spark.status.api.v1.RDDPartitionInfo
 
BlockNotFoundException - Exception in org.apache.spark.storage
 
BlockNotFoundException(String) - Constructor for exception org.apache.spark.storage.BlockNotFoundException
 
BlockReplicationPolicy - Interface in org.apache.spark.storage
:: DeveloperApi :: BlockReplicationPolicy provides logic for prioritizing a sequence of peers for replicating blocks.
BlockReplicationUtils - Class in org.apache.spark.storage
 
BlockReplicationUtils() - Constructor for class org.apache.spark.storage.BlockReplicationUtils
 
blocks() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
blocks() - Method in class org.apache.spark.storage.StorageStatus
Deprecated.
Return the blocks stored in this block manager.
blockSize() - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
 
BlockStatus - Class in org.apache.spark.storage
 
BlockStatus(StorageLevel, long, long) - Constructor for class org.apache.spark.storage.BlockStatus
 
blockStatusFromJson(JsonAST.JValue) - Static method in class org.apache.spark.util.JsonProtocol
 
blockStatusToJson(BlockStatus) - Static method in class org.apache.spark.util.JsonProtocol
 
blockUpdatedInfo() - Method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
 
BlockUpdatedInfo - Class in org.apache.spark.storage
:: DeveloperApi :: Stores information about a block status in a block manager.
BlockUpdatedInfo(BlockManagerId, BlockId, StorageLevel, long, long) - Constructor for class org.apache.spark.storage.BlockUpdatedInfo
 
bloomFilter(String, long, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Bloom filter over a specified column.
bloomFilter(Column, long, double) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Bloom filter over a specified column.
bloomFilter(String, long, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Bloom filter over a specified column.
bloomFilter(Column, long, long) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Bloom filter over a specified column.
BloomFilter - Class in org.apache.spark.util.sketch
A Bloom filter is a space-efficient probabilistic data structure that offers an approximate containment test with one-sided error: if it claims that an item is contained in it, this might be in error, but if it claims that an item is not contained in it, then this is definitely true.
BloomFilter() - Constructor for class org.apache.spark.util.sketch.BloomFilter
 
BloomFilter.Version - Enum in org.apache.spark.util.sketch
 
bmAddress() - Method in class org.apache.spark.FetchFailed
 
BOOLEAN() - Static method in class org.apache.spark.sql.Encoders
An encoder for nullable boolean type.
BooleanParam - Class in org.apache.spark.ml.param
:: DeveloperApi :: Specialized version of Param[Boolean] for Java.
BooleanParam(String, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanParam(Identifiable, String, String) - Constructor for class org.apache.spark.ml.param.BooleanParam
 
BooleanType - Class in org.apache.spark.sql.types
The data type representing Boolean values.
BooleanType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the BooleanType object.
boost(RDD<LabeledPoint>, RDD<LabeledPoint>, BoostingStrategy, boolean, long) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
Internal method for performing regression using trees as base learners.
BoostingStrategy - Class in org.apache.spark.mllib.tree.configuration
Configuration options for GradientBoostedTrees.
BoostingStrategy(Strategy, Loss, int, double, double) - Constructor for class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
Both() - Static method in class org.apache.spark.graphx.EdgeDirection
Edges originating from *and* arriving at a vertex of interest.
boundaries() - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
Boundaries in increasing order for which predictions are known.
boundaries() - Method in class org.apache.spark.mllib.regression.IsotonicRegressionModel
 
BoundedDouble - Class in org.apache.spark.partial
A Double value with error bars and associated confidence.
BoundedDouble(double, double, double, double) - Constructor for class org.apache.spark.partial.BoundedDouble
 
BreezeUtil - Class in org.apache.spark.ml.ann
In-place DGEMM and DGEMV for Breeze.
BreezeUtil() - Constructor for class org.apache.spark.ml.ann.BreezeUtil
 
broadcast(T) - Method in class org.apache.spark.api.java.JavaSparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
Broadcast<T> - Class in org.apache.spark.broadcast
A broadcast variable.
Broadcast(long, ClassTag<T>) - Constructor for class org.apache.spark.broadcast.Broadcast
 
broadcast(T, ClassTag<T>) - Method in class org.apache.spark.SparkContext
Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions.
broadcast(Dataset<T>) - Static method in class org.apache.spark.sql.functions
Marks a DataFrame as small enough for use in broadcast joins.
BROADCAST() - Static method in class org.apache.spark.storage.BlockId
 
BroadcastBlockId - Class in org.apache.spark.storage
 
BroadcastBlockId(long, String) - Constructor for class org.apache.spark.storage.BroadcastBlockId
 
broadcastId() - Method in class org.apache.spark.CleanBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BlockManagerMessages.RemoveBroadcast
 
broadcastId() - Method in class org.apache.spark.storage.BroadcastBlockId
 
broadcastManager() - Method in class org.apache.spark.SparkEnv
 
Broker - Class in org.apache.spark.streaming.kafka
Represents the host and port info for a Kafka broker.
bround(Column) - Static method in class org.apache.spark.sql.functions
Returns the value of the column e rounded to 0 decimal places with HALF_EVEN round mode.
bround(Column, int) - Static method in class org.apache.spark.sql.functions
Round the value of e to scale decimal places with HALF_EVEN round mode if scale is greater than or equal to 0 or at integral part when scale is less than 0.
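The HALF_EVEN behaviour that both bround entries refer to can be seen without Spark by using java.math.BigDecimal. The sketch below is only an illustration of the rounding mode, including a negative scale that rounds in the integral part; the class and method names are hypothetical, and it makes no claim about how Spark evaluates bround internally.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical illustration of HALF_EVEN rounding; not Spark code. */
final class HalfEvenRounding {

    // Rounds the decimal literal to the given scale; ties go to the even neighbour.
    static BigDecimal roundHalfEven(String value, int scale) {
        return new BigDecimal(value).setScale(scale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        System.out.println(roundHalfEven("2.5", 0).toPlainString());  // 2
        System.out.println(roundHalfEven("3.5", 0).toPlainString());  // 4
        System.out.println(roundHalfEven("1.25", 1).toPlainString()); // 1.2
        System.out.println(roundHalfEven("25", -1).toPlainString());  // 20 (negative scale)
    }
}
```

Ties are the only inputs on which HALF_EVEN differs from the more familiar HALF_UP: 2.5 becomes 2 rather than 3, while 3.5 still becomes 4.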
bucketBy(int, String, String...) - Method in class org.apache.spark.sql.DataFrameWriter
Buckets the output by the given columns.
bucketBy(int, String, Seq<String>) - Method in class org.apache.spark.sql.DataFrameWriter
Buckets the output by the given columns.
BucketedRandomProjectionLSH - Class in org.apache.spark.ml.feature
:: Experimental ::
BucketedRandomProjectionLSH(String) - Constructor for class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
 
BucketedRandomProjectionLSH() - Constructor for class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
 
BucketedRandomProjectionLSHModel - Class in org.apache.spark.ml.feature
:: Experimental ::
Bucketizer - Class in org.apache.spark.ml.feature
Bucketizer maps a column of continuous features to a column of feature buckets.
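As a rough illustration of mapping a continuous value to a bucket index, the following plain-Java sketch looks a value up in a sorted array of split points. It assumes half-open buckets [x, y) with the last bucket also including its upper bound, and it rejects values outside the splits; those conventions, and the names BucketSketch and bucketIndex, are assumptions of this sketch rather than a statement of the Bucketizer API.

```java
import java.util.Arrays;

/** Hypothetical sketch of value-to-bucket mapping; not the Spark Bucketizer. */
final class BucketSketch {

    /**
     * Returns the index of the bucket containing value.
     * splits must be sorted in strictly increasing order and define
     * splits.length - 1 buckets.
     */
    static int bucketIndex(double[] splits, double value) {
        if (splits.length < 2) {
            throw new IllegalArgumentException("at least two split points are required");
        }
        int last = splits.length - 1;
        int pos = Arrays.binarySearch(splits, value);
        if (pos >= 0) {
            // Value sits exactly on a split: it opens the bucket to its right,
            // except for the final split, which closes the last bucket.
            return Math.min(pos, last - 1);
        }
        int insertion = -pos - 1; // index of the first split greater than value
        if (insertion == 0 || insertion > last) {
            throw new IllegalArgumentException("value " + value + " is outside the splits");
        }
        return insertion - 1;
    }

    public static void main(String[] args) {
        double[] splits = {0.0, 10.0, 20.0, 30.0};
        System.out.println(bucketIndex(splits, 5.0));  // 0
        System.out.println(bucketIndex(splits, 10.0)); // 1
        System.out.println(bucketIndex(splits, 30.0)); // 2
    }
}
```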
Bucketizer(String) - Constructor for class org.apache.spark.ml.feature.Bucketizer
 
Bucketizer() - Constructor for class org.apache.spark.ml.feature.Bucketizer
 
bucketLength() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
 
bucketLength() - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
buffer() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
 
bufferEncoder() - Method in class org.apache.spark.sql.expressions.Aggregator
Specifies the Encoder for the intermediate value type.
BufferReleasingInputStream - Class in org.apache.spark.storage
Helper class that ensures a ManagedBuffer is released upon InputStream.close().
BufferReleasingInputStream(InputStream, ShuffleBlockFetcherIterator) - Constructor for class org.apache.spark.storage.BufferReleasingInputStream
 
bufferSchema() - Method in class org.apache.spark.sql.expressions.UserDefinedAggregateFunction
A StructType represents data types of values in the aggregation buffer.
build(Node, int) - Method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite.NodeData$
Create DecisionTreeModelReadWrite.NodeData instances for this node and all children.
build(DecisionTreeModel, int) - Method in class org.apache.spark.ml.tree.EnsembleModelReadWrite.EnsembleNodeData$
Create EnsembleModelReadWrite.EnsembleNodeData instances for the given tree.
build() - Method in class org.apache.spark.ml.tuning.ParamGridBuilder
Builds and returns all combinations of parameters specified by the param grid.
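"All combinations of parameters" is a Cartesian product over the candidate values of each parameter. The sketch below shows that idea in plain Java with ordinary maps and lists; ParamGridSketch, combinations, and demoGrid are hypothetical names, and the parameter names in the sample grid are illustrative only. It does not use or reproduce the ParamGridBuilder API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of expanding a parameter grid; not Spark code. */
final class ParamGridSketch {

    // Expands {name -> candidate values} into one map per combination.
    static List<Map<String, Object>> combinations(Map<String, List<Object>> grid) {
        List<Map<String, Object>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start from a single empty assignment
        for (Map.Entry<String, List<Object>> entry : grid.entrySet()) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> partial : result) {
                for (Object value : entry.getValue()) {
                    Map<String, Object> extended = new LinkedHashMap<>(partial);
                    extended.put(entry.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    // Sample grid with illustrative parameter names: 2 x 3 candidate values.
    static Map<String, List<Object>> demoGrid() {
        Map<String, List<Object>> grid = new LinkedHashMap<>();
        grid.put("regParam", Arrays.<Object>asList(0.1, 0.01));
        grid.put("maxIter", Arrays.<Object>asList(10, 20, 30));
        return grid;
    }

    public static void main(String[] args) {
        for (Map<String, Object> combo : combinations(demoGrid())) {
            System.out.println(combo);
        }
    }
}
```

The number of combinations is the product of the candidate counts, so the sample grid yields 2 x 3 = 6 assignments. This multiplicative growth is worth keeping in mind when adding parameters to a grid.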
build() - Static method in class org.apache.spark.sql.hive.HiveSessionStateBuilder
 
build() - Method in class org.apache.spark.sql.types.MetadataBuilder
Builds the Metadata instance.
builder() - Static method in class org.apache.spark.sql.SparkSession
Creates a SparkSession.Builder for constructing a SparkSession.
Builder() - Constructor for class org.apache.spark.sql.SparkSession.Builder
 
buildReader(SparkSession, StructType, StructType, StructType, Seq<Filter>, Map<String, String>, Configuration) - Method in class org.apache.spark.sql.hive.orc.OrcFileFormat
 
buildScan(Seq<Attribute>, Seq<Expression>) - Method in interface org.apache.spark.sql.sources.CatalystScan
 
buildScan(String[], Filter[]) - Method in interface org.apache.spark.sql.sources.PrunedFilteredScan
 
buildScan(String[]) - Method in interface org.apache.spark.sql.sources.PrunedScan
 
buildScan() - Method in interface org.apache.spark.sql.sources.TableScan
 
buildTreeFromNodes(DecisionTreeModelReadWrite.NodeData[], String) - Static method in class org.apache.spark.ml.tree.DecisionTreeModelReadWrite
Given all data for all nodes in a tree, rebuild the tree.
BYTE() - Static method in class org.apache.spark.api.r.SerializationFormats
 
BYTE() - Static method in class org.apache.spark.sql.Encoders
An encoder for nullable byte type.
BytecodeUtils - Class in org.apache.spark.graphx.util
Includes a utility function to test whether a function accesses a specific attribute of an object.
BytecodeUtils() - Constructor for class org.apache.spark.graphx.util.BytecodeUtils
 
byteFromString(String, ByteUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
 
BYTES_READ() - Method in class org.apache.spark.InternalAccumulator.input$
 
BYTES_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.output$
 
BYTES_WRITTEN() - Method in class org.apache.spark.InternalAccumulator.shuffleWrite$
 
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetricDistributions
 
bytesRead() - Method in class org.apache.spark.status.api.v1.InputMetrics
 
bytesRead() - Method in class org.apache.spark.ui.jobs.UIData.InputMetricsUIData
 
bytesToString(long) - Static method in class org.apache.spark.util.Utils
Convert a quantity in bytes to a human-readable string such as "4.0 MB".
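A conversion of this kind can be sketched in a few lines of plain Java. The version below assumes 1024-based units and switches to the next unit as soon as the value reaches 1024; the class name, the unit labels, and the thresholds are assumptions of this sketch, so its output may differ from what Utils.bytesToString produces for some inputs.

```java
import java.util.Locale;

/** Hypothetical sketch of human-readable byte formatting; not Spark's Utils. */
final class ByteFormatSketch {
    private static final String[] UNITS = {"B", "KB", "MB", "GB", "TB"};

    // Repeatedly divides by 1024 until the value fits the current unit.
    static String humanReadable(long bytes) {
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.US keeps the decimal separator a dot on every machine.
        return String.format(Locale.US, "%.1f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(humanReadable(4L * 1024 * 1024)); // 4.0 MB
        System.out.println(humanReadable(1536));             // 1.5 KB
    }
}
```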
bytesToString(BigInt) - Static method in class org.apache.spark.util.Utils
 
byteStringAsBytes(String) - Static method in class org.apache.spark.util.Utils
Convert a passed byte string to bytes.
byteStringAsGb(String) - Static method in class org.apache.spark.util.Utils
Convert a passed byte string to gibibytes.
byteStringAsKb(String) - Static method in class org.apache.spark.util.Utils
Convert a passed byte string to kibibytes.
byteStringAsMb(String) - Static method in class org.apache.spark.util.Utils
Convert a passed byte string to mebibytes.
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetricDistributions
 
bytesWritten() - Method in class org.apache.spark.status.api.v1.OutputMetrics
 
bytesWritten() - Method in class org.apache.spark.status.api.v1.ShuffleWriteMetrics
 
bytesWritten() - Method in class org.apache.spark.ui.jobs.UIData.OutputMetricsUIData
 
bytesWritten() - Method in class org.apache.spark.ui.jobs.UIData.ShuffleWriteMetricsUIData
 
byteToString(long, ByteUnit) - Static method in class org.apache.spark.internal.config.ConfigHelpers
 
ByteType - Class in org.apache.spark.sql.types
The data type representing Byte values.
ByteType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the ByteType object.

C

cache() - Method in class org.apache.spark.api.java.JavaDoubleRDD
Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.api.java.JavaPairRDD
Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Method in class org.apache.spark.api.java.JavaRDD
Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Static method in class org.apache.spark.api.r.RRDD
 
cache() - Static method in class org.apache.spark.graphx.EdgeRDD
 
cache() - Method in class org.apache.spark.graphx.Graph
Caches the vertices and edges associated with this graph at the previously-specified target storage levels, which default to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
Persists the edge partitions using targetStorageLevel, which defaults to MEMORY_ONLY.
cache() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
cache() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
Persists the vertex partitions at targetStorageLevel, which defaults to MEMORY_ONLY.
cache() - Static method in class org.apache.spark.graphx.VertexRDD
 
cache() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
Caches the underlying RDD.
cache() - Static method in class org.apache.spark.rdd.HadoopRDD
 
cache() - Static method in class org.apache.spark.rdd.JdbcRDD
 
cache() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
cache() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
cache() - Method in class org.apache.spark.rdd.RDD
Persist this RDD with the default storage level (MEMORY_ONLY).
cache() - Static method in class org.apache.spark.rdd.UnionRDD
 
cache() - Method in class org.apache.spark.sql.Dataset
Persist this Dataset with the default storage level (MEMORY_AND_DISK).
cache() - Method in class org.apache.spark.streaming.api.java.JavaDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cache() - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
cache() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cache() - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
cache() - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
cache() - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
cache() - Method in class org.apache.spark.streaming.dstream.DStream
Persist RDDs of this DStream with the default storage level (MEMORY_ONLY_SER).
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
 
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.GBTClassifier
 
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.GBTRegressor
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
 
cacheNodeIds() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
 
cacheSize() - Method in interface org.apache.spark.SparkExecutorInfo
 
cacheSize() - Method in class org.apache.spark.SparkExecutorInfoImpl
 
cacheSize() - Method in class org.apache.spark.storage.StorageStatus
Deprecated.
Return the memory used by caching RDDs.
cacheTable(String) - Method in class org.apache.spark.sql.catalog.Catalog
Caches the specified table in-memory.
cacheTable(String) - Method in class org.apache.spark.sql.SQLContext
Caches the specified table in-memory.
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.classification.LinearSVCCostFun
 
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.classification.LogisticCostFun
 
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.AFTCostFun
 
calculate(DenseVector<Object>) - Method in class org.apache.spark.ml.regression.LeastSquaresCostFun
 
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Entropy
:: DeveloperApi :: variance calculation
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Gini
:: DeveloperApi :: variance calculation
calculate(double[], double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Method in interface org.apache.spark.mllib.tree.impurity.Impurity
:: DeveloperApi :: information calculation for regression
calculate(double[], double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: information calculation for multiclass classification
calculate(double, double, double) - Static method in class org.apache.spark.mllib.tree.impurity.Variance
:: DeveloperApi :: variance calculation
calculateNumberOfPartitions(long, int, int) - Method in class org.apache.spark.ml.feature.Word2VecModel.Word2VecModelWriter$
Calculate the number of partitions to use in saving the model.
CalendarIntervalType - Class in org.apache.spark.sql.types
The data type representing calendar time intervals.
CalendarIntervalType - Static variable in class org.apache.spark.sql.types.DataTypes
Gets the CalendarIntervalType object.
call(K, Iterator<V1>, Iterator<V2>) - Method in interface org.apache.spark.api.java.function.CoGroupFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.DoubleFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FilterFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.FlatMapFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.FlatMapFunction2
 
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsFunction
 
call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.FlatMapGroupsWithStateFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.ForeachFunction
 
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.ForeachPartitionFunction
 
call(T1) - Method in interface org.apache.spark.api.java.function.Function
 
call() - Method in interface org.apache.spark.api.java.function.Function0
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.Function2
 
call(T1, T2, T3) - Method in interface org.apache.spark.api.java.function.Function3
 
call(T1, T2, T3, T4) - Method in interface org.apache.spark.api.java.function.Function4
 
call(T) - Method in interface org.apache.spark.api.java.function.MapFunction
 
call(K, Iterator<V>) - Method in interface org.apache.spark.api.java.function.MapGroupsFunction
 
call(K, Iterator<V>, GroupState<S>) - Method in interface org.apache.spark.api.java.function.MapGroupsWithStateFunction
 
call(Iterator<T>) - Method in interface org.apache.spark.api.java.function.MapPartitionsFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFlatMapFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.PairFunction
 
call(T, T) - Method in interface org.apache.spark.api.java.function.ReduceFunction
 
call(T) - Method in interface org.apache.spark.api.java.function.VoidFunction
 
call(T1, T2) - Method in interface org.apache.spark.api.java.function.VoidFunction2
 
call(T1) - Method in interface org.apache.spark.sql.api.java.UDF1
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10) - Method in interface org.apache.spark.sql.api.java.UDF10
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11) - Method in interface org.apache.spark.sql.api.java.UDF11
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12) - Method in interface org.apache.spark.sql.api.java.UDF12
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13) - Method in interface org.apache.spark.sql.api.java.UDF13
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14) - Method in interface org.apache.spark.sql.api.java.UDF14
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15) - Method in interface org.apache.spark.sql.api.java.UDF15
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16) - Method in interface org.apache.spark.sql.api.java.UDF16
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17) - Method in interface org.apache.spark.sql.api.java.UDF17
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18) - Method in interface org.apache.spark.sql.api.java.UDF18
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19) - Method in interface org.apache.spark.sql.api.java.UDF19
 
call(T1, T2) - Method in interface org.apache.spark.sql.api.java.UDF2
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20) - Method in interface org.apache.spark.sql.api.java.UDF20
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21) - Method in interface org.apache.spark.sql.api.java.UDF21
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9, T10, T11, T12, T13, T14, T15, T16, T17, T18, T19, T20, T21, T22) - Method in interface org.apache.spark.sql.api.java.UDF22
 
call(T1, T2, T3) - Method in interface org.apache.spark.sql.api.java.UDF3
 
call(T1, T2, T3, T4) - Method in interface org.apache.spark.sql.api.java.UDF4
 
call(T1, T2, T3, T4, T5) - Method in interface org.apache.spark.sql.api.java.UDF5
 
call(T1, T2, T3, T4, T5, T6) - Method in interface org.apache.spark.sql.api.java.UDF6
 
call(T1, T2, T3, T4, T5, T6, T7) - Method in interface org.apache.spark.sql.api.java.UDF7
 
call(T1, T2, T3, T4, T5, T6, T7, T8) - Method in interface org.apache.spark.sql.api.java.UDF8
 
call(T1, T2, T3, T4, T5, T6, T7, T8, T9) - Method in interface org.apache.spark.sql.api.java.UDF9
 
callSite() - Method in class org.apache.spark.storage.RDDInfo
 
callUDF(String, Column...) - Static method in class org.apache.spark.sql.functions
Call a user-defined function.
callUDF(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
Call a user-defined function.
cancel() - Method in class org.apache.spark.ComplexFutureAction
 
cancel() - Method in interface org.apache.spark.FutureAction
Cancels the execution of this action.
cancel() - Method in class org.apache.spark.SimpleFutureAction
 
cancelAllJobs() - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel all jobs that have been scheduled or are running.
cancelAllJobs() - Method in class org.apache.spark.SparkContext
Cancel all jobs that have been scheduled or are running.
cancelJob(int, String) - Method in class org.apache.spark.SparkContext
Cancel a given job if it's scheduled or running.
cancelJob(int) - Method in class org.apache.spark.SparkContext
Cancel a given job if it's scheduled or running.
cancelJobGroup(String) - Method in class org.apache.spark.api.java.JavaSparkContext
Cancel active jobs for the specified group.
cancelJobGroup(String) - Method in class org.apache.spark.SparkContext
Cancel active jobs for the specified group.
cancelStage(int, String) - Method in class org.apache.spark.SparkContext
Cancel a given stage and all jobs associated with it.
cancelStage(int) - Method in class org.apache.spark.SparkContext
Cancel a given stage and all jobs associated with it.
canEqual(Object) - Static method in class org.apache.spark.Aggregator
 
canEqual(Object) - Static method in class org.apache.spark.CleanAccum
 
canEqual(Object) - Static method in class org.apache.spark.CleanBroadcast
 
canEqual(Object) - Static method in class org.apache.spark.CleanCheckpoint
 
canEqual(Object) - Static method in class org.apache.spark.CleanRDD
 
canEqual(Object) - Static method in class org.apache.spark.CleanShuffle
 
canEqual(Object) - Static method in class org.apache.spark.ExceptionFailure
 
canEqual(Object) - Static method in class org.apache.spark.ExecutorLostFailure
 
canEqual(Object) - Static method in class org.apache.spark.ExecutorRegistered
 
canEqual(Object) - Static method in class org.apache.spark.ExecutorRemoved
 
canEqual(Object) - Static method in class org.apache.spark.ExpireDeadHosts
 
canEqual(Object) - Static method in class org.apache.spark.FetchFailed
 
canEqual(Object) - Static method in class org.apache.spark.graphx.Edge
 
canEqual(Object) - Static method in class org.apache.spark.ml.feature.Dot
 
canEqual(Object) - Static method in class org.apache.spark.ml.feature.LabeledPoint
 
canEqual(Object) - Static method in class org.apache.spark.ml.param.ParamPair
 
canEqual(Object) - Static method in class org.apache.spark.mllib.feature.VocabWord
 
canEqual(Object) - Static method in class org.apache.spark.mllib.linalg.distributed.IndexedRow
 
canEqual(Object) - Static method in class org.apache.spark.mllib.linalg.distributed.MatrixEntry
 
canEqual(Object) - Static method in class org.apache.spark.mllib.linalg.QRDecomposition
 
canEqual(Object) - Static method in class org.apache.spark.mllib.linalg.SingularValueDecomposition
 
canEqual(Object) - Static method in class org.apache.spark.mllib.recommendation.Rating
 
canEqual(Object) - Static method in class org.apache.spark.mllib.regression.LabeledPoint
 
canEqual(Object) - Static method in class org.apache.spark.mllib.stat.test.BinarySample
 
canEqual(Object) - Static method in class org.apache.spark.mllib.tree.configuration.BoostingStrategy
 
canEqual(Object) - Static method in class org.apache.spark.mllib.tree.model.Split
 
canEqual(Object) - Static method in class org.apache.spark.Resubmitted
 
canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStart
 
canEqual(Object) - Static method in class org.apache.spark.rpc.netty.OnStop
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.AccumulableInfo
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.AllJobsCancelled
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.AskPermissionToCommitOutput
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.BlacklistedExecutor
 
canEqual(Object) - Method in class org.apache.spark.scheduler.cluster.ExecutorInfo
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.JobSucceeded
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.local.KillTask
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.local.ReviveOffers
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.local.StatusUpdate
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.local.StopExecutor
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.ResubmitFailedStages
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.RuntimePercentage
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationEnd
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerApplicationStart
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerAdded
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerBlockManagerRemoved
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerBlockUpdated
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorAdded
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorBlacklisted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorMetricsUpdate
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorRemoved
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerExecutorUnblacklisted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerJobEnd
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerJobStart
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerNodeBlacklisted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerNodeUnblacklisted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerStageCompleted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerStageSubmitted
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerTaskEnd
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerTaskGettingResult
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerTaskStart
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.SparkListenerUnpersistRDD
 
canEqual(Object) - Static method in class org.apache.spark.scheduler.StopCoordinator
 
canEqual(Object) - Static method in class org.apache.spark.sql.DatasetHolder
 
canEqual(Object) - Static method in class org.apache.spark.sql.expressions.UserDefinedFunction
 
canEqual(Object) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
canEqual(Object) - Static method in class org.apache.spark.sql.hive.execution.HiveScriptIOSchema
 
canEqual(Object) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
canEqual(Object) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
canEqual(Object) - Static method in class org.apache.spark.sql.hive.RelationConversions
 
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.JdbcType
 
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
 
canEqual(Object) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.And
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.EqualNullSafe
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.EqualTo
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.GreaterThan
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.GreaterThanOrEqual
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.In
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.IsNotNull
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.IsNull
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.LessThan
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.LessThanOrEqual
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.Not
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.Or
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.StringContains
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.StringEndsWith
 
canEqual(Object) - Static method in class org.apache.spark.sql.sources.StringStartsWith
 
canEqual(Object) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
Deprecated.
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.ArrayType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.CharType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.DecimalType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.MapType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.ObjectType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.StructField
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.StructType
 
canEqual(Object) - Static method in class org.apache.spark.sql.types.VarcharType
 
canEqual(Object) - Static method in class org.apache.spark.StopMapOutputTracker
 
canEqual(Object) - Static method in class org.apache.spark.storage.BlockStatus
 
canEqual(Object) - Static method in class org.apache.spark.storage.BlockUpdatedInfo
 
canEqual(Object) - Static method in class org.apache.spark.storage.BroadcastBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
 
canEqual(Object) - Static method in class org.apache.spark.storage.memory.SerializedMemoryEntry
 
canEqual(Object) - Static method in class org.apache.spark.storage.RDDBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.ShuffleBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.ShuffleDataBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.ShuffleIndexBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.StreamBlockId
 
canEqual(Object) - Static method in class org.apache.spark.storage.TaskResultBlockId
 
canEqual(Object) - Static method in class org.apache.spark.streaming.Duration
 
canEqual(Object) - Static method in class org.apache.spark.streaming.kinesis.DefaultCredentials
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.AllReceiverIds
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.BatchInfo
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.GetAllReceiverInfo
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.OutputOperationInfo
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.ReceiverInfo
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StopAllReceivers
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchCompleted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchStarted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerBatchSubmitted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationCompleted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerOutputOperationStarted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverError
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStarted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerReceiverStopped
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamingListenerStreamingStarted
 
canEqual(Object) - Static method in class org.apache.spark.streaming.scheduler.StreamInputInfo
 
canEqual(Object) - Static method in class org.apache.spark.streaming.Time
 
canEqual(Object) - Static method in class org.apache.spark.Success
 
canEqual(Object) - Static method in class org.apache.spark.TaskCommitDenied
 
canEqual(Object) - Static method in class org.apache.spark.TaskKilled
 
canEqual(Object) - Static method in class org.apache.spark.TaskResultLost
 
canEqual(Object) - Static method in class org.apache.spark.TaskSchedulerIsSet
 
canEqual(Object) - Static method in class org.apache.spark.UnknownReason
 
canEqual(Object) - Static method in class org.apache.spark.util.MethodIdentifier
 
canEqual(Object) - Method in class org.apache.spark.util.MutablePair
 
canHandle(String) - Method in class org.apache.spark.sql.jdbc.AggregatedDialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DB2Dialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.DerbyDialect
 
canHandle(String) - Method in class org.apache.spark.sql.jdbc.JdbcDialect
Check if this dialect instance can handle a certain JDBC URL.
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MsSqlServerDialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.MySQLDialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.NoopDialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.OracleDialect
 
canHandle(String) - Static method in class org.apache.spark.sql.jdbc.PostgresDialect
 
canonicalized() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
canonicalized() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
canonicalized() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
CanonicalRandomVertexCut$() - Constructor for class org.apache.spark.graphx.PartitionStrategy.CanonicalRandomVertexCut$
 
cartesian(JavaRDDLike<U, ?>) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
cartesian(JavaRDDLike<U, ?>) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
cartesian(JavaRDDLike<U, ?>) - Static method in class org.apache.spark.api.java.JavaRDD
 
cartesian(JavaRDDLike<U, ?>) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.api.r.RRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.VertexRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
cartesian(RDD<U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in this and b is in other.
cartesian(RDD<U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.UnionRDD
 
caseSensitive() - Method in class org.apache.spark.ml.feature.StopWordsRemover
Whether to do a case-sensitive comparison over the stop words.
cast(DataType) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type.
cast(String) - Method in class org.apache.spark.sql.Column
Casts the column to a different data type, using the canonical string representation of the type.
Catalog - Class in org.apache.spark.sql.catalog
Catalog interface for Spark.
Catalog() - Constructor for class org.apache.spark.sql.catalog.Catalog
 
catalog() - Method in class org.apache.spark.sql.SparkSession
Interface through which the user may create, drop, alter or query underlying databases, tables, functions etc.
catalogString() - Method in class org.apache.spark.sql.types.ArrayType
 
catalogString() - Static method in class org.apache.spark.sql.types.BinaryType
 
catalogString() - Static method in class org.apache.spark.sql.types.BooleanType
 
catalogString() - Static method in class org.apache.spark.sql.types.ByteType
 
catalogString() - Static method in class org.apache.spark.sql.types.CalendarIntervalType
 
catalogString() - Static method in class org.apache.spark.sql.types.CharType
 
catalogString() - Method in class org.apache.spark.sql.types.DataType
String representation for the type saved in external catalogs.
catalogString() - Static method in class org.apache.spark.sql.types.DateType
 
catalogString() - Static method in class org.apache.spark.sql.types.DecimalType
 
catalogString() - Static method in class org.apache.spark.sql.types.DoubleType
 
catalogString() - Static method in class org.apache.spark.sql.types.FloatType
 
catalogString() - Static method in class org.apache.spark.sql.types.HiveStringType
 
catalogString() - Static method in class org.apache.spark.sql.types.IntegerType
 
catalogString() - Static method in class org.apache.spark.sql.types.LongType
 
catalogString() - Method in class org.apache.spark.sql.types.MapType
 
catalogString() - Static method in class org.apache.spark.sql.types.NullType
 
catalogString() - Static method in class org.apache.spark.sql.types.NumericType
 
catalogString() - Static method in class org.apache.spark.sql.types.ObjectType
 
catalogString() - Static method in class org.apache.spark.sql.types.ShortType
 
catalogString() - Static method in class org.apache.spark.sql.types.StringType
 
catalogString() - Method in class org.apache.spark.sql.types.StructType
 
catalogString() - Static method in class org.apache.spark.sql.types.TimestampType
 
catalogString() - Static method in class org.apache.spark.sql.types.VarcharType
 
CatalystScan - Interface in org.apache.spark.sql.sources
:: Experimental :: An interface for experimenting with a more direct connection to the query planner.
Categorical() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
categoricalFeaturesInfo() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
CategoricalSplit - Class in org.apache.spark.ml.tree
Split which tests a categorical feature.
categories() - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$.SplitData
 
categories() - Method in class org.apache.spark.mllib.tree.model.Split
 
categoryMaps() - Method in class org.apache.spark.ml.feature.VectorIndexerModel
 
cause() - Method in exception org.apache.spark.sql.AnalysisException
 
cause() - Method in exception org.apache.spark.sql.streaming.StreamingQueryException
 
CausedBy - Class in org.apache.spark.util
Extractor object for pulling out the root cause of an error.
CausedBy() - Constructor for class org.apache.spark.util.CausedBy
 
cbrt(Column) - Static method in class org.apache.spark.sql.functions
Computes the cube-root of the given value.
cbrt(String) - Static method in class org.apache.spark.sql.functions
Computes the cube-root of the given column.
ceil(Column) - Static method in class org.apache.spark.sql.functions
Computes the ceiling of the given value.
ceil(String) - Static method in class org.apache.spark.sql.functions
Computes the ceiling of the given column.
ceil() - Method in class org.apache.spark.sql.types.Decimal
 
censorCol() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
 
censorCol() - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
 
chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, T, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
chainl1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<U>>, Function0<Parsers.Parser<Function2<T, U, T>>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
chainr1(Function0<Parsers.Parser<T>>, Function0<Parsers.Parser<Function2<T, U, U>>>, Function2<T, U, U>, U) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
changePrecision(int, int) - Method in class org.apache.spark.sql.types.Decimal
Update precision and scale while keeping our value the same, and return true if successful.
CharType - Class in org.apache.spark.sql.types
Hive char type.
CharType(int) - Constructor for class org.apache.spark.sql.types.CharType
 
checkColumnNameDuplication(Seq<String>, String, boolean) - Static method in class org.apache.spark.sql.util.SchemaUtils
Checks if input column names have duplicate identifiers.
checkColumnType(StructType, String, DataType, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
Check whether the given schema contains a column of the required data type.
checkColumnTypes(StructType, String, Seq<DataType>, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
Check whether the given schema contains a column of one of the required data types.
checkDataColumns(RFormula, Dataset<?>) - Static method in class org.apache.spark.ml.r.RWrapperUtils
DataFrame column check.
checkErrors(Either<ArrayBuffer<Throwable>, T>) - Static method in class org.apache.spark.streaming.kafka.KafkaCluster
If the result is a Right, return its value; otherwise throw a SparkException.
checkFileExists(String, Configuration) - Static method in class org.apache.spark.streaming.util.HdfsUtils
Check if the file exists at the given path.
checkHost(String, String) - Static method in class org.apache.spark.util.Utils
 
checkHostPort(String, String) - Static method in class org.apache.spark.util.Utils
 
checkNumericType(StructType, String, String) - Static method in class org.apache.spark.ml.util.SchemaUtils
Check whether the given schema contains a column of the numeric data type.
checkpoint() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
checkpoint() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
checkpoint() - Static method in class org.apache.spark.api.java.JavaRDD
 
checkpoint() - Method in interface org.apache.spark.api.java.JavaRDDLike
Mark this RDD for checkpointing.
checkpoint() - Static method in class org.apache.spark.api.r.RRDD
 
checkpoint() - Static method in class org.apache.spark.graphx.EdgeRDD
 
checkpoint() - Method in class org.apache.spark.graphx.Graph
Mark this Graph for checkpointing.
checkpoint() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.GraphImpl
 
checkpoint() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
checkpoint() - Static method in class org.apache.spark.graphx.VertexRDD
 
checkpoint() - Method in class org.apache.spark.rdd.HadoopRDD
 
checkpoint() - Static method in class org.apache.spark.rdd.JdbcRDD
 
checkpoint() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
checkpoint() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
checkpoint() - Method in class org.apache.spark.rdd.RDD
Mark this RDD for checkpointing.
checkpoint() - Static method in class org.apache.spark.rdd.UnionRDD
 
checkpoint() - Method in class org.apache.spark.sql.Dataset
Eagerly checkpoint a Dataset and return the new Dataset.
checkpoint(boolean) - Method in class org.apache.spark.sql.Dataset
Returns a checkpointed version of this Dataset.
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
checkpoint(Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
checkpoint(Duration) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
checkpoint(String) - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
Sets the context to periodically checkpoint the DStream operations for driver fault-tolerance.
checkpoint(Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Enable periodic checkpointing of RDDs of this DStream.
checkpoint(String) - Method in class org.apache.spark.streaming.StreamingContext
Set the context to periodically checkpoint the DStream operations for driver fault-tolerance.
Checkpointed() - Static method in class org.apache.spark.rdd.CheckpointState
 
CheckpointingInProgress() - Static method in class org.apache.spark.rdd.CheckpointState
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.GBTClassifier
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
 
checkpointInterval() - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.clustering.LDA
 
checkpointInterval() - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.recommendation.ALS
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.GBTRegressor
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
 
checkpointInterval() - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
 
checkpointInterval() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
 
CheckpointReader - Class in org.apache.spark.streaming
 
CheckpointReader() - Constructor for class org.apache.spark.streaming.CheckpointReader
 
CheckpointState - Class in org.apache.spark.rdd
Enumeration to manage state transitions of an RDD through checkpointing.
CheckpointState() - Constructor for class org.apache.spark.rdd.CheckpointState
 
checkState(boolean, Function0<String>) - Static method in class org.apache.spark.streaming.util.HdfsUtils
 
child() - Method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
child() - Method in class org.apache.spark.sql.sources.Not
 
CHILD_CONNECTION_TIMEOUT - Static variable in class org.apache.spark.launcher.SparkLauncher
Maximum time (in ms) to wait for a child process to connect back to the launcher server when using start().
CHILD_PROCESS_LOGGER_NAME - Static variable in class org.apache.spark.launcher.SparkLauncher
Logger name to use when launching a child process.
children() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
children() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
children() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
childrenResolved() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
childrenResolved() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
chiSqFunc() - Method in class org.apache.spark.mllib.stat.test.ChiSqTest.Method
 
ChiSqSelector - Class in org.apache.spark.ml.feature
Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label.
ChiSqSelector(String) - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
 
ChiSqSelector() - Constructor for class org.apache.spark.ml.feature.ChiSqSelector
 
ChiSqSelector - Class in org.apache.spark.mllib.feature
Creates a ChiSquared feature selector.
ChiSqSelector() - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
 
ChiSqSelector(int) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelector
This is the same as calling this() and setNumTopFeatures(numTopFeatures).
ChiSqSelectorModel - Class in org.apache.spark.ml.feature
Model fitted by ChiSqSelector.
ChiSqSelectorModel - Class in org.apache.spark.mllib.feature
Chi Squared selector model.
ChiSqSelectorModel(int[]) - Constructor for class org.apache.spark.mllib.feature.ChiSqSelectorModel
 
ChiSqSelectorModel.SaveLoadV1_0$ - Class in org.apache.spark.mllib.feature
 
ChiSqSelectorModel.SaveLoadV1_0$.Data - Class in org.apache.spark.mllib.feature
Model data for import/export.
chiSqTest(Vector, Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the expected distribution.
chiSqTest(Vector) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's chi-squared goodness of fit test of the observed data against the uniform distribution, with each category having an expected frequency of 1 / observed.size.
chiSqTest(Matrix) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test on the input contingency matrix, which cannot contain negative entries or columns or rows that sum up to 0.
chiSqTest(RDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
Conduct Pearson's independence test for every feature against the label across the input RDD.
chiSqTest(JavaRDD<LabeledPoint>) - Static method in class org.apache.spark.mllib.stat.Statistics
Java-friendly version of chiSqTest().
ChiSqTest - Class in org.apache.spark.mllib.stat.test
Conduct the chi-squared test for the input RDDs using the specified method.
ChiSqTest() - Constructor for class org.apache.spark.mllib.stat.test.ChiSqTest
 
ChiSqTest.Method - Class in org.apache.spark.mllib.stat.test
param: name String name for the method.
ChiSqTest.Method$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTest.NullHypothesis$ - Class in org.apache.spark.mllib.stat.test
 
ChiSqTestResult - Class in org.apache.spark.mllib.stat.test
Object containing the test results for the chi-squared hypothesis test.
chiSquared(Vector, Vector, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
chiSquaredFeatures(RDD<LabeledPoint>, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
Conduct Pearson's independence test for each feature against the label across the input RDD.
chiSquaredMatrix(Matrix, String) - Static method in class org.apache.spark.mllib.stat.test.ChiSqTest
 
ChiSquareTest - Class in org.apache.spark.ml.stat
:: Experimental ::
ChiSquareTest() - Constructor for class org.apache.spark.ml.stat.ChiSquareTest
 
chmod700(File) - Static method in class org.apache.spark.util.Utils
JDK equivalent of chmod 700 file.
CholeskyDecomposition - Class in org.apache.spark.mllib.linalg
Compute Cholesky decomposition.
CholeskyDecomposition() - Constructor for class org.apache.spark.mllib.linalg.CholeskyDecomposition
 
classForName(String) - Static method in class org.apache.spark.util.Utils
Preferred alternative to Class.forName(className).
Classification() - Static method in class org.apache.spark.mllib.tree.configuration.Algo
 
ClassificationModel<FeaturesType,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: DeveloperApi ::
ClassificationModel() - Constructor for class org.apache.spark.ml.classification.ClassificationModel
 
ClassificationModel - Interface in org.apache.spark.mllib.classification
Represents a classification model that predicts to which of a set of categories an example belongs.
Classifier<FeaturesType,E extends Classifier<FeaturesType,E,M>,M extends ClassificationModel<FeaturesType,M>> - Class in org.apache.spark.ml.classification
:: DeveloperApi ::
Classifier() - Constructor for class org.apache.spark.ml.classification.Classifier
 
classifier() - Static method in class org.apache.spark.ml.classification.OneVsRest
 
classifier() - Static method in class org.apache.spark.ml.classification.OneVsRestModel
 
classIsLoadable(String) - Static method in class org.apache.spark.util.Utils
Determines whether the provided class is loadable in the current thread.
className() - Method in class org.apache.spark.ExceptionFailure
 
className() - Method in class org.apache.spark.sql.catalog.Function
 
classpathEntries() - Method in class org.apache.spark.status.api.v1.ApplicationEnvironmentInfo
 
classpathEntries() - Method in class org.apache.spark.ui.env.EnvironmentListener
Deprecated.
 
classTag() - Method in class org.apache.spark.api.java.JavaDoubleRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaPairRDD
 
classTag() - Method in class org.apache.spark.api.java.JavaRDD
 
classTag() - Method in interface org.apache.spark.api.java.JavaRDDLike
 
classTag() - Method in class org.apache.spark.sql.Dataset
 
classTag() - Method in class org.apache.spark.storage.memory.DeserializedMemoryEntry
 
classTag() - Method in interface org.apache.spark.storage.memory.MemoryEntry
 
classTag() - Method in class org.apache.spark.storage.memory.SerializedMemoryEntry
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaDStream
 
classTag() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
classTag() - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
classTag() - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
classTag() - Method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
clean(long, boolean) - Method in class org.apache.spark.streaming.util.WriteAheadLog
Clean all the records that are older than the threshold time.
clean(Object, boolean, boolean) - Static method in class org.apache.spark.util.ClosureCleaner
Clean the given closure in place.
CleanAccum - Class in org.apache.spark
 
CleanAccum(long) - Constructor for class org.apache.spark.CleanAccum
 
CleanBroadcast - Class in org.apache.spark
 
CleanBroadcast(long) - Constructor for class org.apache.spark.CleanBroadcast
 
CleanCheckpoint - Class in org.apache.spark
 
CleanCheckpoint(int) - Constructor for class org.apache.spark.CleanCheckpoint
 
CleanRDD - Class in org.apache.spark
 
CleanRDD(int) - Constructor for class org.apache.spark.CleanRDD
 
CleanShuffle - Class in org.apache.spark
 
CleanShuffle(int) - Constructor for class org.apache.spark.CleanShuffle
 
CleanupTask - Interface in org.apache.spark
Classes that represent cleaning tasks.
CleanupTaskWeakReference - Class in org.apache.spark
A WeakReference associated with a CleanupTask.
CleanupTaskWeakReference(CleanupTask, Object, ReferenceQueue<Object>) - Constructor for class org.apache.spark.CleanupTaskWeakReference
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.DecisionTreeClassifier
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.GBTClassificationModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.GBTClassifier
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.LinearSVC
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.LinearSVCModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.LogisticRegression
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.NaiveBayes
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.NaiveBayesModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.OneVsRest
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.OneVsRestModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.RandomForestClassificationModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.classification.RandomForestClassifier
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.BisectingKMeans
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.BisectingKMeansModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.DistributedLDAModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.GaussianMixture
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.GaussianMixtureModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.KMeans
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.KMeansModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.LDA
 
clear(Param<?>) - Static method in class org.apache.spark.ml.clustering.LocalLDAModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
clear(Param<?>) - Static method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
 
clear(Param<?>) - Static method in class org.apache.spark.ml.evaluation.RegressionEvaluator
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Binarizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Bucketizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.ChiSqSelector
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.ChiSqSelectorModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.ColumnPruner
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.CountVectorizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.CountVectorizerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.DCT
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.HashingTF
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.IDF
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.IDFModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Imputer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.ImputerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.IndexToString
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Interaction
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MaxAbsScaler
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MaxAbsScalerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MinHashLSH
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MinHashLSHModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MinMaxScaler
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.MinMaxScalerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.NGram
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Normalizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.OneHotEncoder
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.PCA
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.PCAModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.PolynomialExpansion
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.QuantileDiscretizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.RegexTokenizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.RFormula
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.RFormulaModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.SQLTransformer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.StandardScaler
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.StandardScalerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.StopWordsRemover
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.StringIndexer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.StringIndexerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Tokenizer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.VectorAssembler
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.VectorAttributeRewriter
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.VectorIndexer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.VectorIndexerModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.VectorSlicer
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Word2Vec
 
clear(Param<?>) - Static method in class org.apache.spark.ml.feature.Word2VecModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.fpm.FPGrowth
 
clear(Param<?>) - Static method in class org.apache.spark.ml.fpm.FPGrowthModel
 
clear(Param<?>) - Method in interface org.apache.spark.ml.param.Params
Clears the user-supplied value for the input param.
clear(Param<?>) - Static method in class org.apache.spark.ml.Pipeline
 
clear(Param<?>) - Static method in class org.apache.spark.ml.PipelineModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.recommendation.ALS
 
clear(Param<?>) - Static method in class org.apache.spark.ml.recommendation.ALSModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegression
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.DecisionTreeRegressor
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.GBTRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.GBTRegressor
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.IsotonicRegression
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.IsotonicRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.LinearRegression
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.LinearRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.RandomForestRegressionModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.regression.RandomForestRegressor
 
clear(Param<?>) - Static method in class org.apache.spark.ml.tuning.CrossValidator
 
clear(Param<?>) - Static method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
clear(Param<?>) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplit
 
clear(Param<?>) - Static method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
 
clear() - Method in class org.apache.spark.sql.util.ExecutionListenerManager
Removes all registered QueryExecutionListener instances.
clear() - Static method in class org.apache.spark.util.AccumulatorContext
Clears all registered AccumulatorV2s.
clearActive() - Static method in class org.apache.spark.sql.SQLContext
Deprecated.
Use SparkSession.clearActiveSession instead. Since 2.0.0.
clearActiveSession() - Static method in class org.apache.spark.sql.SparkSession
Clears the active SparkSession for current thread.
clearCache() - Method in class org.apache.spark.sql.catalog.Catalog
Removes all cached tables from the in-memory cache.
clearCache() - Method in class org.apache.spark.sql.SQLContext
Removes all cached tables from the in-memory cache.
clearCallSite() - Method in class org.apache.spark.api.java.JavaSparkContext
Pass-through to SparkContext.clearCallSite.
clearCallSite() - Method in class org.apache.spark.SparkContext
Clear the thread-local property for overriding the call sites of actions and RDDs.
clearDefaultSession() - Static method in class org.apache.spark.sql.SparkSession
Clears the default SparkSession that is returned by the builder.
clearDependencies() - Method in class org.apache.spark.rdd.CoGroupedRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.ShuffledRDD
 
clearDependencies() - Method in class org.apache.spark.rdd.UnionRDD
 
clearJobGroup() - Method in class org.apache.spark.api.java.JavaSparkContext
Clear the current thread's job group ID and its description.
clearJobGroup() - Method in class org.apache.spark.SparkContext
Clear the current thread's job group ID and its description.
clearThreshold() - Method in class org.apache.spark.mllib.classification.LogisticRegressionModel
Clears the threshold so that predict will output raw prediction scores.
clearThreshold() - Method in class org.apache.spark.mllib.classification.SVMModel
Clears the threshold so that predict will output raw prediction scores.
CLogLog$() - Constructor for class org.apache.spark.ml.regression.GeneralizedLinearRegression.CLogLog$
 
clone() - Method in class org.apache.spark.SparkConf
Copy this object.
clone() - Method in class org.apache.spark.sql.ExperimentalMethods
 
clone() - Method in class org.apache.spark.sql.types.Decimal
 
clone() - Method in class org.apache.spark.sql.util.ExecutionListenerManager
Get an identical copy of this listener manager.
clone() - Method in class org.apache.spark.storage.StorageLevel
 
clone() - Method in class org.apache.spark.util.random.BernoulliCellSampler
 
clone() - Method in class org.apache.spark.util.random.BernoulliSampler
 
clone() - Method in class org.apache.spark.util.random.PoissonSampler
 
clone() - Method in interface org.apache.spark.util.random.RandomSampler
Returns a copy of the RandomSampler object.
clone(T, SerializerInstance, ClassTag<T>) - Static method in class org.apache.spark.util.Utils
Clone an object using a Spark serializer.
cloneComplement() - Method in class org.apache.spark.util.random.BernoulliCellSampler
Return a sampler that is the complement of the range specified by the current sampler.
close() - Method in class org.apache.spark.api.java.JavaSparkContext
 
close() - Method in class org.apache.spark.io.NioBufferedFileInputStream
 
close() - Method in class org.apache.spark.io.SnappyOutputStreamWrapper
 
close() - Method in class org.apache.spark.serializer.DeserializationStream
 
close() - Method in class org.apache.spark.serializer.SerializationStream
 
close(Throwable) - Method in class org.apache.spark.sql.ForeachWriter
Called when processing of one partition of new data stops on the executor side.
close() - Method in class org.apache.spark.sql.hive.execution.HiveOutputWriter
 
close() - Method in class org.apache.spark.sql.SparkSession
Synonym for stop().
close() - Method in class org.apache.spark.storage.BufferReleasingInputStream
 
close() - Method in class org.apache.spark.storage.CountingWritableChannel
 
close() - Method in class org.apache.spark.storage.TimeTrackingOutputStream
 
close() - Method in class org.apache.spark.streaming.api.java.JavaStreamingContext
 
close() - Method in class org.apache.spark.streaming.util.WriteAheadLog
Close this log and release any resources.
ClosureCleaner - Class in org.apache.spark.util
A cleaner that renders closures serializable if it can be done safely.
ClosureCleaner() - Constructor for class org.apache.spark.util.ClosureCleaner
 
closureSerializer() - Method in class org.apache.spark.SparkEnv
 
cls() - Method in class org.apache.spark.sql.types.ObjectType
 
cls() - Method in class org.apache.spark.util.MethodIdentifier
 
clsTag() - Method in interface org.apache.spark.sql.Encoder
A ClassTag that can be used to construct an Array to contain a collection of T.
cluster() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
Predicted cluster of each data point in the transformed data.
cluster() - Method in class org.apache.spark.mllib.clustering.PowerIterationClustering.Assignment
 
clusterCenters() - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
 
clusterCenters() - Method in class org.apache.spark.ml.clustering.KMeansModel
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
Leaf cluster centers.
clusterCenters() - Method in class org.apache.spark.mllib.clustering.KMeansModel
 
clusterCenters() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
ClusteringSummary - Class in org.apache.spark.ml.clustering
:: Experimental :: Summary of clustering algorithms.
clusterSizes() - Method in class org.apache.spark.ml.clustering.ClusteringSummary
Size of (number of data points in) each cluster.
clusterWeights() - Method in class org.apache.spark.mllib.clustering.StreamingKMeansModel
 
cn() - Method in class org.apache.spark.mllib.feature.VocabWord
 
coalesce(int) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaDoubleRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaPairRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean) - Method in class org.apache.spark.api.java.JavaRDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.api.r.RRDD
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.graphx.VertexRDD
 
coalesce(int, RDD<?>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
Runs the packing algorithm and returns an array of PartitionGroups that, if possible, are load balanced and grouped by locality.
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
coalesce(int, RDD<?>) - Method in interface org.apache.spark.rdd.PartitionCoalescer
Coalesce the partitions of the given RDD.
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return a new RDD that is reduced into numPartitions partitions.
coalesce(int, boolean, Option<PartitionCoalescer>, Ordering<T>) - Static method in class org.apache.spark.rdd.UnionRDD
 
coalesce(int) - Method in class org.apache.spark.sql.Dataset
Returns a new Dataset that has exactly numPartitions partitions, when fewer partitions are requested.
coalesce(Column...) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null, or null if all inputs are null.
coalesce(Seq<Column>) - Static method in class org.apache.spark.sql.functions
Returns the first column that is not null, or null if all inputs are null.
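The "first non-null" rule described for functions.coalesce can be illustrated per row without Spark. The following is a minimal sketch in plain Java, not Spark's implementation: the class CoalesceSketch and its coalesce method are hypothetical names, and the varargs stand in for the values of the input columns in a single row.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): models, for one row, the rule
// "return the first value that is not null, or null if all inputs are null".
final class CoalesceSketch {
    @SafeVarargs
    static <T> T coalesce(T... values) {
        for (T v : values) {
            if (v != null) {
                return v; // first non-null input wins
            }
        }
        return null; // every input was null
    }

    public static void main(String[] args) {
        // One "row" with three candidate column values.
        String[] row = {null, "fallback", "ignored"};
        System.out.println(Arrays.toString(row) + " -> " + coalesce(row));
    }
}
```

Argument order matters in this sketch: later values are only consulted when all earlier ones are null.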
coalesce$default$2() - Static method in class org.apache.spark.api.r.RRDD
 
coalesce$default$2() - Static method in class org.apache.spark.graphx.EdgeRDD
 
coalesce$default$2() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
coalesce$default$2() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
coalesce$default$2() - Static method in class org.apache.spark.graphx.VertexRDD
 
coalesce$default$2() - Static method in class org.apache.spark.rdd.HadoopRDD
 
coalesce$default$2() - Static method in class org.apache.spark.rdd.JdbcRDD
 
coalesce$default$2() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
coalesce$default$2() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
coalesce$default$2() - Static method in class org.apache.spark.rdd.UnionRDD
 
coalesce$default$3() - Static method in class org.apache.spark.api.r.RRDD
 
coalesce$default$3() - Static method in class org.apache.spark.graphx.EdgeRDD
 
coalesce$default$3() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
coalesce$default$3() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
coalesce$default$3() - Static method in class org.apache.spark.graphx.VertexRDD
 
coalesce$default$3() - Static method in class org.apache.spark.rdd.HadoopRDD
 
coalesce$default$3() - Static method in class org.apache.spark.rdd.JdbcRDD
 
coalesce$default$3() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
coalesce$default$3() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
coalesce$default$3() - Static method in class org.apache.spark.rdd.UnionRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.api.r.RRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.graphx.VertexRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
coalesce$default$4(int, boolean, Option<PartitionCoalescer>) - Static method in class org.apache.spark.rdd.UnionRDD
 
CoarseGrainedClusterMessages - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages() - Constructor for class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages
 
CoarseGrainedClusterMessages.AddWebUIFilter - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.AddWebUIFilter$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.GetExecutorLossReason - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.GetExecutorLossReason$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutorsOnHost - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillExecutorsOnHost$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.KillTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.LaunchTask$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterClusterManager - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterClusterManager$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisteredExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorFailed$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RegisterExecutorResponse - Interface in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RemoveExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RequestExecutors$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RetrieveLastAllocatedExecutorId$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.RetrieveSparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.ReviveOffers$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.SetupDriver - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.SetupDriver$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.Shutdown$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.SparkAppConfig - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.SparkAppConfig$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StatusUpdate$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopDriver$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutor$ - Class in org.apache.spark.scheduler.cluster
 
CoarseGrainedClusterMessages.StopExecutors$ - Class in org.apache.spark.scheduler.cluster
 
code() - Method in class org.apache.spark.mllib.feature.VocabWord
 
CodegenMetrics - Class in org.apache.spark.metrics.source
:: Experimental :: Metrics for code generation.
CodegenMetrics() - Constructor for class org.apache.spark.metrics.source.CodegenMetrics
 
codeLen() - Method in class org.apache.spark.mllib.feature.VocabWord
 
coefficientMatrix() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
coefficients() - Method in class org.apache.spark.ml.classification.LinearSVCModel
 
coefficients() - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
A vector of model coefficients for "binomial" logistic regression.
coefficients() - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
 
coefficients() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
 
coefficients() - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionTrainingSummary
Standard error of estimated coefficients and intercept.
coefficientStandardErrors() - Method in class org.apache.spark.ml.regression.LinearRegressionSummary
Standard error of estimated coefficients and intercept.
cogroup(JavaPairRDD<K, W>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(JavaPairRDD<K, W>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(JavaPairRDD<K, W1>, JavaPairRDD<K, W2>, JavaPairRDD<K, W3>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
cogroup(RDD<Tuple2<K, W>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other, return a resulting RDD that contains a tuple with the list of values for that key in this as well as other.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2, return a resulting RDD that contains a tuple with the list of values for that key in this, other1 and other2.
cogroup(RDD<Tuple2<K, W1>>, RDD<Tuple2<K, W2>>, RDD<Tuple2<K, W3>>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
For each key k in this or other1 or other2 or other3, return a resulting RDD that contains a tuple with the list of values for that key in this, other1, other2 and other3.
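The two-input cogroup contract described in these entries (every key present in either input appears once, paired with the list of its values from each side) can be modelled on in-memory collections. This is a rough sketch under stated assumptions, not Spark code: CogroupSketch, pair, and emptyPair are hypothetical names, lists of Map.Entry stand in for pair RDDs, and a TreeMap is used only to make the key order deterministic.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Spark): models two-input cogroup semantics in memory.
final class CogroupSketch {
    // Convenience constructor for a key-value pair.
    static <K, V> Map.Entry<K, V> pair(K key, V value) {
        return new SimpleImmutableEntry<>(key, value);
    }

    // A fresh (left values, right values) holder for a newly seen key.
    private static <V, W> Map.Entry<List<V>, List<W>> emptyPair() {
        return new SimpleImmutableEntry<>(new ArrayList<V>(), new ArrayList<W>());
    }

    // For each key in `left` or `right`, collect its left values and its right values.
    static <K extends Comparable<K>, V, W> Map<K, Map.Entry<List<V>, List<W>>> cogroup(
            List<Map.Entry<K, V>> left, List<Map.Entry<K, W>> right) {
        Map<K, Map.Entry<List<V>, List<W>>> out = new TreeMap<>();
        for (Map.Entry<K, V> e : left) {
            out.computeIfAbsent(e.getKey(), k -> CogroupSketch.<V, W>emptyPair())
               .getKey().add(e.getValue());
        }
        for (Map.Entry<K, W> e : right) {
            out.computeIfAbsent(e.getKey(), k -> CogroupSketch.<V, W>emptyPair())
               .getValue().add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
                Arrays.asList(pair("a", 1), pair("a", 2), pair("b", 3));
        List<Map.Entry<String, String>> right =
                Arrays.asList(pair("a", "x"), pair("c", "y"));
        cogroup(left, right).forEach((k, v) ->
                System.out.println(k + ": left=" + v.getKey() + " right=" + v.getValue()));
    }
}
```

A key that occurs on only one side still appears in the result, with an empty list for the other side; that is the difference from an inner join.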
cogroup(KeyValueGroupedDataset<K, U>, Function3<K, Iterator<V>, Iterator<U>, TraversableOnce<R>>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
(Scala-specific) Applies the given function to each cogrouped data.
cogroup(KeyValueGroupedDataset<K, U>, CoGroupFunction<K, V, U, R>, Encoder<R>) - Method in class org.apache.spark.sql.KeyValueGroupedDataset
(Java-specific) Applies the given function to each cogrouped data.
cogroup(JavaPairDStream<K, W>) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, int) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(JavaPairDStream<K, W>) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
cogroup(JavaPairDStream<K, W>, int) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
cogroup(JavaPairDStream<K, W>, Partitioner) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
cogroup(JavaPairDStream<K, W>) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
cogroup(JavaPairDStream<K, W>, int) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
cogroup(JavaPairDStream<K, W>, Partitioner) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
cogroup(DStream<Tuple2<K, W>>, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, int, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
cogroup(DStream<Tuple2<K, W>>, Partitioner, ClassTag<W>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Return a new DStream by applying 'cogroup' between RDDs of this DStream and other DStream.
CoGroupedRDD<K> - Class in org.apache.spark.rdd
:: DeveloperApi :: An RDD that cogroups its parents.
CoGroupedRDD(Seq<RDD<? extends Product2<K, ?>>>, Partitioner, ClassTag<K>) - Constructor for class org.apache.spark.rdd.CoGroupedRDD
 
CoGroupFunction<K,V1,V2,R> - Interface in org.apache.spark.api.java.function
A function that returns zero or more output records from each grouping key and its values from 2 Datasets.
col(String) - Method in class org.apache.spark.sql.Dataset
Selects a column based on the column name and returns it as a Column.
col(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
coldStartStrategy() - Static method in class org.apache.spark.ml.recommendation.ALS
 
coldStartStrategy() - Static method in class org.apache.spark.ml.recommendation.ALSModel
 
colIter() - Method in class org.apache.spark.ml.linalg.DenseMatrix
 
colIter() - Method in interface org.apache.spark.ml.linalg.Matrix
Returns an iterator of column vectors.
colIter() - Method in class org.apache.spark.ml.linalg.SparseMatrix
 
colIter() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
colIter() - Method in interface org.apache.spark.mllib.linalg.Matrix
Returns an iterator of column vectors.
colIter() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
collect() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
collect() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
collect() - Static method in class org.apache.spark.api.java.JavaRDD
 
collect() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in this RDD.
collect() - Static method in class org.apache.spark.api.r.RRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.api.r.RRDD
 
collect() - Static method in class org.apache.spark.graphx.EdgeRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
collect() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
collect() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
collect() - Static method in class org.apache.spark.graphx.VertexRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.graphx.VertexRDD
 
collect() - Static method in class org.apache.spark.rdd.HadoopRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
collect() - Static method in class org.apache.spark.rdd.JdbcRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
collect() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
collect() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
collect() - Method in class org.apache.spark.rdd.RDD
Return an array that contains all of the elements in this RDD.
collect(PartialFunction<T, U>, ClassTag<U>) - Method in class org.apache.spark.rdd.RDD
Return an RDD that contains all matching values by applying f.
collect() - Static method in class org.apache.spark.rdd.UnionRDD
 
collect(PartialFunction<T, U>, ClassTag<U>) - Static method in class org.apache.spark.rdd.UnionRDD
 
collect() - Method in class org.apache.spark.sql.Dataset
Returns an array that contains all rows in this Dataset.
collect(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
collect(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
collect(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
collect(PartialFunction<A, B>, CanBuildFrom<Repr, B, That>) - Static method in class org.apache.spark.sql.types.StructType
 
collect_list(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns a list of objects with duplicates.
collect_list(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns a list of objects with duplicates.
collect_set(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns a set of objects with duplicate elements eliminated.
collect_set(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns a set of objects with duplicate elements eliminated.
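The difference between collect_list (duplicates kept) and collect_set (duplicates eliminated) can be shown on the values of a single group. This is a small illustrative sketch in plain Java, not Spark's implementation: CollectAggSketch and its methods are hypothetical names, the input list stands in for one group's column values, and null handling and result ordering in Spark are deliberately left out of scope.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Spark): contrasts list-style and set-style
// aggregation over the values of one group.
final class CollectAggSketch {
    // Keeps every value, including duplicates.
    static <T> List<T> collectList(List<T> groupValues) {
        return new ArrayList<>(groupValues);
    }

    // Keeps one copy of each distinct value.
    static <T> Set<T> collectSet(List<T> groupValues) {
        return new LinkedHashSet<>(groupValues);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 2, 3);
        System.out.println("list size = " + collectList(values).size());
        System.out.println("set size  = " + collectSet(values).size());
    }
}
```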
collectAsList() - Method in class org.apache.spark.sql.Dataset
Returns a Java list that contains all rows in this Dataset.
collectAsMap() - Method in class org.apache.spark.api.java.JavaPairRDD
Return the key-value pairs in this RDD to the master as a Map.
collectAsMap() - Method in class org.apache.spark.rdd.PairRDDFunctions
Return the key-value pairs in this RDD to the master as a Map.
collectAsync() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
collectAsync() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
collectAsync() - Static method in class org.apache.spark.api.java.JavaRDD
 
collectAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of collect, which returns a future for retrieving an array containing all of the elements in this RDD.
collectAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for retrieving all elements of this RDD.
collectEdges(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Returns an RDD that contains for each vertex v its local edges, i.e., the edges that are incident on v, in the user-specified direction.
collectFirst(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
collectFirst(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
collectFirst(PartialFunction<BaseType, B>) - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
collectFirst(PartialFunction<A, B>) - Static method in class org.apache.spark.sql.types.StructType
 
collectionAccumulator() - Method in class org.apache.spark.SparkContext
Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them to the list.
collectionAccumulator(String) - Method in class org.apache.spark.SparkContext
Create and register a CollectionAccumulator, which starts with an empty list and accumulates inputs by adding them to the list.
CollectionAccumulator<T> - Class in org.apache.spark.util
An accumulator for collecting a list of elements.
CollectionAccumulator() - Constructor for class org.apache.spark.util.CollectionAccumulator
 
CollectionsUtils - Class in org.apache.spark.util
 
CollectionsUtils() - Constructor for class org.apache.spark.util.CollectionsUtils
 
collectLeaves() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
collectLeaves() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
collectLeaves() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
collectNeighborIds(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex ids for each vertex.
collectNeighbors(EdgeDirection) - Method in class org.apache.spark.graphx.GraphOps
Collect the neighbor vertex attributes for each vertex.
collectPartitions(int[]) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
collectPartitions(int[]) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
collectPartitions(int[]) - Static method in class org.apache.spark.api.java.JavaRDD
 
collectPartitions(int[]) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return an array that contains all of the elements in a specific partition of this RDD.
colPtrs() - Method in class org.apache.spark.ml.linalg.SparseMatrix
 
colPtrs() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
colsPerBlock() - Method in class org.apache.spark.mllib.linalg.distributed.BlockMatrix
 
colStats(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Computes column-wise summary statistics for the input RDD[Vector].
Column - Class in org.apache.spark.sql.catalog
A column in Spark, as returned by the listColumns method in Catalog.
Column(String, String, String, boolean, boolean, boolean) - Constructor for class org.apache.spark.sql.catalog.Column
 
Column - Class in org.apache.spark.sql
A column that will be computed based on the data in a DataFrame.
Column(Expression) - Constructor for class org.apache.spark.sql.Column
 
Column(String) - Constructor for class org.apache.spark.sql.Column
 
column(String) - Static method in class org.apache.spark.sql.functions
Returns a Column based on the given column name.
ColumnName - Class in org.apache.spark.sql
A convenience class used for constructing a schema.
ColumnName(String) - Constructor for class org.apache.spark.sql.ColumnName
 
ColumnPruner - Class in org.apache.spark.ml.feature
Utility transformer for removing temporary columns from a DataFrame.
ColumnPruner(String, Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
 
ColumnPruner(Set<String>) - Constructor for class org.apache.spark.ml.feature.ColumnPruner
 
columns() - Method in class org.apache.spark.sql.Dataset
Returns all column names as an array.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute all cosine similarities between columns of this matrix using the brute-force approach of computing normalized dot products.
columnSimilarities(double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Compute similarities between columns of this matrix using a sampling approach.
columnsToPrune() - Method in class org.apache.spark.ml.feature.ColumnPruner
 
combinations(int) - Static method in class org.apache.spark.sql.types.StructType
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the output RDD and uses map-side aggregation.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.api.java.JavaPairRDD
Simplified version of combineByKey that hash-partitions the resulting RDD using the existing partitioner/parallelism level and using map-side aggregation.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer) - Method in class org.apache.spark.rdd.PairRDDFunctions
Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Combine elements of each key in DStream's RDDs using custom functions.
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
combineByKey(Function<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
combineByKey(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, ClassTag<C>) - Method in class org.apache.spark.streaming.dstream.PairDStreamFunctions
Combine elements of each key in DStream's RDDs using custom functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, Partitioner, boolean, Serializer, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Generic function to combine the elements for each key using a custom set of aggregation functions.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, int, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the output RDD.
combineByKeyWithClassTag(Function1<V, C>, Function2<C, V, C>, Function2<C, C, C>, ClassTag<C>) - Method in class org.apache.spark.rdd.PairRDDFunctions
:: Experimental :: Simplified version of combineByKeyWithClassTag that hash-partitions the resulting RDD using the existing partitioner/parallelism level.
combineCombinersByKey(Iterator<? extends Product2<K, C>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
combineValuesByKey(Iterator<? extends Product2<K, V>>, TaskContext) - Method in class org.apache.spark.Aggregator
 
commit(Function0<Parsers.Parser<T>>) - Static method in class org.apache.spark.ml.feature.RFormulaParser
 
commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.FileCommitProtocol
Commits a job after the writes succeed.
commitJob(JobContext, Seq<FileCommitProtocol.TaskCommitMessage>) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
 
commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.FileCommitProtocol
Commits a task after the writes succeed.
commitTask(TaskAttemptContext) - Method in class org.apache.spark.internal.io.HadoopMapReduceCommitProtocol
 
commitTask(OutputCommitter, TaskAttemptContext, int, int) - Static method in class org.apache.spark.mapred.SparkHadoopMapRedUtil
Commits a task output.
commonHeaderNodes() - Static method in class org.apache.spark.ui.UIUtils
 
companion() - Static method in class org.apache.spark.sql.types.StructType
 
compare(PartitionGroup, PartitionGroup) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
 
compare(Option<PartitionGroup>, Option<PartitionGroup>) - Method in class org.apache.spark.rdd.DefaultPartitionCoalescer
 
compare(Decimal) - Method in class org.apache.spark.sql.types.Decimal
 
compare(RDDInfo) - Method in class org.apache.spark.storage.RDDInfo
 
compareTo(A) - Static method in class org.apache.spark.sql.types.Decimal
 
compareTo(A) - Static method in class org.apache.spark.storage.RDDInfo
 
compareTo(SparkShutdownHook) - Method in class org.apache.spark.util.SparkShutdownHook
 
Complete() - Static method in class org.apache.spark.sql.streaming.OutputMode
OutputMode in which all the rows in the streaming DataFrame/Dataset will be written to the sink every time there are some updates.
completed() - Method in class org.apache.spark.status.api.v1.ApplicationAttemptInfo
 
completedIndices() - Method in class org.apache.spark.ui.jobs.UIData.StageUIData
 
completedJobs() - Method in class org.apache.spark.ui.jobs.JobProgressListener
Deprecated.
 
completedStageIndices() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
completedStages() - Method in class org.apache.spark.ui.jobs.JobProgressListener
Deprecated.
 
completedTasks() - Method in class org.apache.spark.status.api.v1.ExecutorSummary
 
completionTime() - Method in class org.apache.spark.scheduler.StageInfo
Time when all tasks in the stage completed or when the stage was cancelled.
completionTime() - Method in class org.apache.spark.status.api.v1.JobData
 
completionTime() - Method in class org.apache.spark.status.api.v1.StageData
 
completionTime() - Method in class org.apache.spark.ui.jobs.UIData.JobUIData
 
ComplexFutureAction<T> - Class in org.apache.spark
A FutureAction for actions that could trigger multiple Spark jobs.
ComplexFutureAction(Function1<JobSubmitter, Future<T>>) - Constructor for class org.apache.spark.ComplexFutureAction
 
compose(Function1<A, T1>) - Static method in class org.apache.spark.sql.types.StructType
 
compressed() - Static method in class org.apache.spark.ml.linalg.DenseMatrix
 
compressed() - Static method in class org.apache.spark.ml.linalg.DenseVector
 
compressed() - Method in interface org.apache.spark.ml.linalg.Matrix
Returns a matrix in dense column major, dense row major, sparse row major, or sparse column major format, whichever uses less storage.
compressed() - Static method in class org.apache.spark.ml.linalg.SparseMatrix
 
compressed() - Static method in class org.apache.spark.ml.linalg.SparseVector
 
compressed() - Method in interface org.apache.spark.ml.linalg.Vector
Returns a vector in either dense or sparse format, whichever uses less storage.
compressed() - Static method in class org.apache.spark.mllib.linalg.DenseVector
 
compressed() - Static method in class org.apache.spark.mllib.linalg.SparseVector
 
compressed() - Method in interface org.apache.spark.mllib.linalg.Vector
Returns a vector in either dense or sparse format, whichever uses less storage.
compressedColMajor() - Static method in class org.apache.spark.ml.linalg.DenseMatrix
 
compressedColMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
Returns a matrix in dense or sparse column major format, whichever uses less storage.
compressedColMajor() - Static method in class org.apache.spark.ml.linalg.SparseMatrix
 
compressedInputStream(InputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedInputStream(InputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedOutputStream(OutputStream) - Method in interface org.apache.spark.io.CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZ4CompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.LZFCompressionCodec
 
compressedOutputStream(OutputStream) - Method in class org.apache.spark.io.SnappyCompressionCodec
 
compressedRowMajor() - Static method in class org.apache.spark.ml.linalg.DenseMatrix
 
compressedRowMajor() - Method in interface org.apache.spark.ml.linalg.Matrix
Returns a matrix in dense or sparse row major format, whichever uses less storage.
compressedRowMajor() - Static method in class org.apache.spark.ml.linalg.SparseMatrix
 
CompressionCodec - Interface in org.apache.spark.io
:: DeveloperApi :: CompressionCodec allows different compression implementations to be chosen for use in block storage.
compute(Partition, TaskContext) - Method in class org.apache.spark.api.r.BaseRRDD
 
compute(Partition, TaskContext) - Static method in class org.apache.spark.api.r.RRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.EdgeRDD
 
compute(Partition, TaskContext) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
compute(Partition, TaskContext) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
compute(Partition, TaskContext) - Method in class org.apache.spark.graphx.VertexRDD
Provides the RDD[(VertexId, VD)] equivalent output.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point.
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.Gradient
Compute the gradient and loss given the features of a single data point, add the gradient to a provided vector to avoid creating new objects, and return loss.
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.HingeGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.L1Updater
 
compute(Vector, double, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LeastSquaresGradient
 
compute(Vector, double, Vector, Vector) - Method in class org.apache.spark.mllib.optimization.LogisticGradient
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SimpleUpdater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.SquaredL2Updater
 
compute(Vector, Vector, double, int, double) - Method in class org.apache.spark.mllib.optimization.Updater
Compute an updated value for weights given the gradient, stepSize, iteration number and regularization parameter.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.CoGroupedRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.HadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.JdbcRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.NewHadoopRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.PartitionPruningRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.RDD
:: DeveloperApi :: Implemented by subclasses to compute a given partition.
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.ShuffledRDD
 
compute(Partition, TaskContext) - Method in class org.apache.spark.rdd.UnionRDD
 
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaDStream
Generate an RDD for the given duration.
compute(Time) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.api.java.JavaPairDStream
Method that generates an RDD for the given Duration.
compute(Time) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
compute(Time) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
compute(Time) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.ConstantInputDStream
 
compute(Time) - Method in class org.apache.spark.streaming.dstream.DStream
Method that generates an RDD for the given time.
compute(Time) - Method in class org.apache.spark.streaming.dstream.ReceiverInputDStream
 
computeColumnSummaryStatistics() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes column-wise summary statistics.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation for two datasets.
computeCorrelation(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation for two datasets.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrix(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
Compute Spearman's correlation matrix S, for the input matrix, where S(i, j) is the correlation between column i and j.
computeCorrelationMatrixFromCovariance(Matrix) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
Compute the Pearson correlation matrix from the covariance matrix.
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.PearsonCorrelation
 
computeCorrelationWithMatrixImpl(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.correlation.SpearmanCorrelation
 
computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
Computes the sum of squared distances between the input points and their corresponding cluster centers.
computeCost(Dataset<?>) - Method in class org.apache.spark.ml.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCost(Vector) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
Computes the squared distance between the input point and the cluster center it belongs to.
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
Computes the sum of squared distances between the input points and their corresponding cluster centers.
computeCost(JavaRDD<Vector>) - Method in class org.apache.spark.mllib.clustering.BisectingKMeansModel
Java-friendly version of computeCost().
computeCost(RDD<Vector>) - Method in class org.apache.spark.mllib.clustering.KMeansModel
Return the K-means cost (sum of squared distances of points to their nearest center) for this model on the given data.
computeCovariance() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the covariance matrix, treating each row as an observation.
computeError(RDD<LabeledPoint>, DecisionTreeRegressionModel[], double[], Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
Method to calculate error of the base learner for the gradient boosting calculation.
computeError(org.apache.spark.mllib.tree.model.TreeEnsembleModel, RDD<LabeledPoint>) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate error of the base learner for the gradient boosting calculation.
computeError(double, double) - Method in interface org.apache.spark.mllib.tree.loss.Loss
Method to calculate loss when the predictions are already known.
computeFractionForSampleSize(int, long, boolean) - Static method in class org.apache.spark.util.random.SamplingUtils
Returns a sampling rate that guarantees a sample of size greater than or equal to sampleSizeLowerBound 99.99% of the time.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the Gramian matrix A^T A.
computeGramianMatrix() - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the Gramian matrix A^T A.
computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeRegressionModel, Loss) - Static method in class org.apache.spark.ml.tree.impl.GradientBoostedTrees
Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
computeInitialPredictionAndError(RDD<LabeledPoint>, double, DecisionTreeModel, Loss) - Static method in class org.apache.spark.mllib.tree.model.GradientBoostedTreesModel
:: DeveloperApi :: Compute the initial predictions and errors for a dataset for the first iteration of gradient boosting.
computePreferredLocations(Seq<InputFormatInfo>) - Static method in class org.apache.spark.scheduler.InputFormatInfo
Computes the preferred locations based on the input(s) and returns a location-to-block map.
computePrincipalComponents(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components only.
computePrincipalComponentsAndExplainedVariance(int) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the top k principal components and a vector of proportions of variance explained by each principal component.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix
Computes the singular value decomposition of this IndexedRowMatrix.
computeSVD(int, boolean, double) - Method in class org.apache.spark.mllib.linalg.distributed.RowMatrix
Computes the singular value decomposition of this matrix.
computeThresholdByKey(Map<K, AcceptanceResult>, Map<K, Object>) - Static method in class org.apache.spark.util.random.StratifiedSamplingUtils
Given the result returned by getCounts, determine the threshold for accepting items to generate the exact sample size.
concat(Column...) - Static method in class org.apache.spark.sql.functions
Concatenates multiple input string columns together into a single string column.
concat(Seq<Column>) - Static method in class org.apache.spark.sql.functions
Concatenates multiple input string columns together into a single string column.
concat_ws(String, Column...) - Static method in class org.apache.spark.sql.functions
Concatenates multiple input string columns together into a single string column, using the given separator.
concat_ws(String, Seq<Column>) - Static method in class org.apache.spark.sql.functions
Concatenates multiple input string columns together into a single string column, using the given separator.
Conf(int, int, double, double, double, double, double, double) - Constructor for class org.apache.spark.graphx.lib.SVDPlusPlus.Conf
 
conf() - Method in class org.apache.spark.SparkEnv
 
conf() - Method in class org.apache.spark.sql.hive.RelationConversions
 
conf() - Method in class org.apache.spark.sql.SparkSession
Runtime configuration interface for Spark.
confidence() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
Returns the confidence of the rule.
confidence() - Method in class org.apache.spark.partial.BoundedDouble
 
confidence() - Method in class org.apache.spark.util.sketch.CountMinSketch
Returns the confidence (or delta) of this CountMinSketch.
config(String, String) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a config option.
config(String, long) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a config option.
config(String, double) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a config option.
config(String, boolean) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a config option.
config(SparkConf) - Method in class org.apache.spark.sql.SparkSession.Builder
Sets a list of config options based on the given SparkConf.
config() - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
ConfigEntryWithDefault<T> - Class in org.apache.spark.internal.config
 
ConfigEntryWithDefault(String, T, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefault
 
ConfigEntryWithDefaultFunction<T> - Class in org.apache.spark.internal.config
 
ConfigEntryWithDefaultFunction(String, Function0<T>, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultFunction
 
ConfigEntryWithDefaultString<T> - Class in org.apache.spark.internal.config
 
ConfigEntryWithDefaultString(String, String, Function1<String, T>, Function1<T, String>, String, boolean) - Constructor for class org.apache.spark.internal.config.ConfigEntryWithDefaultString
 
ConfigHelpers - Class in org.apache.spark.internal.config
 
ConfigHelpers() - Constructor for class org.apache.spark.internal.config.ConfigHelpers
 
configTestLog4j(String) - Static method in class org.apache.spark.util.Utils
Configures log4j properties for use in test suites.
configuration() - Method in class org.apache.spark.scheduler.InputFormatInfo
 
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.HadoopRDD
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
CONFIGURATION_INSTANTIATION_LOCK() - Static method in class org.apache.spark.rdd.NewHadoopRDD
Configuration's constructor is not threadsafe (see SPARK-1097 and HADOOP-10456).
configureJobPropertiesForStorageHandler(TableDesc, Configuration, boolean) - Static method in class org.apache.spark.sql.hive.HiveTableUtil
 
confusionMatrix() - Method in class org.apache.spark.mllib.evaluation.MulticlassMetrics
Returns the confusion matrix: predicted classes are in columns, ordered by ascending class label, as in "labels".
connect(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
connectedComponents() - Method in class org.apache.spark.graphx.GraphOps
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
connectedComponents(int) - Method in class org.apache.spark.graphx.GraphOps
Compute the connected component membership of each vertex and return a graph with the vertex value containing the lowest vertex id in the connected component containing that vertex.
ConnectedComponents - Class in org.apache.spark.graphx.lib
Connected components algorithm.
ConnectedComponents() - Constructor for class org.apache.spark.graphx.lib.ConnectedComponents
 
connectLeader(String, int) - Method in class org.apache.spark.streaming.kafka.KafkaCluster
 
consequent() - Method in class org.apache.spark.mllib.fpm.AssociationRules.Rule
 
ConstantInputDStream<T> - Class in org.apache.spark.streaming.dstream
An input stream that always returns the same RDD on each time step.
ConstantInputDStream(StreamingContext, RDD<T>, ClassTag<T>) - Constructor for class org.apache.spark.streaming.dstream.ConstantInputDStream
 
constraints() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
constraints() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
constraints() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
constructTree(org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData[]) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
Given a list of nodes from a tree, construct the tree.
constructTrees(RDD<org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0.NodeData>) - Method in class org.apache.spark.mllib.tree.model.DecisionTreeModel.SaveLoadV1_0$
 
constructURIForAuthentication(URI, org.apache.spark.SecurityManager) - Static method in class org.apache.spark.util.Utils
Construct a URI containing information used for authentication.
contains(Param<?>) - Method in class org.apache.spark.ml.param.ParamMap
Checks whether a parameter is explicitly specified.
contains(String) - Method in class org.apache.spark.SparkConf
Does the configuration contain a given parameter?
contains(Object) - Method in class org.apache.spark.sql.Column
Contains the other element.
contains(String) - Method in class org.apache.spark.sql.types.Metadata
Tests whether this Metadata contains a binding for a key.
contains(A1) - Static method in class org.apache.spark.sql.types.StructType
 
containsBlock(BlockId) - Method in class org.apache.spark.storage.StorageStatus
Deprecated.
Return whether the given block is stored in this block manager in O(1) time.
containsChild() - Static method in class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
containsChild() - Static method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
containsChild() - Static method in class org.apache.spark.sql.hive.execution.ScriptTransformationExec
 
containsDelimiters() - Method in class org.apache.spark.sql.hive.execution.HiveOptions
 
containsNull() - Method in class org.apache.spark.sql.types.ArrayType
 
containsSlice(GenSeq<B>) - Static method in class org.apache.spark.sql.types.StructType
 
contentType() - Method in class org.apache.spark.ui.JettyUtils.ServletParams
 
context() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
context() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
context() - Static method in class org.apache.spark.api.java.JavaRDD
 
context() - Method in interface org.apache.spark.api.java.JavaRDDLike
The SparkContext that this RDD was created on.
context() - Static method in class org.apache.spark.api.r.RRDD
 
context() - Static method in class org.apache.spark.graphx.EdgeRDD
 
context() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
context() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
context() - Static method in class org.apache.spark.graphx.VertexRDD
 
context() - Method in class org.apache.spark.InterruptibleIterator
 
context(SQLContext) - Static method in class org.apache.spark.ml.r.RWrappers
 
context(SQLContext) - Method in class org.apache.spark.ml.util.MLReader
 
context(SQLContext) - Method in class org.apache.spark.ml.util.MLWriter
 
context() - Static method in class org.apache.spark.rdd.HadoopRDD
 
context() - Static method in class org.apache.spark.rdd.JdbcRDD
 
context() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
context() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
context() - Method in class org.apache.spark.rdd.RDD
The SparkContext that this RDD was created on.
context() - Static method in class org.apache.spark.rdd.UnionRDD
 
context() - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
context() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return the StreamingContext associated with this DStream.
context() - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
context() - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
context() - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
context() - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
context() - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
context() - Method in class org.apache.spark.streaming.dstream.DStream
Return the StreamingContext associated with this DStream.
Continuous() - Static method in class org.apache.spark.mllib.tree.configuration.FeatureType
 
ContinuousSplit - Class in org.apache.spark.ml.tree
Split which tests a continuous feature.
conv(Column, int, int) - Static method in class org.apache.spark.sql.functions
Convert a number in a string column from one base to another.
CONVERT_METASTORE_ORC() - Static method in class org.apache.spark.sql.hive.HiveUtils
 
CONVERT_METASTORE_PARQUET() - Static method in class org.apache.spark.sql.hive.HiveUtils
 
CONVERT_METASTORE_PARQUET_WITH_SCHEMA_MERGING() - Static method in class org.apache.spark.sql.hive.HiveUtils
 
convertMatrixColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts matrix columns in an input Dataset from the new Matrix type under the spark.ml package to the Matrix type under the spark.mllib package.
convertMatrixColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts matrix columns in an input Dataset from the new Matrix type under the spark.ml package to the Matrix type under the spark.mllib package.
convertMatrixColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type under the spark.ml package.
convertMatrixColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts Matrix columns in an input Dataset from the Matrix type to the new Matrix type under the spark.ml package.
convertToCanonicalEdges(Function2<ED, ED, ED>) - Method in class org.apache.spark.graphx.GraphOps
Convert bi-directional edges into uni-directional ones.
convertToTimeUnit(long, TimeUnit) - Static method in class org.apache.spark.streaming.ui.UIUtils
Convert milliseconds to the specified unit.
convertVectorColumnsFromML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts vector columns in an input Dataset from the new Vector type under the spark.ml package to the Vector type under the spark.mllib package.
convertVectorColumnsFromML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts vector columns in an input Dataset from the new Vector type under the spark.ml package to the Vector type under the spark.mllib package.
convertVectorColumnsToML(Dataset<?>, String...) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts vector columns in an input Dataset from the Vector type to the new Vector type under the spark.ml package.
convertVectorColumnsToML(Dataset<?>, Seq<String>) - Static method in class org.apache.spark.mllib.util.MLUtils
Converts vector columns in an input Dataset from the Vector type to the new Vector type under the spark.ml package.
CoordinateMatrix - Class in org.apache.spark.mllib.linalg.distributed
Represents a matrix in coordinate format.
CoordinateMatrix(RDD<MatrixEntry>, long, long) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
 
CoordinateMatrix(RDD<MatrixEntry>) - Constructor for class org.apache.spark.mllib.linalg.distributed.CoordinateMatrix
Alternative constructor leaving matrix dimensions to be determined automatically.
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassificationModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.DecisionTreeClassifier
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassificationModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.GBTClassifier
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVC
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LinearSVCModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegression
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.LogisticRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassificationModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.MultilayerPerceptronClassifier
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayes
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.NaiveBayesModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRest
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.OneVsRestModel
 
copy(ParamMap) - Static method in class org.apache.spark.ml.classification.ProbabilisticClassificationModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassificationModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.classification.RandomForestClassifier
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeans
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.BisectingKMeansModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.DistributedLDAModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixture
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.GaussianMixtureModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeans
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.KMeansModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LDA
 
copy(ParamMap) - Method in class org.apache.spark.ml.clustering.LocalLDAModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.Estimator
 
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
 
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.Evaluator
 
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
 
copy(ParamMap) - Method in class org.apache.spark.ml.evaluation.RegressionEvaluator
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Binarizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSH
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.BucketedRandomProjectionLSHModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Bucketizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelector
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ChiSqSelectorModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ColumnPruner
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.CountVectorizerModel
 
copy(ParamMap) - Static method in class org.apache.spark.ml.feature.DCT
 
copy(ParamMap) - Static method in class org.apache.spark.ml.feature.ElementwiseProduct
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.HashingTF
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDF
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IDFModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Imputer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.ImputerModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.IndexToString
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Interaction
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScaler
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MaxAbsScalerModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSH
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinHashLSHModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScaler
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.MinMaxScalerModel
 
copy(ParamMap) - Static method in class org.apache.spark.ml.feature.NGram
 
copy(ParamMap) - Static method in class org.apache.spark.ml.feature.Normalizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.OneHotEncoder
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCA
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PCAModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.PolynomialExpansion
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.QuantileDiscretizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RegexTokenizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormula
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.RFormulaModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.SQLTransformer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScaler
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StandardScalerModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StopWordsRemover
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.StringIndexerModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Tokenizer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAssembler
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorAttributeRewriter
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorIndexerModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.VectorSlicer
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2Vec
 
copy(ParamMap) - Method in class org.apache.spark.ml.feature.Word2VecModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowth
 
copy(ParamMap) - Method in class org.apache.spark.ml.fpm.FPGrowthModel
 
copy(Vector, Vector) - Static method in class org.apache.spark.ml.linalg.BLAS
y = x
copy() - Method in class org.apache.spark.ml.linalg.DenseMatrix
 
copy() - Method in class org.apache.spark.ml.linalg.DenseVector
 
copy() - Method in interface org.apache.spark.ml.linalg.Matrix
Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.ml.linalg.SparseMatrix
 
copy() - Method in class org.apache.spark.ml.linalg.SparseVector
 
copy() - Method in interface org.apache.spark.ml.linalg.Vector
Makes a deep copy of this vector.
copy(ParamMap) - Method in class org.apache.spark.ml.Model
 
copy() - Method in class org.apache.spark.ml.param.ParamMap
Creates a copy of this param map.
copy(ParamMap) - Method in interface org.apache.spark.ml.param.Params
Creates a copy of this instance with the same UID and some extra params.
copy(ParamMap) - Method in class org.apache.spark.ml.Pipeline
 
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.PipelineStage
 
copy(ParamMap) - Method in class org.apache.spark.ml.Predictor
 
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALS
 
copy(ParamMap) - Method in class org.apache.spark.ml.recommendation.ALSModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegression
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.AFTSurvivalRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.DecisionTreeRegressor
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GBTRegressor
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegression
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.GeneralizedLinearRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegression
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.IsotonicRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegression
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.LinearRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressionModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.regression.RandomForestRegressor
 
copy(ParamMap) - Method in class org.apache.spark.ml.Transformer
 
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidator
 
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.CrossValidatorModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplit
 
copy(ParamMap) - Method in class org.apache.spark.ml.tuning.TrainValidationSplitModel
 
copy(ParamMap) - Method in class org.apache.spark.ml.UnaryTransformer
 
copy(Vector, Vector) - Static method in class org.apache.spark.mllib.linalg.BLAS
y = x
copy() - Method in class org.apache.spark.mllib.linalg.DenseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.DenseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Matrix
Get a deep copy of the matrix.
copy() - Method in class org.apache.spark.mllib.linalg.SparseMatrix
 
copy() - Method in class org.apache.spark.mllib.linalg.SparseVector
 
copy() - Method in interface org.apache.spark.mllib.linalg.Vector
Makes a deep copy of this vector.
copy() - Method in class org.apache.spark.mllib.random.ExponentialGenerator
 
copy() - Method in class org.apache.spark.mllib.random.GammaGenerator
 
copy() - Method in class org.apache.spark.mllib.random.LogNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.PoissonGenerator
 
copy() - Method in interface org.apache.spark.mllib.random.RandomDataGenerator
Returns a copy of the RandomDataGenerator with a new instance of the rng object used in the class, when applicable, for non-locking concurrent usage.
copy() - Method in class org.apache.spark.mllib.random.StandardNormalGenerator
 
copy() - Method in class org.apache.spark.mllib.random.UniformGenerator
 
copy() - Method in class org.apache.spark.mllib.random.WeibullGenerator
 
copy() - Method in class org.apache.spark.mllib.tree.configuration.Strategy
Returns a shallow copy of this instance.
copy(Kryo, T) - Static method in class org.apache.spark.serializer.JavaIterableWrapperSerializer
 
copy() - Method in interface org.apache.spark.sql.Row
Make a copy of the current Row object.
copy() - Method in class org.apache.spark.util.AccumulatorV2
Creates a new copy of this accumulator.
copy() - Method in class org.apache.spark.util.CollectionAccumulator
 
copy() - Method in class org.apache.spark.util.DoubleAccumulator
 
copy() - Method in class org.apache.spark.util.LegacyAccumulatorWrapper
 
copy() - Method in class org.apache.spark.util.LongAccumulator
 
copy() - Method in class org.apache.spark.util.StatCounter
Clone this StatCounter.
copyAndReset() - Method in class org.apache.spark.util.AccumulatorV2
Creates a new copy of this accumulator that holds the zero value.
copyAndReset() - Method in class org.apache.spark.util.CollectionAccumulator
 
copyFileStreamNIO(FileChannel, FileChannel, long, long) - Static method in class org.apache.spark.util.Utils
 
copyStream(InputStream, OutputStream, boolean, boolean) - Static method in class org.apache.spark.util.Utils
Copy all data from an InputStream to an OutputStream.
copyToArray(Object, int) - Static method in class org.apache.spark.sql.types.StructType
 
copyToArray(Object) - Static method in class org.apache.spark.sql.types.StructType
 
copyToArray(Object, int, int) - Static method in class org.apache.spark.sql.types.StructType
 
copyToBuffer(Buffer<B>) - Static method in class org.apache.spark.sql.types.StructType
 
copyValues(T, ParamMap) - Method in interface org.apache.spark.ml.param.Params
Copies param values from this instance to another instance for params shared by them.
cores() - Method in class org.apache.spark.scheduler.cluster.CoarseGrainedClusterMessages.RegisterExecutor
 
coresGranted() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
 
coresPerExecutor() - Method in class org.apache.spark.status.api.v1.ApplicationInfo
 
corr(Dataset<?>, String, String) - Static method in class org.apache.spark.ml.stat.Correlation
:: Experimental :: Compute the correlation matrix for the input Dataset of Vectors using the specified method.
corr(Dataset<?>, String) - Static method in class org.apache.spark.ml.stat.Correlation
Compute the Pearson correlation matrix for the input Dataset of Vectors.
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
corr(RDD<Vector>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation matrix for the input RDD of Vectors.
corr(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation matrix for the input RDD of Vectors using the specified method.
corr(RDD<Object>, RDD<Object>) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the Pearson correlation for the input RDDs.
corr(JavaRDD<Double>, JavaRDD<Double>) - Static method in class org.apache.spark.mllib.stat.Statistics
Java-friendly version of corr().
corr(RDD<Object>, RDD<Object>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Compute the correlation for the input RDDs using the specified method.
corr(JavaRDD<Double>, JavaRDD<Double>, String) - Static method in class org.apache.spark.mllib.stat.Statistics
Java-friendly version of corr().
corr(String, String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Calculates the correlation of two columns of a DataFrame.
corr(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Calculates the Pearson Correlation Coefficient of two columns of a DataFrame.
corr(Column, Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
corr(String, String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
Correlation - Class in org.apache.spark.ml.stat
API for correlation functions in MLlib, compatible with DataFrames and Datasets.
Correlation() - Constructor for class org.apache.spark.ml.stat.Correlation
 
CorrelationNames - Class in org.apache.spark.mllib.stat.correlation
Maintains supported and default correlation names.
CorrelationNames() - Constructor for class org.apache.spark.mllib.stat.correlation.CorrelationNames
 
Correlations - Class in org.apache.spark.mllib.stat.correlation
Delegates computation to the specific correlation object based on the input method name.
Correlations() - Constructor for class org.apache.spark.mllib.stat.correlation.Correlations
 
corresponds(GenSeq<B>, Function2<A, B, Object>) - Static method in class org.apache.spark.sql.types.StructType
 
corrMatrix(RDD<Vector>, String) - Static method in class org.apache.spark.mllib.stat.correlation.Correlations
 
cos(Column) - Static method in class org.apache.spark.sql.functions
Computes the cosine of the given value.
cos(String) - Static method in class org.apache.spark.sql.functions
Computes the cosine of the given column.
cosh(Column) - Static method in class org.apache.spark.sql.functions
Computes the hyperbolic cosine of the given value.
cosh(String) - Static method in class org.apache.spark.sql.functions
Computes the hyperbolic cosine of the given column.
count() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
count() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
count() - Static method in class org.apache.spark.api.java.JavaRDD
 
count() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the number of elements in the RDD.
count() - Static method in class org.apache.spark.api.r.RRDD
 
count() - Static method in class org.apache.spark.graphx.EdgeRDD
 
count() - Method in class org.apache.spark.graphx.impl.EdgeRDDImpl
The number of edges in the RDD.
count() - Method in class org.apache.spark.graphx.impl.VertexRDDImpl
The number of vertices in the RDD.
count() - Static method in class org.apache.spark.graphx.VertexRDD
 
count() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
 
count() - Method in class org.apache.spark.ml.regression.AFTAggregator
 
count() - Method in class org.apache.spark.ml.regression.LeastSquaresAggregator
 
count() - Method in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer
Sample size.
count() - Method in interface org.apache.spark.mllib.stat.MultivariateStatisticalSummary
Sample size.
count() - Static method in class org.apache.spark.rdd.HadoopRDD
 
count() - Static method in class org.apache.spark.rdd.JdbcRDD
 
count() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
count() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
count() - Method in class org.apache.spark.rdd.RDD
Return the number of elements in the RDD.
count() - Static method in class org.apache.spark.rdd.UnionRDD
 
count() - Method in class org.apache.spark.sql.Dataset
Returns the number of rows in the Dataset.
count(MapFunction<T, Object>) - Static method in class org.apache.spark.sql.expressions.javalang.typed
Count aggregate function.
count(Function1<IN, Object>) - Static method in class org.apache.spark.sql.expressions.scalalang.typed
Count aggregate function.
count(Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count(String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of items in a group.
count() - Method in class org.apache.spark.sql.KeyValueGroupedDataset
Returns a Dataset that contains a tuple with each key and the number of items present for that key.
count() - Method in class org.apache.spark.sql.RelationalGroupedDataset
Count the number of rows for each group.
count(Function1<A, Object>) - Static method in class org.apache.spark.sql.types.StructType
 
count() - Method in class org.apache.spark.storage.ReadableChannelFileRegion
 
count() - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
count() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
count() - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
count() - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
count() - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
count() - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
count() - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting each RDD of this DStream.
count() - Method in class org.apache.spark.streaming.kafka.OffsetRange
Number of messages this OffsetRange refers to.
count() - Method in class org.apache.spark.util.DoubleAccumulator
Returns the number of elements added to the accumulator.
count() - Method in class org.apache.spark.util.LongAccumulator
Returns the number of elements added to the accumulator.
count() - Method in class org.apache.spark.util.StatCounter
 
countApprox(long, double) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countApprox(long) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countApprox(long, double) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countApprox(long) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countApprox(long, double) - Static method in class org.apache.spark.api.java.JavaRDD
 
countApprox(long) - Static method in class org.apache.spark.api.java.JavaRDD
 
countApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Static method in class org.apache.spark.api.r.RRDD
 
countApprox(long, double) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countApprox(long, double) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countApprox(long, double) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countApprox(long, double) - Static method in class org.apache.spark.graphx.VertexRDD
 
countApprox(long, double) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countApprox(long, double) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countApprox(long, double) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countApprox(long, double) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countApprox(long, double) - Method in class org.apache.spark.rdd.RDD
Approximate version of count() that returns a potentially incomplete result within a timeout, even if not all tasks have finished.
countApprox(long, double) - Static method in class org.apache.spark.rdd.UnionRDD
 
countApprox$default$2() - Static method in class org.apache.spark.api.r.RRDD
 
countApprox$default$2() - Static method in class org.apache.spark.graphx.EdgeRDD
 
countApprox$default$2() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countApprox$default$2() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countApprox$default$2() - Static method in class org.apache.spark.graphx.VertexRDD
 
countApprox$default$2() - Static method in class org.apache.spark.rdd.HadoopRDD
 
countApprox$default$2() - Static method in class org.apache.spark.rdd.JdbcRDD
 
countApprox$default$2() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countApprox$default$2() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countApprox$default$2() - Static method in class org.apache.spark.rdd.UnionRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.api.java.JavaRDD
 
countApproxDistinct(double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Static method in class org.apache.spark.api.r.RRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.api.r.RRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countApproxDistinct(double) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countApproxDistinct(double) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.graphx.VertexRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.graphx.VertexRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countApproxDistinct(int, int) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countApproxDistinct(int, int) - Method in class org.apache.spark.rdd.RDD
Return the approximate number of distinct elements in the RDD.
countApproxDistinct(double) - Method in class org.apache.spark.rdd.RDD
Return the approximate number of distinct elements in the RDD.
countApproxDistinct(int, int) - Static method in class org.apache.spark.rdd.UnionRDD
 
countApproxDistinct(double) - Static method in class org.apache.spark.rdd.UnionRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.api.r.RRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.graphx.EdgeRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.graphx.VertexRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.rdd.HadoopRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.rdd.JdbcRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countApproxDistinct$default$1() - Static method in class org.apache.spark.rdd.UnionRDD
 
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.api.java.JavaPairRDD
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(int, int, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, Partitioner) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double, int) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countApproxDistinctByKey(double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Return approximate number of distinct values for each key in this RDD.
countAsync() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countAsync() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countAsync() - Static method in class org.apache.spark.api.java.JavaRDD
 
countAsync() - Method in interface org.apache.spark.api.java.JavaRDDLike
The asynchronous version of count, which returns a future for counting the number of elements in this RDD.
countAsync() - Method in class org.apache.spark.rdd.AsyncRDDActions
Returns a future for counting the number of elements in the RDD.
countByKey() - Method in class org.apache.spark.api.java.JavaPairRDD
Count the number of elements for each key, and return the result to the master as a Map.
countByKey() - Method in class org.apache.spark.rdd.PairRDDFunctions
Count the number of elements for each key, collecting the results to a local Map.
countByKeyApprox(long) - Method in class org.apache.spark.api.java.JavaPairRDD
Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.api.java.JavaPairRDD
Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByKeyApprox(long, double) - Method in class org.apache.spark.rdd.PairRDDFunctions
Approximate version of countByKey that can return a partial result if it does not finish within a timeout.
countByValue() - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countByValue() - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countByValue() - Static method in class org.apache.spark.api.java.JavaRDD
 
countByValue() - Method in interface org.apache.spark.api.java.JavaRDDLike
Return the count of each unique value in this RDD as a map of (value, count) pairs.
countByValue(Ordering<T>) - Static method in class org.apache.spark.api.r.RRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.graphx.VertexRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countByValue(Ordering<T>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countByValue(Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Return the count of each unique value in this RDD as a local map of (value, count) pairs.
countByValue(Ordering<T>) - Static method in class org.apache.spark.rdd.UnionRDD
 
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
countByValue() - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue(int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
countByValue() - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
countByValue(int) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
countByValue(int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the counts of each distinct value in each RDD of this DStream.
countByValue$default$1() - Static method in class org.apache.spark.api.r.RRDD
 
countByValue$default$1() - Static method in class org.apache.spark.graphx.EdgeRDD
 
countByValue$default$1() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countByValue$default$1() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countByValue$default$1() - Static method in class org.apache.spark.graphx.VertexRDD
 
countByValue$default$1() - Static method in class org.apache.spark.rdd.HadoopRDD
 
countByValue$default$1() - Static method in class org.apache.spark.rdd.JdbcRDD
 
countByValue$default$1() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countByValue$default$1() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countByValue$default$1() - Static method in class org.apache.spark.rdd.UnionRDD
 
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
countByValueAndWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration, int) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
countByValueAndWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
countByValueAndWindow(Duration, Duration, int) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
countByValueAndWindow(Duration, Duration, int, Ordering<T>) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD contains the count of distinct elements in RDDs in a sliding window over this DStream.
countByValueApprox(long, double) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countByValueApprox(long) - Static method in class org.apache.spark.api.java.JavaDoubleRDD
 
countByValueApprox(long, double) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countByValueApprox(long) - Static method in class org.apache.spark.api.java.JavaPairRDD
 
countByValueApprox(long, double) - Static method in class org.apache.spark.api.java.JavaRDD
 
countByValueApprox(long) - Static method in class org.apache.spark.api.java.JavaRDD
 
countByValueApprox(long, double) - Method in interface org.apache.spark.api.java.JavaRDDLike
Approximate version of countByValue().
countByValueApprox(long) - Method in interface org.apache.spark.api.java.JavaRDDLike
Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.api.r.RRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.graphx.VertexRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countByValueApprox(long, double, Ordering<T>) - Method in class org.apache.spark.rdd.RDD
Approximate version of countByValue().
countByValueApprox(long, double, Ordering<T>) - Static method in class org.apache.spark.rdd.UnionRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.api.r.RRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.graphx.EdgeRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countByValueApprox$default$2() - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countByValueApprox$default$2() - Static method in class org.apache.spark.graphx.VertexRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.rdd.HadoopRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.rdd.JdbcRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countByValueApprox$default$2() - Static method in class org.apache.spark.rdd.UnionRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.api.r.RRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.graphx.EdgeRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.graphx.impl.EdgeRDDImpl
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.graphx.impl.VertexRDDImpl
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.graphx.VertexRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.rdd.HadoopRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.rdd.JdbcRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.rdd.NewHadoopRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
 
countByValueApprox$default$3(long, double) - Static method in class org.apache.spark.rdd.UnionRDD
 
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaDStream
 
countByWindow(Duration, Duration) - Method in interface org.apache.spark.streaming.api.java.JavaDStreamLike
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a window over this DStream.
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaInputDStream
 
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairDStream
 
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairInputDStream
 
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream
 
countByWindow(Duration, Duration) - Static method in class org.apache.spark.streaming.api.java.JavaReceiverInputDStream
 
countByWindow(Duration, Duration) - Method in class org.apache.spark.streaming.dstream.DStream
Return a new DStream in which each RDD has a single element generated by counting the number of elements in a sliding window over this DStream.
countDistinct(Column, Column...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, String...) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(Column, Seq<Column>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
countDistinct(String, Seq<String>) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the number of distinct items in a group.
CountingWritableChannel - Class in org.apache.spark.storage
 
CountingWritableChannel(WritableByteChannel) - Constructor for class org.apache.spark.storage.CountingWritableChannel
 
countMinSketch(String, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Count-min Sketch over a specified column.
countMinSketch(String, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Count-min Sketch over a specified column.
countMinSketch(Column, int, int, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Count-min Sketch over a specified column.
countMinSketch(Column, double, double, int) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Builds a Count-min Sketch over a specified column.
CountMinSketch - Class in org.apache.spark.util.sketch
A Count-min sketch is a probabilistic data structure used for frequency estimation using sub-linear space.
CountMinSketch() - Constructor for class org.apache.spark.util.sketch.CountMinSketch
 
CountMinSketch.Version - Enum in org.apache.spark.util.sketch
 
countTowardsTaskFailures() - Static method in class org.apache.spark.ExceptionFailure
 
countTowardsTaskFailures() - Method in class org.apache.spark.ExecutorLostFailure
 
countTowardsTaskFailures() - Method in class org.apache.spark.FetchFailed
Fetch failures lead to a different failure handling path: (1) we don't abort the stage after 4 task failures; instead, we immediately go back to the stage that generated the map output and regenerate the missing data.
countTowardsTaskFailures() - Static method in class org.apache.spark.Resubmitted
 
countTowardsTaskFailures() - Method in class org.apache.spark.TaskCommitDenied
If a task failed because its attempt to commit was denied, do not count this failure towards failing the stage.
countTowardsTaskFailures() - Method in interface org.apache.spark.TaskFailedReason
Whether this task failure should be counted towards the maximum number of times the task is allowed to fail before the stage is aborted.
countTowardsTaskFailures() - Method in class org.apache.spark.TaskKilled
 
countTowardsTaskFailures() - Static method in class org.apache.spark.TaskResultLost
 
countTowardsTaskFailures() - Static method in class org.apache.spark.UnknownReason
 
CountVectorizer - Class in org.apache.spark.ml.feature
Extracts a vocabulary from document collections and generates a CountVectorizerModel.
CountVectorizer(String) - Constructor for class org.apache.spark.ml.feature.CountVectorizer
 
CountVectorizer() - Constructor for class org.apache.spark.ml.feature.CountVectorizer
 
CountVectorizerModel - Class in org.apache.spark.ml.feature
Converts a text document to a sparse vector of token counts.
CountVectorizerModel(String, String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
 
CountVectorizerModel(String[]) - Constructor for class org.apache.spark.ml.feature.CountVectorizerModel
 
cov() - Method in class org.apache.spark.ml.stat.distribution.MultivariateGaussian
 
cov(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Calculate the sample covariance of two numerical columns of a DataFrame.
covar_pop(Column, Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the population covariance for two columns.
covar_pop(String, String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the population covariance for two columns.
covar_samp(Column, Column) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sample covariance for two columns.
covar_samp(String, String) - Static method in class org.apache.spark.sql.functions
Aggregate function: returns the sample covariance for two columns.
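The difference between the population covariance (covar_pop) and the sample covariance (covar_samp) is only the divisor: n versus n - 1. The following is a minimal plain-Java sketch of the two formulas, not Spark's implementation; the class and method names are hypothetical and chosen for illustration.

```java
// Hypothetical helper illustrating the two covariance formulas on local arrays.
final class CovarianceSketch {

    /** Arithmetic mean of the values. */
    static double mean(double[] v) {
        double sum = 0.0;
        for (double d : v) {
            sum += d;
        }
        return sum / v.length;
    }

    /** Sum over i of (x[i] - mean(x)) * (y[i] - mean(y)). */
    static double sumOfCoDeviations(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double mx = mean(x);
        double my = mean(y);
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - mx) * (y[i] - my);
        }
        return sum;
    }

    /** Population covariance: divide by n. */
    static double populationCovariance(double[] x, double[] y) {
        return sumOfCoDeviations(x, y) / x.length;
    }

    /** Sample covariance: divide by n - 1, so at least two points are required. */
    static double sampleCovariance(double[] x, double[] y) {
        if (x.length < 2) {
            throw new IllegalArgumentException("sample covariance needs at least two points");
        }
        return sumOfCoDeviations(x, y) / (x.length - 1);
    }
}
```

For x = {1, 2, 3, 4} and y = {2, 4, 6, 8} the co-deviations sum to 10, so the population covariance is 2.5 and the sample covariance is 10 / 3.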
covs() - Method in class org.apache.spark.ml.clustering.ExpectationAggregator
 
crc32(Column) - Static method in class org.apache.spark.sql.functions
Calculates the cyclic redundancy check value (CRC32) of a binary column and returns the value as a bigint.
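A CRC32 checksum is an unsigned 32-bit value, which is why it is returned as a bigint (a Java long) rather than an int. The sketch below computes the checksum of a byte array with the JDK's java.util.zip.CRC32; it assumes the standard CRC-32 polynomial and does not call Spark. The class name is hypothetical.

```java
import java.util.zip.CRC32;

// Hypothetical helper: computes a standard CRC-32 checksum with the JDK.
final class Crc32Sketch {

    /** Returns the CRC-32 of the bytes as a non-negative long in [0, 2^32 - 1]. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);        // feed the whole array into the checksum
        return crc.getValue();   // unsigned 32-bit result widened to long
    }
}
```

The ASCII bytes of "123456789" are the conventional check input for CRC-32 and give 0xCBF43926.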
CreatableRelationProvider - Interface in org.apache.spark.sql.sources
 
create(boolean, boolean, boolean, boolean, int) - Static method in class org.apache.spark.api.java.StorageLevels
Create a new StorageLevel object.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int, Function<ResultSet, T>) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes a SQL query on a JDBC connection and reads results.
create(JavaSparkContext, JdbcRDD.ConnectionFactory, String, long, long, int) - Static method in class org.apache.spark.rdd.JdbcRDD
Create an RDD that executes a SQL query on a JDBC connection and reads results.
create(RDD<T>, Function1<Object, Object>) - Static method in class org.apache.spark.rdd.PartitionPruningRDD
Create a PartitionPruningRDD.
create(Object...) - Static method in class org.apache.spark.sql.RowFactory
Create a Row from the given arguments.
create(String) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
Deprecated.
use Trigger.ProcessingTime(interval)
create(long, TimeUnit) - Static method in class org.apache.spark.sql.streaming.ProcessingTime
Deprecated.
use Trigger.ProcessingTime(interval, unit)
create(String, int) - Static method in class org.apache.spark.streaming.kafka.Broker
 
create(String, int, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
create(TopicAndPartition, long, long) - Static method in class org.apache.spark.streaming.kafka.OffsetRange
 
create(long) - Static method in class org.apache.spark.util.sketch.BloomFilter
Creates a BloomFilter with the expected number of insertions and a default expected false positive probability of 3%.
create(long, double) - Static method in class org.apache.spark.util.sketch.BloomFilter
Creates a BloomFilter with the expected number of insertions and expected false positive probability.
create(long, long) - Static method in class org.apache.spark.util.sketch.BloomFilter
Creates a BloomFilter with the given expectedNumItems and numBits; it picks the optimal numHashFunctions, which minimizes the fpp of the Bloom filter.
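The relationship between expectedNumItems, numBits, numHashFunctions, and the false positive probability (fpp) follows the textbook Bloom filter formulas. The sketch below shows those formulas in plain Java; the class and method names are hypothetical, and the exact rounding Spark applies is an assumption, not something this index states.

```java
// Hypothetical helper showing the textbook Bloom filter sizing formulas.
final class BloomSizingSketch {

    /** Bits needed for n items at false positive probability fpp: m = -n ln(p) / (ln 2)^2. */
    static long optimalNumBits(long expectedNumItems, double fpp) {
        if (expectedNumItems <= 0 || fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 < fpp < 1");
        }
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedNumItems * Math.log(fpp) / (ln2 * ln2));
    }

    /** Hash function count that minimizes fpp for n items in m bits: k = (m / n) ln 2, at least 1. */
    static int optimalNumHashFunctions(long expectedNumItems, long numBits) {
        if (expectedNumItems <= 0 || numBits <= 0) {
            throw new IllegalArgumentException("need n > 0 and m > 0");
        }
        double k = (double) numBits / expectedNumItems * Math.log(2);
        return Math.max(1, (int) Math.round(k));
    }
}
```

For example, 1000 expected items in 10000 bits give (10000 / 1000) * ln 2, roughly 6.93, which rounds to 7 hash functions.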
create(int, int, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch
Creates a CountMinSketch with given depth, width, and random seed.
create(double, double, int) - Static method in class org.apache.spark.util.sketch.CountMinSketch
Creates a CountMinSketch with given relative error (eps), confidence, and random seed.
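To illustrate how depth, width, and seed relate to the relative error (eps) and confidence, here is a minimal Count-min sketch in plain Java. It is a sketch under stated assumptions, not Spark's implementation: the class name, the hash family, and the parameter mapping (width = ceil(2 / eps), depth = ceil(log2(1 / (1 - confidence)))) are illustrative choices.

```java
import java.util.Random;

// Hypothetical minimal Count-min sketch over long items.
final class CountMinSketchExample {
    private static final long PRIME = 2147483647L; // 2^31 - 1

    private final int depth;
    private final int width;
    private final long[][] table; // depth rows of width counters
    private final long[] a;       // per-row hash multipliers
    private final long[] b;       // per-row hash offsets

    CountMinSketchExample(int depth, int width, int seed) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.table = new long[depth][width];
        this.a = new long[depth];
        this.b = new long[depth];
        Random rnd = new Random(seed); // same seed gives the same hash functions
        for (int i = 0; i < depth; i++) {
            a[i] = 1 + rnd.nextInt((int) (PRIME - 1)); // in [1, PRIME - 1]
            b[i] = rnd.nextInt((int) PRIME);           // in [0, PRIME - 1]
        }
    }

    /** Assumed mapping: each row errs by more than eps * total with probability at most 1/2. */
    static CountMinSketchExample fromErrorBounds(double eps, double confidence, int seed) {
        int width = (int) Math.ceil(2.0 / eps);
        int depth = (int) Math.ceil(-Math.log1p(-confidence) / Math.log(2));
        return new CountMinSketchExample(depth, width, seed);
    }

    private int bucket(int row, long item) {
        long x = Long.hashCode(item) & 0x7fffffffL; // non-negative 31-bit value
        return (int) (((a[row] * x + b[row]) % PRIME) % width);
    }

    /** Increments one counter per row. */
    void add(long item) {
        for (int row = 0; row < depth; row++) {
            table[row][bucket(row, item)]++;
        }
    }

    /** Minimum over rows; never below the true count, possibly above it due to collisions. */
    long estimateCount(long item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, table[row][bucket(row, item)]);
        }
        return min;
    }

    int depth() { return depth; }
    int width() { return width; }
}
```

Because every row only ever over-counts, the estimate is bounded below by the true frequency and above by the total number of items added.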
createArrayType(DataType) - Static method in class org.apache.spark.sql.types.DataTypes
Creates an ArrayType by specifying the data type of elements (elementType).
createArrayType(DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates an ArrayType by specifying the data type of elements (elementType) and whether the array contains null values (containsNull).
createCombiner() - Method in class org.apache.spark.Aggregator
 
createCompiledClass(String, File, TestUtils.JavaSourceFromString, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Creates a compiled class with the source file.
createCompiledClass(String, File, String, String, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Creates a compiled class with the given name.
createCryptoInputStream(InputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
Helper method to wrap InputStream with CryptoInputStream for decryption.
createCryptoOutputStream(OutputStream, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
Helper method to wrap OutputStream with CryptoOutputStream for encryption.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession
:: Experimental :: Creates a DataFrame from an RDD of Product.
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SparkSession
:: Experimental :: Creates a DataFrame from a local Seq of Product.
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
:: DeveloperApi :: Creates a DataFrame from an RDD containing Rows using the given schema.
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
:: DeveloperApi :: Creates a DataFrame from a JavaRDD containing Rows using the given schema.
createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SparkSession
:: DeveloperApi :: Creates a DataFrame from a java.util.List containing Rows using the given schema.
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
Applies a schema to an RDD of Java Beans.
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
Applies a schema to an RDD of Java Beans.
createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SparkSession
Applies a schema to a List of Java Beans.
createDataFrame(RDD<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(Seq<A>, TypeTags.TypeTag<A>) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(RDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(JavaRDD<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(List<Row>, StructType) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(RDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(JavaRDD<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
 
createDataFrame(List<?>, Class<?>) - Method in class org.apache.spark.sql.SQLContext
 
createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
:: Experimental :: Creates a Dataset from a local Seq of data of a given type.
createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
:: Experimental :: Creates a Dataset from an RDD of a given type.
createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SparkSession
:: Experimental :: Creates a Dataset from a java.util.List of a given type.
createDataset(Seq<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
 
createDataset(RDD<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
 
createDataset(List<T>, Encoder<T>) - Method in class org.apache.spark.sql.SQLContext
 
createDecimalType(int, int) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a DecimalType by specifying the precision and scale.
createDecimalType() - Static method in class org.apache.spark.sql.types.DataTypes
Creates a DecimalType with default precision and scale, which are 10 and 0.
createDF(RDD<byte[]>, StructType, SparkSession) - Static method in class org.apache.spark.sql.api.r.SQLUtils
 
createDirectory(String, String) - Static method in class org.apache.spark.util.Utils
Create a directory inside the given parent directory.
createDirectStream(StreamingContext, Map<String, String>, Map<TopicAndPartition, Object>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(StreamingContext, Map<String, String>, Set<String>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, Map<TopicAndPartition, Long>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createDirectStream(JavaStreamingContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, Set<String>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that directly pulls messages from Kafka Brokers without using any receiver.
createdTempDir() - Method in class org.apache.spark.sql.hive.execution.InsertIntoHiveTable
 
createExternalTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
Deprecated.
use createTable instead. Since 2.2.0.
createExternalTable(String, String) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, String) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createExternalTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.SQLContext
Deprecated.
use sparkSession.catalog.createTable instead. Since 2.2.0.
createFilter(StructType, Filter[]) - Static method in class org.apache.spark.sql.hive.orc.OrcFilters
 
createGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset
Creates a global temporary view using the given name.
CreateHiveTableAsSelectCommand - Class in org.apache.spark.sql.hive.execution
Create table and insert the query result into it.
CreateHiveTableAsSelectCommand(CatalogTable, LogicalPlan, SaveMode) - Constructor for class org.apache.spark.sql.hive.execution.CreateHiveTableAsSelectCommand
 
createJar(Seq<File>, File, Option<String>) - Static method in class org.apache.spark.TestUtils
Create a jar file that contains this set of files.
createJarWithClasses(Seq<String>, String, Seq<Tuple2<String, String>>, Seq<URL>) - Static method in class org.apache.spark.TestUtils
Create a jar that defines classes with the given names.
createJarWithFiles(Map<String, String>, File) - Static method in class org.apache.spark.TestUtils
Create a jar file containing multiple files.
createJobID(Date, int) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
 
createJobTrackerID(Date) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
 
createKey(SparkConf) - Static method in class org.apache.spark.security.CryptoStreamUtils
Creates a new encryption key.
createLogForDriver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
Create a WriteAheadLog for the driver.
createLogForReceiver(SparkConf, String, Configuration) - Static method in class org.apache.spark.streaming.util.WriteAheadLogUtils
Create a WriteAheadLog for the receiver.
createMapType(DataType, DataType) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a MapType by specifying the data type of keys (keyType) and values (valueType).
createMapType(DataType, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a MapType by specifying the data type of keys (keyType), the data type of values (valueType), and whether values contain any null value (valueContainsNull).
createOrReplaceGlobalTempView(String) - Method in class org.apache.spark.sql.Dataset
Creates or replaces a global temporary view using the given name.
createOrReplaceTempView(String) - Method in class org.apache.spark.sql.Dataset
Creates or replaces a local temporary view using the given name.
createOutputOperationFailureForUI(String) - Static method in class org.apache.spark.streaming.ui.UIUtils
 
createPathFromString(String, JobConf) - Static method in class org.apache.spark.internal.io.SparkHadoopWriterUtils
 
createPMMLModelExport(Object) - Static method in class org.apache.spark.mllib.pmml.export.PMMLModelExportFactory
Factory object that helps create the necessary PMMLModelExport implementation, taking as input the machine learning model (for example, KMeansModel).
createPollingStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(StreamingContext, Seq<InetSocketAddress>, StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createPollingStream(JavaStreamingContext, InetSocketAddress[], StorageLevel, int, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream that is to be used with the Spark Sink deployed on a Flume agent.
createProxyHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler for proxying requests to Workers and Application Drivers.
createProxyLocationHeader(String, String, HttpServletRequest, URI) - Static method in class org.apache.spark.ui.JettyUtils
 
createProxyURI(String, String, String, String) - Static method in class org.apache.spark.ui.JettyUtils
 
createRDD(SparkContext, Map<String, String>, OffsetRange[], ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(SparkContext, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function1<MessageAndMetadata<K, V>, R>, ClassTag<K>, ClassTag<V>, ClassTag<KD>, ClassTag<VD>, ClassTag<R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Map<String, String>, OffsetRange[]) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDD(JavaSparkContext, Class<K>, Class<V>, Class<KD>, Class<VD>, Class<R>, Map<String, String>, OffsetRange[], Map<TopicAndPartition, Broker>, Function<MessageAndMetadata<K, V>, R>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an RDD from Kafka using offset ranges for each topic and partition.
createRDDFromArray(JavaSparkContext, byte[][]) - Static method in class org.apache.spark.api.r.RRDD
Create an RRDD given a sequence of byte arrays.
createRDDFromFile(JavaSparkContext, String, int) - Static method in class org.apache.spark.api.r.RRDD
Create an RRDD given a temporary file name.
createReadableChannel(ReadableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
Wrap a ReadableByteChannel for decryption.
createRedirectHandler(String, String, Function1<HttpServletRequest, BoxedUnit>, String, Set<String>) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler that always redirects the user to the given path.
createRelation(SQLContext, SaveMode, Map<String, String>, Dataset<Row>) - Method in interface org.apache.spark.sql.sources.CreatableRelationProvider
Saves a DataFrame to a destination (using data source-specific parameters).
createRelation(SQLContext, Map<String, String>) - Method in interface org.apache.spark.sql.sources.RelationProvider
Returns a new base relation with the given parameters.
createRelation(SQLContext, Map<String, String>, StructType) - Method in interface org.apache.spark.sql.sources.SchemaRelationProvider
Returns a new base relation with the given parameters and user defined schema.
createServlet(JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
 
createServletHandler(String, JettyUtils.ServletParams<T>, org.apache.spark.SecurityManager, SparkConf, String, Function1<T, Object>) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix.
createServletHandler(String, HttpServlet, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a context handler that responds to a request with the given path prefix.
createSink(SQLContext, Map<String, String>, Seq<String>, OutputMode) - Method in interface org.apache.spark.sql.sources.StreamSinkProvider
 
createSource(SQLContext, String, Option<StructType>, String, Map<String, String>) - Method in interface org.apache.spark.sql.sources.StreamSourceProvider
 
createSparkContext(String, String, String, String[], Map<Object, Object>, Map<Object, Object>) - Static method in class org.apache.spark.api.r.RRDD
 
createStaticHandler(String, String) - Static method in class org.apache.spark.ui.JettyUtils
Create a handler for serving files from a static directory.
createStream(StreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(StreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Create an input stream from a Flume source.
createStream(JavaStreamingContext, String, int) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(JavaStreamingContext, String, int, StorageLevel, boolean) - Static method in class org.apache.spark.streaming.flume.FlumeUtils
Creates an input stream from a Flume source.
createStream(StreamingContext, String, String, Map<String, Object>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, Map<String, String>, Map<String, Object>, StorageLevel, ClassTag<K>, ClassTag<V>, ClassTag<U>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, String, String, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(JavaStreamingContext, Class<K>, Class<V>, Class<U>, Class<T>, Map<String, String>, Map<String, Integer>, StorageLevel) - Static method in class org.apache.spark.streaming.kafka.KafkaUtils
Create an input stream that pulls messages from Kafka Brokers.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function1<Record, T>, String, String, String, String, String, ClassTag<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(StreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, Function<Record, T>, Class<T>, String, String, String, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, InitialPositionInStream, Duration, StorageLevel, String, String) - Static method in class org.apache.spark.streaming.kinesis.KinesisUtils
Deprecated.
Use KinesisInputDStream.builder instead. Since 2.2.0.
createStream(JavaStreamingContext, String, String, String, String, int, Duration, StorageLevel, String, String, String, String, String) - Method in class org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper
 
createStructField(String, String, boolean) - Static method in class org.apache.spark.sql.api.r.SQLUtils
 
createStructField(String, DataType, boolean, Metadata) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructField by specifying the name (name), data type (dataType) and whether values of this field can be null values (nullable).
createStructField(String, DataType, boolean) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructField with empty metadata.
createStructType(Seq<StructField>) - Static method in class org.apache.spark.sql.api.r.SQLUtils
 
createStructType(List<StructField>) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructType with the given list of StructFields (fields).
createStructType(StructField[]) - Static method in class org.apache.spark.sql.types.DataTypes
Creates a StructType with the given StructField array (fields).
createTable(String, String) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: Creates a table from the given path and returns the corresponding DataFrame.
createTable(String, String, String) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: Creates a table from the given path based on a data source and returns the corresponding DataFrame.
createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: Creates a table based on the dataset in a data source and a set of options.
createTable(String, String, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: (Scala-specific) Creates a table based on the dataset in a data source and a set of options.
createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: Create a table based on the dataset in a data source, a schema and a set of options.
createTable(String, String, StructType, Map<String, String>) - Method in class org.apache.spark.sql.catalog.Catalog
:: Experimental :: (Scala-specific) Create a table based on the dataset in a data source, a schema and a set of options.
createTempDir(String, String) - Static method in class org.apache.spark.util.Utils
Create a temporary directory inside the given parent directory.
createTempView(String) - Method in class org.apache.spark.sql.Dataset
Creates a local temporary view using the given name.
createUnsafe(long, int, int) - Static method in class org.apache.spark.sql.types.Decimal
Creates a decimal from unscaled, precision and scale without checking the bounds.
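The relationship between an unscaled value, a precision and a scale can be illustrated with the JDK's own java.math.BigDecimal. This is only a sketch of the arithmetic, not Spark's Decimal class; the class name UnscaledDecimalSketch and both helper methods are hypothetical.

```java
import java.math.BigDecimal;

public class UnscaledDecimalSketch {
    // Interprets (unscaled, scale) as unscaled * 10^(-scale).
    static BigDecimal fromUnscaled(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    // The kind of bounds check an unchecked factory leaves to the caller:
    // does the unscaled value fit within `precision` decimal digits?
    static boolean fitsPrecision(long unscaled, int precision) {
        return BigDecimal.valueOf(unscaled).precision() <= precision;
    }

    public static void main(String[] args) {
        System.out.println(fromUnscaled(12345L, 2));  // prints 123.45
        System.out.println(fitsPrecision(12345L, 4)); // prints false (5 digits needed)
    }
}
```

A caller of an unchecked factory is expected to know already that the value fits, which is why the check is shown here as a separate step.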
createWorkspace(int) - Static method in class org.apache.spark.mllib.optimization.NNLS
 
createWritableChannel(WritableByteChannel, SparkConf, byte[]) - Static method in class org.apache.spark.security.CryptoStreamUtils
Wrap a WritableByteChannel for encryption.
crossJoin(Dataset<?>) - Method in class org.apache.spark.sql.Dataset
Explicit Cartesian join with another DataFrame.
crosstab(String, String) - Method in class org.apache.spark.sql.DataFrameStatFunctions
Computes a pair-wise frequency table of the given columns.
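A pair-wise frequency table counts how often each combination of values from two columns occurs. The plain-Java sketch below shows the idea on an in-memory array; it does not use DataFrameStatFunctions, and the class name CrosstabSketch and the sample rows are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class CrosstabSketch {
    // counts.get(a).get(b) = number of rows whose first column is a and second column is b.
    static Map<String, Map<String, Long>> crosstab(String[][] rows) {
        Map<String, Map<String, Long>> counts = new TreeMap<>();
        for (String[] row : rows) {
            counts.computeIfAbsent(row[0], k -> new TreeMap<>())
                  .merge(row[1], 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] rows = { {"a", "x"}, {"a", "y"}, {"b", "x"}, {"a", "x"} };
        System.out.println(crosstab(rows)); // prints {a={x=2, y=1}, b={x=1}}
    }
}
```

This sketch simply omits pairs that never occur, such as (b, y); a full contingency table would show those cells as zero.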
CrossValidator - Class in org.apache.spark.ml.tuning
K-fold cross validation performs model selection by splitting the dataset into a set of non-overlapping, randomly partitioned folds which are used as separate training and test datasets. For example, with k=3 folds, K-fold cross validation will generate 3 (training, test) dataset pairs, each of which uses 2/3 of the data for training and 1/3 for testing.
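The fold arithmetic described above can be sketched in plain Java. This is an illustration of non-overlapping folds over row indices, not the CrossValidator implementation; the class name KFoldSketch, the seed and the round-robin assignment are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class KFoldSketch {
    // Shuffles the indices 0..n-1 and deals them into k folds;
    // every index lands in exactly one fold.
    static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < n; i++) idx.add(i);
        Collections.shuffle(idx, new Random(seed));
        List<List<Integer>> out = new ArrayList<>();
        for (int f = 0; f < k; f++) out.add(new ArrayList<>());
        for (int i = 0; i < n; i++) out.get(i % k).add(idx.get(i));
        return out;
    }

    // The training set for fold f is every index outside fold f;
    // fold f itself serves as the test set.
    static List<Integer> trainingFor(List<List<Integer>> folds, int f) {
        List<Integer> train = new ArrayList<>();
        for (int g = 0; g < folds.size(); g++) {
            if (g != f) train.addAll(folds.get(g));
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> fs = folds(9, 3, 42L);
        for (int f = 0; f < fs.size(); f++) {
            // With n=9 and k=3, each pair has 6 training rows and 3 test rows.
            System.out.println("train=" + trainingFor(fs, f).size()
                    + " test=" + fs.get(f).size());
        }
    }
}
```

With 9 rows and k=3, each of the 3 (training, test) pairs uses 6 rows (2/3) for training and 3 rows (1/3) for testing, matching the proportions in the description.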
CrossValidator(String) - Constructor for class org.apache.spark.ml.tuning.CrossValidator
 
CrossValidator() - Constructor for class org.apache.spark.ml.tuning.CrossValidator
 
CrossValidatorModel - Class in org.apache.spark.ml.tuning
CrossValidatorModel contains the model with the highest average cross-validation metric across folds and uses this model to transform input data.
CryptoStreamUtils - Class in org.apache.spark.security
A util class for manipulating IO encryption and decryption streams.
CryptoStreamUtils() - Constructor for class org.apache.spark.security.CryptoStreamUtils
 
csv(String...) - Method in class org.apache.spark.sql.DataFrameReader
Loads CSV files and returns the result as a DataFrame.
csv(String) - Method in class org.apache.spark.sql.DataFrameReader
Loads a CSV file and returns the result as a DataFrame.