# Classes and Interfaces in Apache Spark - 732 results found.
AbsoluteError Class for absolute error loss calculation (for regression).Classorg.apache.spark.mllib.tree.lossApache Spark
Accumulable A data type that can be accumulated, i.e. has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T.Classorg.apache.sparkApache Spark
AccumulableInfo Information about an Accumulable modified during a task or stage.Classorg.apache.spark.schedulerApache Spark
AccumulableInfoClassorg.apache.spark.status.api.v1Apache Spark
AccumulableParamHelper object defining how to accumulate values of a particular type.Interfaceorg.apache.sparkApache Spark
AccumulatorA simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged, i.Classorg.apache.sparkApache Spark
AccumulatorParamA simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value.Interfaceorg.apache.sparkApache Spark
ActorHelper A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for being processed.Interfaceorg.apache.spark.streaming.receiverApache Spark
ActorSupervisorStrategyClassorg.apache.spark.streaming.receiverApache Spark
AFTSurvivalRegression Fit a parametric survival regression model named accelerated failure time (AFT) model ( Spark
AFTSurvivalRegressionModel Model produced by Spark
AggregatedDialectAggregatedDialect can unify multiple dialects into one virtual Dialect.Classorg.apache.spark.sql.jdbcApache Spark
AggregatingEdgeContextClassorg.apache.spark.graphx.implApache Spark
Aggregator A set of functions used to aggregate data.Classorg.apache.sparkApache Spark
AggregatorA base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value.Classorg.apache.spark.sql.expressionsApache Spark
Algo Enum to select the algorithm for the decision tree.Classorg.apache.spark.mllib.tree.configurationApache Spark
AlphaComponent A new component of Spark which may have unstable APIs.Classorg.apache.spark.annotationApache Spark
ALS Alternating Least Squares (ALS) matrix Spark
ALSClassorg.apache.spark.mllib.recommendationApache Spark
ALSModel Model fitted by Spark
AnalysisException Thrown when a query fails to analyze, usually because the query itself is invalid.Classorg.apache.spark.sqlApache Spark
And A filter that evaluates to true iff both left and right evaluate to true.Classorg.apache.spark.sql.sourcesApache Spark
ApplicationAttemptInfoClassorg.apache.spark.status.api.v1Apache Spark
ApplicationInfoClassorg.apache.spark.status.api.v1Apache Spark
ApplicationStatus Enum.Classorg.apache.spark.status.api.v1Apache Spark
ArrayTypeClassorg.apache.spark.sql.typesApache Spark
AskPermissionToCommitOutputClassorg.apache.spark.schedulerApache Spark
AssociationRules Generates association rules from an RDD[FreqItemset[Item]].Classorg.apache.spark.mllib.fpmApache Spark
AsyncRDDActionsA set of asynchronous RDD actions available through an implicit conversion.Classorg.apache.spark.rddApache Spark
Attribute Abstract class for ML Spark
AttributeGroup Attributes that describe a vector ML Spark
AttributeType An enum-like type for attribute types: AttributeType$ Spark
BaseRelation Represents a collection of tuples with a known schema.Classorg.apache.spark.sql.sourcesApache Spark
BaseRRDDClassorg.apache.spark.api.rApache Spark
BatchInfo Class having information on completed batches.Classorg.apache.spark.streaming.schedulerApache Spark
BernoulliCellSampler A sampler based on Bernoulli trials for partitioning a data sequence.Classorg.apache.spark.util.randomApache Spark
BernoulliSampler A sampler based on Bernoulli trials.Classorg.apache.spark.util.randomApache Spark
Binarizer Binarize a column of continuous features given a Spark
BinaryAttribute A binary Spark
BinaryClassificationEvaluator Evaluator for binary classification, which expects two input columns: rawPrediction and Spark
BinaryClassificationMetricsEvaluator for binary classification.Classorg.apache.spark.mllib.evaluationApache Spark
BinaryLogisticRegressionSummary Binary Logistic regression results for a given Spark
BinaryLogisticRegressionTrainingSummary Logistic regression training Spark
BinarySampleClass that represents the group and value of a sample.Classorg.apache.spark.mllib.stat.testApache Spark
BinaryType The data type representing Array[Byte] values.Classorg.apache.spark.sql.typesApache Spark
BisectingKMeansA bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark.Classorg.apache.spark.mllib.clusteringApache Spark
BisectingKMeansModelClustering model produced by BisectingKMeans.Classorg.apache.spark.mllib.clusteringApache Spark
BlockId Identifies a particular Block of data, usually associated with a single file.Classorg.apache.spark.storageApache Spark
BlockManagerId This class represents a unique identifier for a BlockManager.Classorg.apache.spark.storageApache Spark
BlockMatrixRepresents a distributed matrix in blocks of local matrices.Classorg.apache.spark.mllib.linalg.distributedApache Spark
BlockNotFoundExceptionClassorg.apache.spark.storageApache Spark
BlockStatusClassorg.apache.spark.storageApache Spark
BlockUpdatedInfo Stores information about a block status in a block manager.Classorg.apache.spark.storageApache Spark
BooleanParam Specialized version of Param[Boolean] for Spark
BooleanType The data type representing Boolean values.Classorg.apache.spark.sql.typesApache Spark
BoostingStrategyConfiguration options for GradientBoostedTrees.Classorg.apache.spark.mllib.tree.configurationApache Spark
BoundedDoubleA Double value with error bars and associated confidence.Classorg.apache.spark.partialApache Spark
BroadcastA broadcast variable.Classorg.apache.spark.broadcastApache Spark
BroadcastBlockIdClassorg.apache.spark.storageApache Spark
BroadcastFactory An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations).Interfaceorg.apache.spark.broadcastApache Spark
BrokerRepresents the host and port info for a Kafka broker.Classorg.apache.spark.streaming.kafkaApache Spark
Bucketizer Bucketizer maps a column of continuous features to a column of feature Spark
BufferReleasingInputStream Helper class that ensures a ManagedBuffer is released upon InputStream.Classorg.apache.spark.storageApache Spark
ByteType The data type representing Byte values.Classorg.apache.spark.sql.typesApache Spark
CalendarIntervalType The data type representing calendar time intervals.Classorg.apache.spark.sql.typesApache Spark
CatalystScan An interface for experimenting with a more direct connection to the query planner.Interfaceorg.apache.spark.sql.sourcesApache Spark
CategoricalSplit Split which tests a categorical Spark
ChiSqSelector Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label. Spark
ChiSqSelectorClassorg.apache.spark.mllib.featureApache Spark
ChiSqSelectorModelChi Squared selector model.Classorg.apache.spark.mllib.featureApache Spark
ChiSqTestResultObject containing the test results for the chi-squared hypothesis test.Classorg.apache.spark.mllib.stat.testApache Spark
ClassificationModel Model produced by a Spark
ClassificationModelRepresents a classification model that predicts to which of a set of categories an example belongs.Interfaceorg.apache.spark.mllib.classificationApache Spark
Classifier Single-label binary or multiclass Spark
CleanAccumClassorg.apache.sparkApache Spark
CleanBroadcastClassorg.apache.sparkApache Spark
CleanCheckpointClassorg.apache.sparkApache Spark
CleanRDDClassorg.apache.sparkApache Spark
CleanShuffleClassorg.apache.sparkApache Spark
CleanupTaskWeakReferenceA WeakReference associated with a CleanupTask.Classorg.apache.sparkApache Spark
CoGroupedRDD An RDD that cogroups its parents.Classorg.apache.spark.rddApache Spark
Column A column that will be computed based on the data in a DataFrame.Classorg.apache.spark.sqlApache Spark
ColumnName A convenient class used for constructing schema.Classorg.apache.spark.sqlApache Spark
ColumnPrunerUtility transformer for removing temporary columns from a Spark
ComplexFutureActionA FutureAction for actions that could trigger multiple Spark jobs.Classorg.apache.sparkApache Spark
CompressionCodec CompressionCodec allows the customization of choosing different compression implementations to be used in block storage.Interfaceorg.apache.spark.ioApache Spark
ConnectedComponentsConnected components algorithm.Classorg.apache.spark.graphx.libApache Spark
ConstantInputDStreamAn input stream that always returns the same RDD on each timestep.Classorg.apache.spark.streaming.dstreamApache Spark
ContinuousSplit Split which tests a continuous Spark
CoordinateMatrixClassorg.apache.spark.mllib.linalg.distributedApache Spark
CountVectorizer Extracts a vocabulary from document collections and generates a Spark
CountVectorizerModel Converts a text document to a sparse vector of token Spark
CreatableRelationProviderInterfaceorg.apache.spark.sql.sourcesApache Spark
CrossValidator K-fold cross Spark
CrossValidatorModel Model from k-fold cross Spark
DataFrame A distributed collection of data organized into named columns.Classorg.apache.spark.sqlApache Spark
DataFrameHolderA container for a DataFrame, used for implicit conversions.Classorg.apache.spark.sqlApache Spark
DataFrameNaFunctions Functionality for working with missing data in DataFrames.Classorg.apache.spark.sqlApache Spark
DataFrameReader Interface used to load a DataFrame from external storage systems (e.Classorg.apache.spark.sqlApache Spark
DataFrameStatFunctions Statistic functions for DataFrames.Classorg.apache.spark.sqlApache Spark
DataFrameWriter Interface used to write a DataFrame to external storage systems (e.Classorg.apache.spark.sqlApache Spark
Dataset A Dataset is a strongly typed collection of objects that can be transformed in parallel using functional or relational operations.Classorg.apache.spark.sqlApache Spark
DatasetHolderA container for a Dataset, used for implicit conversions.Classorg.apache.spark.sqlApache Spark
DataSourceRegister Data sources should implement this trait so that they can register an alias to their data source.Interfaceorg.apache.spark.sql.sourcesApache Spark
DataType The base type of all Spark SQL data types.Classorg.apache.spark.sql.typesApache Spark
DataTypesTo get/create specific data type, users should use singleton objects and factory methods provided by this class.Classorg.apache.spark.sql.typesApache Spark
DataValidators A collection of methods used to validate data before applying ML algorithms.Classorg.apache.spark.mllib.utilApache Spark
DateType A date type, supporting "0001-01-01" through "9999-12-31".Classorg.apache.spark.sql.typesApache Spark
DB2DialectClassorg.apache.spark.sql.jdbcApache Spark
DCT A feature transformer that takes the 1D discrete cosine transform of a real Spark
DecimalA mutable implementation of BigDecimal that can hold a Long if values are small enough.Classorg.apache.spark.sql.typesApache Spark
DecimalTypeClassorg.apache.spark.sql.typesApache Spark
DecisionTreeA class which implements a decision tree learning algorithm for classification and regression.Classorg.apache.spark.mllib.treeApache Spark
DecisionTreeClassificationModel Decision tree model for Spark
DecisionTreeClassifier Decision tree learning algorithm for Spark
DecisionTreeModelDecision tree model for classification or regression.Classorg.apache.spark.mllib.tree.modelApache Spark
DecisionTreeRegressionModel Decision tree model for Spark
DecisionTreeRegressor Decision tree learning algorithm. It supports both continuous and categorical Spark
DefaultSourcelibsvm package implements Spark SQL data source API for loading LIBSVM data as Spark
DenseMatrixColumn-major dense matrix.Classorg.apache.spark.mllib.linalgApache Spark
DenseVectorA dense vector represented by a value array.Classorg.apache.spark.mllib.linalgApache Spark
Dependency Base class for dependencies.Classorg.apache.sparkApache Spark
DerbyDialectClassorg.apache.spark.sql.jdbcApache Spark
DeserializationStream A stream for reading serialized objects.Classorg.apache.spark.serializerApache Spark
DeveloperApiA lower-level, unstable API intended for developers.Classorg.apache.spark.annotationApache Spark
DistributedLDAModel Distributed model fitted by Spark
DistributedLDAModelClassorg.apache.spark.mllib.clusteringApache Spark
DistributedMatrixRepresents a distributively stored matrix backed by one or more RDDs.Interfaceorg.apache.spark.mllib.linalg.distributedApache Spark
DoubleArrayParam Specialized version of Param[Array[Double]] for Spark
DoubleFlatMapFunctionA function that returns zero or more records of type Double from each input Spark
DoubleFunctionA function that returns Doubles, and can be used to construct Spark
DoubleParam Specialized version of Param[Double] for Spark
DoubleRDDFunctionsExtra functions available on RDDs of Doubles through an implicit conversion.Classorg.apache.spark.rddApache Spark
DoubleType The data type representing Double values.Classorg.apache.spark.sql.typesApache Spark
DStream A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data.Classorg.apache.spark.streaming.dstreamApache Spark
DummySerializerInstanceUnfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter.Classorg.apache.spark.serializerApache Spark
DurationClassorg.apache.spark.streamingApache Spark
DurationsClassorg.apache.spark.streamingApache Spark
EdgeA single directed edge consisting of a source id, target id, and the data associated with the edge.Classorg.apache.spark.graphxApache Spark
EdgeActivenessCriteria for filtering edges based on activeness.Classorg.apache.spark.graphx.implApache Spark
EdgeContextRepresents an edge along with its neighboring vertices and allows sending messages along the edge.Classorg.apache.spark.graphxApache Spark
EdgeDirectionThe direction of a directed edge relative to a vertex.Classorg.apache.spark.graphxApache Spark
EdgeRDD EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance.Classorg.apache.spark.graphxApache Spark
EdgeRDDImplClassorg.apache.spark.graphx.implApache Spark
EdgeTripletAn edge triplet represents an edge along with the vertex attributes of its neighboring vertices.Classorg.apache.spark.graphxApache Spark
ElementwiseProduct Outputs the Hadamard product ( Spark
ElementwiseProductOutputs the Hadamard product (i.Classorg.apache.spark.mllib.featureApache Spark
EMLDAOptimizer Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters.Classorg.apache.spark.mllib.clusteringApache Spark
Encoder Used to convert a JVM object of type T to and from the internal Spark SQL representation.Interfaceorg.apache.spark.sqlApache Spark
Encoders Methods for creating an Encoder.Classorg.apache.spark.sqlApache Spark
Entropy Class for calculating entropy during binary classification.Classorg.apache.spark.mllib.tree.impurityApache Spark
EnumUtilClassorg.apache.spark.utilApache Spark
EnvironmentListenerClassorg.apache.spark.ui.envApache Spark
EqualNullSafePerforms equality comparison, similar to EqualTo.Classorg.apache.spark.sql.sourcesApache Spark
EqualTo A filter that evaluates to true iff the attribute evaluates to a value equal to value.Classorg.apache.spark.sql.sourcesApache Spark
Estimator Abstract class for estimators that fit models to data.Classorg.apache.spark.mlApache Spark
Evaluator Abstract class for evaluators that compute metrics from Spark
ExceptionFailure Task failed due to a runtime exception.Classorg.apache.sparkApache Spark
ExecutionListenerManager Manager for QueryExecutionListener.Classorg.apache.spark.sql.utilApache Spark
ExecutorInfo Stores information about an executor to pass from the scheduler to SparkListeners.Classorg.apache.spark.scheduler.clusterApache Spark
ExecutorLostFailure The task failed because the executor that it was running on was lost.Classorg.apache.sparkApache Spark
ExecutorRegisteredClassorg.apache.sparkApache Spark
ExecutorRemovedClassorg.apache.sparkApache Spark
ExecutorsListenerClassorg.apache.spark.ui.execApache Spark
ExecutorStageSummaryClassorg.apache.spark.status.api.v1Apache Spark
ExecutorSummaryClassorg.apache.spark.status.api.v1Apache Spark
ExpectationSumClassorg.apache.spark.mllib.clusteringApache Spark
ExperimentalAn experimental user-facing API.Classorg.apache.spark.annotationApache Spark
ExperimentalMethods Holder for experimental methods for the bravest.Classorg.apache.spark.sqlApache Spark
ExponentialGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
FeatureType Enum to describe whether a feature is "continuous" or "categorical".Classorg.apache.spark.mllib.tree.configurationApache Spark
FetchFailed Task failed to fetch shuffle data from a remote node.Classorg.apache.sparkApache Spark
FilterA filter predicate for data sources.Classorg.apache.spark.sql.sourcesApache Spark
FilterFunctionBase interface for a function used in Dataset's filter Spark
FlatMapFunctionA function that returns zero or more output records from each input Spark
FlatMapFunction2A function that takes two inputs and returns zero or more output Spark
FlatMapGroupsFunctionA function that returns zero or more output records from each grouping key and its Spark
FloatParam Specialized version of Param[Float] for Spark
FloatType The data type representing Float values.Classorg.apache.spark.sql.typesApache Spark
FlumeUtilsClassorg.apache.spark.streaming.flumeApache Spark
ForeachFunctionBase interface for a function used in Dataset's foreach Spark
ForeachPartitionFunctionBase interface for a function used in Dataset's foreachPartition Spark
FPGrowthA parallel FP-growth algorithm to mine frequent itemsets.Classorg.apache.spark.mllib.fpmApache Spark
FPGrowthModelModel trained by FPGrowth, which holds frequent itemsets.Classorg.apache.spark.mllib.fpmApache Spark
FunctionBase interface for functions whose return types do not create special Spark
Function2A two-argument function that takes arguments of type T1 and T2 and returns an Spark
Function3A three-argument function that takes arguments of type T1, T2 and T3 and returns an Spark
Function4A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an Spark
functionsClassorg.apache.spark.sqlApache Spark
FutureActionA future for the result of an action to support cancellation.Interfaceorg.apache.sparkApache Spark
GammaGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
GaussianMixtureThis class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs).Classorg.apache.spark.mllib.clusteringApache Spark
GaussianMixtureModelMultivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i=1.Classorg.apache.spark.mllib.clusteringApache Spark
GBTClassificationModel Gradient-Boosted Trees (GBTs) model for Spark
GBTClassifier Gradient-Boosted Trees (GBTs) learning algorithm for Spark
GBTRegressionModel Gradient-Boosted Trees (GBTs) model for Spark
GBTRegressor Gradient-Boosted Trees (GBTs) learning algorithm for Spark
GeneralizedLinearAlgorithm GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM).Classorg.apache.spark.mllib.regressionApache Spark
GeneralizedLinearModel GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm.Classorg.apache.spark.mllib.regressionApache Spark
Gini Class for calculating the Gini impurity during binary classification.Classorg.apache.spark.mllib.tree.impurityApache Spark
Gradient Class used to compute the gradient for a loss function, given a single data point.Classorg.apache.spark.mllib.optimizationApache Spark
GradientBoostedTrees A class that implements Stochastic Gradient Boosting.Classorg.apache.spark.mllib.treeApache Spark
GradientBoostedTreesModelRepresents a gradient boosted trees model.Classorg.apache.spark.mllib.tree.modelApache Spark
GradientDescentClass used to solve an optimization problem using Gradient Descent.Classorg.apache.spark.mllib.optimizationApache Spark
GraphThe Graph abstractly represents a graph with arbitrary objects associated with vertices and edges.Classorg.apache.spark.graphxApache Spark
GraphGeneratorsA collection of graph generating functions.Classorg.apache.spark.graphx.utilApache Spark
GraphImplAn implementation of Graph to support computation on graphs.Classorg.apache.spark.graphx.implApache Spark
GraphKryoRegistratorRegisters GraphX classes with Kryo for improved performance.Classorg.apache.spark.graphxApache Spark
GraphLoaderProvides utilities for loading Graphs from files.Classorg.apache.spark.graphxApache Spark
GraphOpsContains additional functionality for Graph.Classorg.apache.spark.graphxApache Spark
GraphXUtilsClassorg.apache.spark.graphxApache Spark
GreaterThanA filter that evaluates to true iff the attribute evaluates to a value greater than value.Classorg.apache.spark.sql.sourcesApache Spark
GreaterThanOrEqualA filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value.Classorg.apache.spark.sql.sourcesApache Spark
GroupedData A set of methods for aggregations on a DataFrame, created by DataFrame.Classorg.apache.spark.sqlApache Spark
GroupedDataset A Dataset that has been logically grouped by a user-specified grouping key.Classorg.apache.spark.sqlApache Spark
HadoopFsRelation A BaseRelation that provides much of the common code required for relations that store their data to an HDFS compatible filesystem.Classorg.apache.spark.sql.sourcesApache Spark
HadoopFsRelationProvider Implemented by objects that produce relations for a specific kind of data source with a given schema and partitioned columns.Interfaceorg.apache.spark.sql.sourcesApache Spark
HadoopRDD An RDD that provides core functionality for reading data stored in Hadoop (e.Classorg.apache.spark.rddApache Spark
HashingTF Maps a sequence of terms to their term frequencies using the hashing Spark
HashingTFMaps a sequence of terms to their term frequencies using the hashing trick.Classorg.apache.spark.mllib.featureApache Spark
HashPartitionerA Partitioner that implements hash-based partitioning using Java's Object.Classorg.apache.sparkApache Spark
HasOffsetRangesRepresents any object that has a collection of OffsetRanges.Interfaceorg.apache.spark.streaming.kafkaApache Spark
HingeGradient Compute gradient and loss for a Hinge loss function, as used in SVM binary classification.Classorg.apache.spark.mllib.optimizationApache Spark
HiveContextAn instance of the Spark SQL execution engine that integrates with data stored in Hive.Classorg.apache.spark.sql.hiveApache Spark
HttpBroadcastFactoryA BroadcastFactory implementation that uses a HTTP server as the broadcast mechanism.Classorg.apache.spark.broadcastApache Spark
Identifiable Trait for an object with an immutable unique ID that identifies itself and its Spark
IDF Compute the Inverse Document Frequency (IDF) given a collection of Spark
IDF Inverse document frequency (IDF).Classorg.apache.spark.mllib.featureApache Spark
IDFModelRepresents an IDF model that can transform term frequency vectors.Classorg.apache.spark.mllib.featureApache Spark
Impurity Trait for calculating information gain.Interfaceorg.apache.spark.mllib.tree.impurityApache Spark
InA filter that evaluates to true iff the attribute evaluates to one of the values in the array.Classorg.apache.spark.sql.sourcesApache Spark
IndexedRowRepresents a row of IndexedRowMatrix.Classorg.apache.spark.mllib.linalg.distributedApache Spark
IndexedRowMatrixClassorg.apache.spark.mllib.linalg.distributedApache Spark
InformationGainStats Information gain statistics for each split.Classorg.apache.spark.mllib.tree.modelApache Spark
InnerClosureFinderClassorg.apache.spark.utilApache Spark
InputDStreamThis is the abstract base class for all input streams.Classorg.apache.spark.streaming.dstreamApache Spark
InputFormatInfo Parses and holds information about inputFormat (and files) specified as a parameter.Classorg.apache.spark.schedulerApache Spark
InputMetricDistributionsClassorg.apache.spark.status.api.v1Apache Spark
InputMetricsClassorg.apache.spark.status.api.v1Apache Spark
InsertableRelation A BaseRelation that can be used to insert data into it through the insert method.Interfaceorg.apache.spark.sql.sourcesApache Spark
IntArrayParam Specialized version of Param[Array[Int]] for Spark
IntegerType The data type representing Int values.Classorg.apache.spark.sql.typesApache Spark
Interaction Implements the feature interaction Spark
InternalNode Internal Decision Tree Spark
InterruptibleIterator An iterator that wraps around an existing iterator to provide task killing functionality.Classorg.apache.sparkApache Spark
IntParam Specialized version of Param[Int] for Spark
IsNotNullA filter that evaluates to true iff the attribute evaluates to a non-null value.Classorg.apache.spark.sql.sourcesApache Spark
IsNull A filter that evaluates to true iff the attribute evaluates to null.Classorg.apache.spark.sql.sourcesApache Spark
IsotonicRegressionClassorg.apache.spark.mllib.regressionApache Spark
IsotonicRegressionModel Model fitted by Spark
IsotonicRegressionModelRegression model for isotonic regression.Classorg.apache.spark.mllib.regressionApache Spark
JavaDoubleRDDClassorg.apache.spark.api.javaApache Spark
JavaDStreamA Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data.Classorg.apache.spark.streaming.api.javaApache Spark
JavaDStreamLikeInterfaceorg.apache.spark.streaming.api.javaApache Spark
JavaFutureActionInterfaceorg.apache.spark.api.javaApache Spark
JavaHadoopRDDClassorg.apache.spark.api.javaApache Spark
JavaInputDStreamA Java-friendly interface to InputDStream.Classorg.apache.spark.streaming.api.javaApache Spark
JavaIterableWrapperSerializerA Kryo serializer for serializing results returned by asJavaIterable.Classorg.apache.spark.serializerApache Spark
JavaMapWithStateDStream DStream representing the stream of data generated by mapWithState operation on a JavaPairDStream.Classorg.apache.spark.streaming.api.javaApache Spark
JavaNewHadoopRDDClassorg.apache.spark.api.javaApache Spark
JavaPairDStreamA Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join.Classorg.apache.spark.streaming.api.javaApache Spark
JavaPairInputDStream A Java-friendly interface to InputDStream of key-value pairs.Classorg.apache.spark.streaming.api.javaApache Spark
JavaPairRDDClassorg.apache.spark.api.javaApache Spark
JavaPairReceiverInputDStreamA Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.Classorg.apache.spark.streaming.api.javaApache Spark
JavaParams Java-friendly wrapper for Spark
JavaRDDClassorg.apache.spark.api.javaApache Spark
JavaRDDLikeDefines operations common to several Java RDD implementations.Interfaceorg.apache.spark.api.javaApache Spark
JavaReceiverInputDStreamA Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network.Classorg.apache.spark.streaming.api.javaApache Spark
JavaSerializer A Spark serializer that uses Java's built-in serialization.Classorg.apache.spark.serializerApache Spark
JavaSparkContextA Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones.Classorg.apache.spark.api.javaApache Spark
JavaSparkListenerJava clients should extend this class instead of implementing SparkListener directly.Classorg.apache.sparkApache Spark
JavaSparkStatusTrackerLow-level status reporting APIs for monitoring job and stage progress.Classorg.apache.spark.api.javaApache Spark
JavaStreamingContextA Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality.Classorg.apache.spark.streaming.api.javaApache Spark
JdbcDialect Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver.Classorg.apache.spark.sql.jdbcApache Spark
JdbcDialects Registry of dialects that apply to every new jdbc DataFrame.Classorg.apache.spark.sql.jdbcApache Spark
JdbcRDDAn RDD that executes an SQL query on a JDBC connection and reads results.Classorg.apache.spark.rddApache Spark
JdbcType A database type definition coupled with the jdbc type needed to send null values to the database.Classorg.apache.spark.sql.jdbcApache Spark
JobDataClassorg.apache.spark.status.api.v1Apache Spark
JobExecutionStatus Enum.Classorg.apache.sparkApache Spark
JobLogger A logger class to record runtime information for jobs in Spark.Classorg.apache.spark.schedulerApache Spark
JobProgressListener Tracks task-level information to be displayed in the UI.Classorg.apache.spark.ui.jobsApache Spark
JobSucceededClassorg.apache.spark.schedulerApache Spark
KafkaUtilsClassorg.apache.spark.streaming.kafkaApache Spark
KernelDensityKernel density estimation.Classorg.apache.spark.mllib.statApache Spark
KillTaskClassorg.apache.spark.scheduler.localApache Spark
KinesisUtilsClassorg.apache.spark.streaming.kinesisApache Spark
KinesisUtilsPythonHelper This is a helper class that wraps the methods in KinesisUtils into a more Python-friendly class and function so that it can be easily instantiated and called from Python's KinesisUtils.Classorg.apache.spark.streaming.kinesisApache Spark
KMeans K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means|| algorithm by Bahmani et al).Classorg.apache.spark.mllib.clusteringApache Spark
KMeansDataGenerator Generate test data for KMeans.Classorg.apache.spark.mllib.utilApache Spark
KMeansModel Model fitted by Spark
KMeansModelA clustering model for K-means.Classorg.apache.spark.mllib.clusteringApache Spark
KolmogorovSmirnovTestResult Object containing the test results for the Kolmogorov-Smirnov test.Classorg.apache.spark.mllib.stat.testApache Spark
KryoRegistratorInterfaceorg.apache.spark.serializerApache Spark
KryoSerializerA Spark serializer that uses the Kryo serialization library.Classorg.apache.spark.serializerApache Spark
L1Updater Updater for L1 regularized problems.Classorg.apache.spark.mllib.optimizationApache Spark
LabelConverterLabel to vector Spark
LabeledPointClass that represents the features and labels of a data point.Classorg.apache.spark.mllib.regressionApache Spark
LabelPropagationLabel Propagation algorithm.Classorg.apache.spark.graphx.libApache Spark
LassoModelRegression model trained using Lasso.Classorg.apache.spark.mllib.regressionApache Spark
LassoWithSGDTrain a regression model with L1-regularization using Stochastic Gradient Descent.Classorg.apache.spark.mllib.regressionApache Spark
LBFGS Class used to solve an optimization problem using Limited-memory BFGS.Classorg.apache.spark.mllib.optimizationApache Spark
LDA Latent Dirichlet Allocation (LDA), a topic model designed for text Spark
LDALatent Dirichlet Allocation (LDA), a topic model designed for text documents.Classorg.apache.spark.mllib.clusteringApache Spark
LDAModel Model fitted by Spark
LDAModelLatent Dirichlet Allocation (LDA) model.Classorg.apache.spark.mllib.clusteringApache Spark
LDAOptimizer An LDAOptimizer specifies which optimization/learning/inference algorithm to use, and it can hold optimizer-specific parameters for users to set.Interfaceorg.apache.spark.mllib.clusteringApache Spark
LeafNode Decision tree leaf Spark
LeastSquaresAggregatorLeastSquaresAggregator computes the gradient and loss for a Least-squared loss function, as used in linear regression for samples in sparse or dense vector in an online Spark
LeastSquaresCostFunLeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares Spark
LeastSquaresGradient Compute gradient and loss for a Least-squared loss function, as used in linear regression.Classorg.apache.spark.mllib.optimizationApache Spark
LessThanA filter that evaluates to true iff the attribute evaluates to a value less than value.Classorg.apache.spark.sql.sourcesApache Spark
LessThanOrEqualA filter that evaluates to true iff the attribute evaluates to a value less than or equal to value.Classorg.apache.spark.sql.sourcesApache Spark
LinearDataGenerator Generate sample data used for Linear Data.Classorg.apache.spark.mllib.utilApache Spark
LinearRegression The learning objective is to minimize the squared error, with Spark
LinearRegressionModel Model produced by Spark
LinearRegressionModelRegression model trained using LinearRegression.Classorg.apache.spark.mllib.regressionApache Spark
LinearRegressionWithSGDTrain a linear regression model with no regularization using Stochastic Gradient Descent.Classorg.apache.spark.mllib.regressionApache Spark
Loader Trait for classes which can load models and transformers from files.Interfaceorg.apache.spark.mllib.utilApache Spark
LocalLDAModel Local (non-distributed) model fitted by Spark
LocalLDAModel This model stores only the inferred topics.Classorg.apache.spark.mllib.clusteringApache Spark
LoggingUtility trait for classes that want to log data.Interfaceorg.apache.sparkApache Spark
LogisticAggregatorLogisticAggregator computes the gradient and loss for binary logistic loss function, as used in binary classification for instances in sparse or dense vector in an online Spark
LogisticCostFunLogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression) Spark
LogisticGradient Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression).Classorg.apache.spark.mllib.optimizationApache Spark
LogisticRegression Logistic Spark
LogisticRegressionDataGenerator Generate test data for LogisticRegression.Classorg.apache.spark.mllib.utilApache Spark
LogisticRegressionModel Model produced by Spark
LogisticRegressionModelClassification model trained using Multinomial/Binary Logistic Regression.Classorg.apache.spark.mllib.classificationApache Spark
LogisticRegressionSummaryAbstraction for Logistic Regression Results for a given Spark
LogisticRegressionTrainingSummaryAbstraction for multinomial Logistic Regression Training Spark
LogisticRegressionWithLBFGSTrain a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.Classorg.apache.spark.mllib.classificationApache Spark
LogisticRegressionWithSGDTrain a classification model for Binary Logistic Regression using Stochastic Gradient Descent.Classorg.apache.spark.mllib.classificationApache Spark
LogLoss Class for log loss calculation (for classification).Classorg.apache.spark.mllib.tree.lossApache Spark
LogNormalGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
LongParam Specialized version of Param[Long] for Spark
LongType The data type representing Long values.Classorg.apache.spark.sql.typesApache Spark
Loss Trait for adding "pluggable" loss functions for the gradient boosting algorithm.Interfaceorg.apache.spark.mllib.tree.lossApache Spark
LossesClassorg.apache.spark.mllib.tree.lossApache Spark
LZ4CompressionCodec LZ4 implementation of CompressionCodec.Classorg.apache.spark.ioApache Spark
LZFCompressionCodec LZF implementation of CompressionCodec.Classorg.apache.spark.ioApache Spark
MapFunctionBase interface for a map function used in Dataset's map Spark
MapGroupsFunctionBase interface for a map function used in GroupedDataset's mapGroup Spark
MapPartitionsFunctionBase interface for function used in Dataset's Spark
MapType The data type for Maps.Classorg.apache.spark.sql.typesApache Spark
MapWithStateDStream DStream representing the stream of data generated by mapWithState operation on a pair DStream. Additionally, it also gives access to the stream of state snapshots, that is, the state data ofClassorg.apache.spark.streaming.dstreamApache Spark
MatricesFactory methods for Matrix.Classorg.apache.spark.mllib.linalgApache Spark
MatrixTrait for a local matrix.Interfaceorg.apache.spark.mllib.linalgApache Spark
MatrixEntryRepresents an entry in a distributed matrix.Classorg.apache.spark.mllib.linalg.distributedApache Spark
MatrixFactorizationModelModel representing the result of matrix factorization.Classorg.apache.spark.mllib.recommendationApache Spark
MemoryEntryClassorg.apache.spark.storageApache Spark
Metadata Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean, Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], andClassorg.apache.spark.sql.typesApache Spark
MetadataBuilder Builder for Metadata.Classorg.apache.spark.sql.typesApache Spark
MethodIdentifierHelper class to identify a method.Classorg.apache.spark.utilApache Spark
MFDataGenerator Generate RDD(s) containing data for Matrix Factorization.Classorg.apache.spark.mllib.utilApache Spark
MillisecondsHelper object that creates instance of Duration representing a given number of milliseconds.Classorg.apache.spark.streamingApache Spark
MinMaxScaler Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Spark
MinutesHelper object that creates instance of Duration representing a given number of minutes.Classorg.apache.spark.streamingApache Spark
MLPairRDDFunctionsMachine learning specific Pair RDD functions.Classorg.apache.spark.mllib.rddApache Spark
MLReadableTrait for objects that provide Spark
MLReaderAbstract class for utility classes that can load ML Spark
MLUtilsHelper methods to load, save and pre-process data used in MLlib.Classorg.apache.spark.mllib.utilApache Spark
MLWritableTrait for classes that provide Spark
MLWriterAbstract class for utility classes that can save ML Spark
Model A fitted model, i.Classorg.apache.spark.mlApache Spark
MQTTUtilsClassorg.apache.spark.streaming.mqttApache Spark
MsSqlServerDialectClassorg.apache.spark.sql.jdbcApache Spark
MulticlassClassificationEvaluator Evaluator for multiclass classification, which expects two input columns: score and Spark
MulticlassMetrics Evaluator for multiclass classification.Classorg.apache.spark.mllib.evaluationApache Spark
MultilabelMetricsEvaluator for multilabel classification.Classorg.apache.spark.mllib.evaluationApache Spark
MultilayerPerceptronClassificationModel Classification model based on the Multilayer Spark
MultilayerPerceptronClassifier Classifier trainer based on the Multilayer Spark
MultivariateGaussian This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution.Classorg.apache.spark.mllib.stat.distributionApache Spark
MultivariateOnlineSummarizer MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vectorClassorg.apache.spark.mllib.statApache Spark
MultivariateStatisticalSummaryTrait for multivariate statistical summary of a data matrix.Interfaceorg.apache.spark.mllib.statApache Spark
MutableAggregationBuffer A Row representing a mutable aggregation buffer.Classorg.apache.spark.sql.expressionsApache Spark
MutablePair A tuple of 2 elements.Classorg.apache.spark.utilApache Spark
MySQLDialectClassorg.apache.spark.sql.jdbcApache Spark
NaiveBayes Naive Bayes Spark
NaiveBayesClassorg.apache.spark.mllib.classificationApache Spark
NaiveBayesModel Model produced by NaiveBayes param: pi log of class priors, whose dimension is C (number of classes) Spark
NaiveBayesModelModel for Naive Bayes Classifiers.Classorg.apache.spark.mllib.classificationApache Spark
NarrowDependency Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD.Classorg.apache.sparkApache Spark
NewHadoopRDD An RDD that provides core functionality for reading data stored in Hadoop (e.Classorg.apache.spark.rddApache Spark
NGram A feature transformer that converts the input array of strings into an array of Spark
Node Decision tree node Spark
Node Node in a decision tree.Classorg.apache.spark.mllib.tree.modelApache Spark
NominalAttribute A nominal Spark
NoopDialectNOOP dialect object, always returning the neutral element.Classorg.apache.spark.sql.jdbcApache Spark
Normalizer Normalize a vector to have unit norm using the given Spark
NormalizerNormalizes samples individually to unit L^p^ norm For any 1 <= p < Double.Classorg.apache.spark.mllib.featureApache Spark
NotA filter that evaluates to true iff child is evaluated to false.Classorg.apache.spark.sql.sourcesApache Spark
NullType The data type representing NULL values.Classorg.apache.spark.sql.typesApache Spark
NumericAttribute A numeric attribute with optional summary Spark
NumericType Numeric data types.Classorg.apache.spark.sql.typesApache Spark
OffsetRangeRepresents a range of offsets from a single Kafka TopicAndPartition.Classorg.apache.spark.streaming.kafkaApache Spark
OneHotEncoder A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category Spark
OneToOneDependency Represents a one-to-one dependency between partitions of the parent and child RDDs.Classorg.apache.sparkApache Spark
OneVsRest Reduction of Multiclass Classification to Binary Spark
OneVsRestModel Model produced by Spark
OnlineLDAOptimizer An online optimizer for LDA.Classorg.apache.spark.mllib.clusteringApache Spark
Optimizer Trait for optimization problem solvers.Interfaceorg.apache.spark.mllib.optimizationApache Spark
OrA filter that evaluates to true iff at least one of left or right evaluates to true.Classorg.apache.spark.sql.sourcesApache Spark
OracleDialectClassorg.apache.spark.sql.jdbcApache Spark
OrderedRDDFunctionsExtra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion.Classorg.apache.spark.rddApache Spark
OutputMetricDistributionsClassorg.apache.spark.status.api.v1Apache Spark
OutputMetricsClassorg.apache.spark.status.api.v1Apache Spark
OutputOperationInfo Class having information on output operations.Classorg.apache.spark.streaming.schedulerApache Spark
OutputWriter OutputWriter is used together with HadoopFsRelation for persisting rows to the underlying file system.Classorg.apache.spark.sql.sourcesApache Spark
OutputWriterFactory A factory that produces OutputWriters.Classorg.apache.spark.sql.sourcesApache Spark
PageRankPageRank algorithm implementation.Classorg.apache.spark.graphx.libApache Spark
PairDStreamFunctionsExtra functions available on DStream of (key, value) pairs through an implicit conversion.Classorg.apache.spark.streaming.dstreamApache Spark
PairFlatMapFunctionA function that returns zero or more key-value pair records from each input Spark
PairFunctionA function that returns key-value pairs (Tuple2), and can be used to construct Spark
PairRDDFunctionsExtra functions available on RDDs of (key, value) pairs through an implicit conversion.Classorg.apache.spark.rddApache Spark
PairwiseRRDDForm an RDD[(Int, Array[Byte])] from key-value pairs returned from R.Classorg.apache.spark.api.rApache Spark
Param A param with self-contained documentation and optionally default Spark
ParamGridBuilder Builder for a param grid used in grid search-based model Spark
ParamMap A param to value Spark
ParamPair A param and its Spark
ParamValidators Factory methods for common validation functions for Spark
PartialResultClassorg.apache.spark.partialApache Spark
PartitionAn identifier for a partition in an RDD.Interfaceorg.apache.sparkApache Spark
PartitionCoalescerCoalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones.Classorg.apache.spark.rddApache Spark
PartitionerAn object that defines how the elements in a key-value pair RDD are partitioned by key.Classorg.apache.sparkApache Spark
PartitionGroupClassorg.apache.spark.rddApache Spark
PartitionPruningRDD An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions.Classorg.apache.spark.rddApache Spark
PartitionStrategyInterfaceorg.apache.spark.graphxApache Spark
PCA PCA trains a model to project vectors to a low-dimensional space using Spark
PCAA feature transformer that projects vectors to a low-dimensional space using PCA.Classorg.apache.spark.mllib.featureApache Spark
PCAModelModel fitted by PCA that can project vectors to a low-dimensional space using PCA.Classorg.apache.spark.mllib.featureApache Spark
Pipeline A simple pipeline, which acts as an estimator.Classorg.apache.spark.mlApache Spark
PipelineModel Represents a fitted pipeline.Classorg.apache.spark.mlApache Spark
PipelineStage A stage in a pipeline, either an Estimator or a Transformer.Classorg.apache.spark.mlApache Spark
PMMLExportable Export model to the PMML format Predictive Model Markup Language (PMML) is an XML-based file formatInterfaceorg.apache.spark.mllib.pmmlApache Spark
PoissonGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
PoissonSampler A sampler for sampling with replacement, based on values drawn from Poisson distribution.Classorg.apache.spark.util.randomApache Spark
PolynomialExpansion Perform feature expansion in a polynomial Spark
PortableDataStreamA class that allows DataStreams to be serialized and moved around by not creating them until they need to be readClassorg.apache.spark.inputApache Spark
PostgresDialectClassorg.apache.spark.sql.jdbcApache Spark
PowerIterationClusteringClassorg.apache.spark.mllib.clusteringApache Spark
PowerIterationClusteringModelModel produced by PowerIterationClustering.Classorg.apache.spark.mllib.clusteringApache Spark
PrecisionInfoPrecision parameters for a Decimal.Classorg.apache.spark.sql.typesApache Spark
PredictPredicted value for a node param: predict predicted valueClassorg.apache.spark.mllib.tree.modelApache Spark
PredictionModel Abstraction for a model for prediction tasks (regression and classification).Classorg.apache.spark.mlApache Spark
Predictor Abstraction for prediction problems (regression and classification).Classorg.apache.spark.mlApache Spark
PrefixSpan A parallel PrefixSpan algorithm to mine frequent sequential patterns.Classorg.apache.spark.mllib.fpmApache Spark
PrefixSpanModelModel fitted by PrefixSpan param: freqSequences frequent sequencesClassorg.apache.spark.mllib.fpmApache Spark
Pregel Unlike the original Pregel API, the GraphX Pregel API factors the sendMessage computation over edges, enables the message sending computation to read both vertex attributes, and constrainsClassorg.apache.spark.graphxApache Spark
PrivateA class that is considered private to the internals of Spark; there is a high likelihood it will be changed in future versions of Spark.Classorg.apache.spark.annotationApache Spark
ProbabilisticClassificationModel Model produced by a Spark
ProbabilisticClassifier Single-label binary or multiclass classifier which can output class conditional Spark
PrunedFilteredScan A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects.Interfaceorg.apache.spark.sql.sourcesApache Spark
PrunedScan A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects.Interfaceorg.apache.spark.sql.sourcesApache Spark
QRDecompositionClassorg.apache.spark.mllib.linalgApache Spark
QuantileDiscretizer QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical Spark
QuantileStrategyEnum for selecting the quantile calculation strategy.Classorg.apache.spark.mllib.tree.configurationApache Spark
QueryExecutionListener The interface of query execution listener that can be used to analyze execution metrics.Interfaceorg.apache.spark.sql.utilApache Spark
RandomDataGenerator Trait for random data generators that generate i.Interfaceorg.apache.spark.mllib.randomApache Spark
RandomForestA class that implements a Random Forest learning algorithm for classification and regression.Classorg.apache.spark.mllib.treeApache Spark
RandomForestClassificationModel Random Forest model for Spark
RandomForestClassifier Random Forest learning algorithm for It supports both binary and multiclass labels, as well as both continuous and Spark
RandomForestModelRepresents a random forest model.Classorg.apache.spark.mllib.tree.modelApache Spark
RandomForestRegressionModel Random Forest model for Spark
RandomForestRegressor Random Forest learning algorithm for Spark
RandomRDDsGenerator methods for creating RDDs comprised of i.Classorg.apache.spark.mllib.randomApache Spark
RandomSampler A pseudorandom sampler.Interfaceorg.apache.spark.util.randomApache Spark
RangeDependency Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs.Classorg.apache.sparkApache Spark
RangePartitionerA Partitioner that partitions sortable records by range into roughly equal ranges.Classorg.apache.sparkApache Spark
RankingMetrics Evaluator for ranking algorithms.Classorg.apache.spark.mllib.evaluationApache Spark
RatingA more compact class to represent a rating than Tuple3[Int, Int, Double].Classorg.apache.spark.mllib.recommendationApache Spark
RDDA Resilient Distributed Dataset (RDD), the basic abstraction in Spark.Classorg.apache.spark.rddApache Spark
RDDBlockIdClassorg.apache.spark.storageApache Spark
RDDDataDistributionClassorg.apache.spark.status.api.v1Apache Spark
RDDFunctionsMachine learning specific RDD functions.Classorg.apache.spark.mllib.rddApache Spark
RDDInfoClassorg.apache.spark.storageApache Spark
RDDPartitionInfoClassorg.apache.spark.status.api.v1Apache Spark
RDDStorageInfoClassorg.apache.spark.status.api.v1Apache Spark
Receiver Abstract class of a receiver that can be run on worker nodes to receive external data.Classorg.apache.spark.streaming.receiverApache Spark
ReceiverInfo Class having information about a receiver.Classorg.apache.spark.streaming.schedulerApache Spark
ReceiverInputDStreamAbstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data.Classorg.apache.spark.streaming.dstreamApache Spark
ReduceFunctionBase interface for function used in Dataset's Spark
RegexTokenizer A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false) Spark
RegressionEvaluator Evaluator for regression, which expects two input columns: prediction and Spark
RegressionMetricsEvaluator for regression.Classorg.apache.spark.mllib.evaluationApache Spark
RegressionModel Model produced by a Spark
RegressionModelInterfaceorg.apache.spark.mllib.regressionApache Spark
RelationProvider Implemented by objects that produce relations for a specific kind of data source.Interfaceorg.apache.spark.sql.sourcesApache Spark
Resubmitted A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed.Classorg.apache.sparkApache Spark
ReturnStatementFinderClassorg.apache.spark.utilApache Spark
ReviveOffersClassorg.apache.spark.scheduler.localApache Spark
RFormula Implements the transforms required for fitting a dataset against an R model Spark
RFormulaModel A fitted Spark
RidgeRegressionModelRegression model trained using RidgeRegression.Classorg.apache.spark.mllib.regressionApache Spark
RidgeRegressionWithSGDTrain a regression model with L2-regularization using Stochastic Gradient Descent.Classorg.apache.spark.mllib.regressionApache Spark
RowRepresents one row of output from a relational operator.Interfaceorg.apache.spark.sqlApache Spark
RowFactoryA factory class used to construct Row objects.Classorg.apache.spark.sqlApache Spark
RowMatrixRepresents a row-oriented distributed Matrix with no meaningful row indices.Classorg.apache.spark.mllib.linalg.distributedApache Spark
RpcUtilsClassorg.apache.spark.utilApache Spark
RRDDAn RDD that stores serialized R objects as Array[Byte].Classorg.apache.spark.api.rApache Spark
RuntimePercentageClassorg.apache.spark.schedulerApache Spark
Saveable Trait for models and transformers which may be saved as files.Interfaceorg.apache.spark.mllib.utilApache Spark
SaveModeSaveMode is used to specify the expected behavior of saving a DataFrame to a data source.Classorg.apache.spark.sqlApache Spark
SchedulingMode"FAIR" and "FIFO" determine which policy is used to order tasks amongst a Schedulable's sub-queues.Classorg.apache.spark.schedulerApache Spark
SchemaRelationProvider Implemented by objects that produce relations for a specific kind of data source with a given schema.Interfaceorg.apache.spark.sql.sourcesApache Spark
ScriptTransformationWriterThreadClassorg.apache.spark.sql.hive.executionApache Spark
SecondsHelper object that creates instance of Duration representing a given number of seconds.Classorg.apache.spark.streamingApache Spark
SequenceFileRDDFunctionsExtra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion.Classorg.apache.spark.rddApache Spark
SerializableWritableClassorg.apache.sparkApache Spark
SerializationStream A stream for writing serialized objects.Classorg.apache.spark.serializerApache Spark
Serializer A serializer.Classorg.apache.spark.serializerApache Spark
SerializerInstance An instance of a serializer, for use by one thread at a time.Classorg.apache.spark.serializerApache Spark
ShortestPathsComputes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark.Classorg.apache.spark.graphx.libApache Spark
ShortType The data type representing Short values.Classorg.apache.spark.sql.typesApache Spark
ShuffleBlockIdClassorg.apache.spark.storageApache Spark
ShuffleDataBlockIdClassorg.apache.spark.storageApache Spark
ShuffleDependency Represents a dependency on the output of a shuffle stage.Classorg.apache.sparkApache Spark
ShuffledRDD The resulting RDD from a shuffle (e.Classorg.apache.spark.rddApache Spark
ShuffleIndexBlockIdClassorg.apache.spark.storageApache Spark
ShuffleReadMetricDistributionsClassorg.apache.spark.status.api.v1Apache Spark
ShuffleReadMetricsClassorg.apache.spark.status.api.v1Apache Spark
ShuffleWriteMetricDistributionsClassorg.apache.spark.status.api.v1Apache Spark
ShuffleWriteMetricsClassorg.apache.spark.status.api.v1Apache Spark
SignalLoggerHandlerClassorg.apache.spark.utilApache Spark
SimpleFutureActionA FutureAction holding the result of an action that triggers a single job.Classorg.apache.sparkApache Spark
SimpleUpdater A simple updater for gradient descent *without* any regularization.Classorg.apache.spark.mllib.optimizationApache Spark
SingularValueDecompositionRepresents singular value decomposition (SVD) factors.Classorg.apache.spark.mllib.linalgApache Spark
SizeEstimator Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches.Classorg.apache.spark.utilApache Spark
SnappyCompressionCodec Snappy implementation of CompressionCodec.Classorg.apache.spark.ioApache Spark
SnappyOutputStreamWrapperWrapper over SnappyOutputStream which guards against write-after-close and double-close issues.Classorg.apache.spark.ioApache Spark
SparkAppHandleA handle to a running Spark application.Interfaceorg.apache.spark.launcherApache Spark
SparkConfConfiguration for a Spark application.Classorg.apache.sparkApache Spark
SparkContextMain entry point for Spark functionality.Classorg.apache.sparkApache Spark
SparkEnv Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc.Classorg.apache.sparkApache Spark
SparkExceptionClassorg.apache.sparkApache Spark
SparkFilesResolves paths to files added through SparkContext.Classorg.apache.sparkApache Spark
SparkFirehoseListenerClass that allows users to receive all SparkListener events.Classorg.apache.sparkApache Spark
SparkFlumeEventA wrapper class for AvroFlumeEvent's with a custom serialization format.Classorg.apache.spark.streaming.flumeApache Spark
SparkJobInfoExposes information about Spark Jobs.Interfaceorg.apache.sparkApache Spark
SparkJobInfoImplClassorg.apache.sparkApache Spark
SparkLauncherLauncher for Spark applications.Classorg.apache.spark.launcherApache Spark
SparkListener Interface for listening to events from the Spark scheduler.Interfaceorg.apache.spark.schedulerApache Spark
SparkListenerApplicationEndClassorg.apache.spark.schedulerApache Spark
SparkListenerApplicationStartClassorg.apache.spark.schedulerApache Spark
SparkListenerBlockManagerAddedClassorg.apache.spark.schedulerApache Spark
SparkListenerBlockManagerRemovedClassorg.apache.spark.schedulerApache Spark
SparkListenerBlockUpdatedClassorg.apache.spark.schedulerApache Spark
SparkListenerEnvironmentUpdateClassorg.apache.spark.schedulerApache Spark
SparkListenerExecutorAddedClassorg.apache.spark.schedulerApache Spark
SparkListenerExecutorMetricsUpdatePeriodic updates from executors.Classorg.apache.spark.schedulerApache Spark
SparkListenerExecutorRemovedClassorg.apache.spark.schedulerApache Spark
SparkListenerJobEndClassorg.apache.spark.schedulerApache Spark
SparkListenerJobStartClassorg.apache.spark.schedulerApache Spark
SparkListenerStageCompletedClassorg.apache.spark.schedulerApache Spark
SparkListenerStageSubmittedClassorg.apache.spark.schedulerApache Spark
SparkListenerTaskEndClassorg.apache.spark.schedulerApache Spark
SparkListenerTaskGettingResultClassorg.apache.spark.schedulerApache Spark
SparkListenerTaskStartClassorg.apache.spark.schedulerApache Spark
SparkListenerUnpersistRDDClassorg.apache.spark.schedulerApache Spark
SparkMasterRegexA collection of regexes for extracting information from the master string.Classorg.apache.sparkApache Spark
SparkShutdownHookClassorg.apache.spark.utilApache Spark
SparkStageInfoExposes information about Spark Stages.Interfaceorg.apache.sparkApache Spark
SparkStageInfoImplClassorg.apache.sparkApache Spark
SparkStatusTrackerLow-level status reporting APIs for monitoring job and stage progress.Classorg.apache.sparkApache Spark
SparseMatrixColumn-major sparse matrix.Classorg.apache.spark.mllib.linalgApache Spark
SparseVectorA sparse vector represented by an index array and a value array.Classorg.apache.spark.mllib.linalgApache Spark
SpecialLengthsClassorg.apache.spark.api.rApache Spark
SpillListenerA SparkListener that detects whether spills have occurred in Spark jobs.Classorg.apache.sparkApache Spark
Split Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right Spark
Split Split applied to a feature param: feature feature indexClassorg.apache.spark.mllib.tree.modelApache Spark
SplitInfoClassorg.apache.spark.schedulerApache Spark
SQLContextThe entry point for working with structured data (rows and columns) in Spark.Classorg.apache.spark.sqlApache Spark
SQLImplicitsA collection of implicit methods for converting common Scala objects into DataFrames.Classorg.apache.spark.sqlApache Spark
SQLTransformer Implements the transformations which are defined by SQL Spark
SQLUserDefinedType A user-defined type which can be automatically recognized by a SQLContext and registered.Classorg.apache.spark.sql.typesApache Spark
SquaredError Class for squared error loss calculation.Classorg.apache.spark.mllib.tree.lossApache Spark
SquaredL2Updater Updater for L2 regularized problems.Classorg.apache.spark.mllib.optimizationApache Spark
StageDataClassorg.apache.spark.status.api.v1Apache Spark
StageInfo Stores information about a stage to pass from the scheduler to SparkListeners.Classorg.apache.spark.schedulerApache Spark
StageStatusClassorg.apache.spark.status.api.v1Apache Spark
StandardNormalGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
StandardScaler Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training Spark
StandardScalerStandardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set.Classorg.apache.spark.mllib.featureApache Spark
StandardScalerModelRepresents a StandardScaler model that can transform vectors.Classorg.apache.spark.mllib.featureApache Spark
StatCounterA class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way.Classorg.apache.spark.utilApache Spark
State Abstract class for getting and updating the state in mapping function used in the mapWithState operation of a pair DStream (Scala)Classorg.apache.spark.streamingApache Spark
StateSpec Abstract class representing all the specifications of the DStream transformation mapWithState operation of aClassorg.apache.spark.streamingApache Spark
StatisticsClassorg.apache.spark.mllib.statApache Spark
Statistics Statistics for querying the supervisor about state of workers.Classorg.apache.spark.streaming.receiverApache Spark
StatsReportListenerClassorg.apache.spark.schedulerApache Spark
StatsReportListener A simple StreamingListener that logs summary statistics across Spark Streaming batches param: numBatchInfos Number of last batches to consider for generating statistics (default: 10)Classorg.apache.spark.streaming.schedulerApache Spark
StatusUpdateClassorg.apache.spark.scheduler.localApache Spark
StopCoordinatorClassorg.apache.spark.schedulerApache Spark
StopExecutorClassorg.apache.spark.scheduler.localApache Spark
StopWordsRemover A feature transformer that filters out stop words from Spark
StorageLevel Flags for controlling the storage of an RDD.Classorg.apache.spark.storageApache Spark
StorageLevelsExpose some commonly useful storage level constants.Classorg.apache.spark.api.javaApache Spark
StorageListener A SparkListener that prepares information to be displayed on the BlockManagerUI.Classorg.apache.spark.ui.storageApache Spark
StorageStatus Storage information for each BlockManager.Classorg.apache.spark.storageApache Spark
StorageStatusListener A SparkListener that maintains executor storage status.Classorg.apache.spark.storageApache Spark
StrategyStores all the configuration options for tree construction param: algo Learning goal.Classorg.apache.spark.mllib.tree.configurationApache Spark
StreamBlockIdClassorg.apache.spark.storageApache Spark
StreamingContextMain entry point for Spark Streaming functionality.Classorg.apache.spark.streamingApache Spark
StreamingContextStateRepresents the state of a StreamingContext.Classorg.apache.spark.streamingApache Spark
StreamingKMeansStreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming,Classorg.apache.spark.mllib.clusteringApache Spark
StreamingKMeansModelStreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weightClassorg.apache.spark.mllib.clusteringApache Spark
StreamingLinearAlgorithm StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data,Classorg.apache.spark.mllib.regressionApache Spark
StreamingLinearRegressionWithSGDTrain or predict a linear regression model on streaming data.Classorg.apache.spark.mllib.regressionApache Spark
StreamingListenerInterfaceorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerBatchCompletedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerBatchStartedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerBatchSubmittedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerOutputOperationCompletedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerOutputOperationStartedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerReceiverErrorClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerReceiverStartedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingListenerReceiverStoppedClassorg.apache.spark.streaming.schedulerApache Spark
StreamingLogisticRegressionWithSGDTrain or predict a logistic regression model on streaming data.Classorg.apache.spark.mllib.classificationApache Spark
StreamingTestClassorg.apache.spark.mllib.stat.testApache Spark
StreamInputInfo Track the information of input stream at specified batch time.Classorg.apache.spark.streaming.schedulerApache Spark
StringArrayParam Specialized version of Param[Array[String]] for Spark
StringContainsA filter that evaluates to true iff the attribute evaluates to a string that contains the string value.Classorg.apache.spark.sql.sourcesApache Spark
StringEndsWithA filter that evaluates to true iff the attribute evaluates to a string that ends with value.Classorg.apache.spark.sql.sourcesApache Spark
StringIndexer A label indexer that maps a string column of labels to an ML column of label Spark
StringIndexerModel Model fitted by Spark
StringRRDDAn RDD that stores R objects as Array[String].Classorg.apache.spark.api.rApache Spark
StringStartsWithA filter that evaluates to true iff the attribute evaluates to a string that starts with value.Classorg.apache.spark.sql.sourcesApache Spark
StringType The data type representing String values.Classorg.apache.spark.sql.typesApache Spark
StronglyConnectedComponentsStrongly connected components algorithm implementation.Classorg.apache.spark.graphx.libApache Spark
StructFieldA field inside a StructType.Classorg.apache.spark.sql.typesApache Spark
StructType A StructType object can be constructed by StructType(fields: Seq[StructField])Classorg.apache.spark.sql.typesApache Spark
SuccessClassorg.apache.sparkApache Spark
SVDPlusPlusClassorg.apache.spark.graphx.libApache Spark
SVMDataGenerator Generate sample data used for SVM.Classorg.apache.spark.mllib.utilApache Spark
SVMModelModel for Support Vector Machines (SVMs).Classorg.apache.spark.mllib.classificationApache Spark
SVMWithSGDTrain a Support Vector Machine (SVM) using Stochastic Gradient Descent.Classorg.apache.spark.mllib.classificationApache Spark
TaskCommitDenied Task requested the driver to commit, but was denied.Classorg.apache.sparkApache Spark
TaskCompletionListener Listener providing a callback function to invoke when a task's execution completes.Interfaceorg.apache.spark.utilApache Spark
TaskContextContextual information about a task which can be read or mutated during execution.Classorg.apache.sparkApache Spark
TaskDataClassorg.apache.spark.status.api.v1Apache Spark
TaskFailedReason Various possible reasons why a task failed.Interfaceorg.apache.sparkApache Spark
TaskInfo Information about a running task attempt inside a TaskSet.Classorg.apache.spark.schedulerApache Spark
TaskKilled Task was killed intentionally and needs to be rescheduled.Classorg.apache.sparkApache Spark
TaskKilledException Exception thrown when a task is explicitly killed (i.Classorg.apache.sparkApache Spark
TaskLocalityClassorg.apache.spark.schedulerApache Spark
TaskMetricDistributionsClassorg.apache.spark.status.api.v1Apache Spark
TaskMetricsClassorg.apache.spark.status.api.v1Apache Spark
TaskResultBlockIdClassorg.apache.spark.storageApache Spark
TaskResultLost The task finished successfully, but the result was lost from the executor's block manager beforeClassorg.apache.sparkApache Spark
TaskSortingClassorg.apache.spark.status.api.v1Apache Spark
TestResultTrait for hypothesis test results.Interfaceorg.apache.spark.mllib.stat.testApache Spark
TimeThis is a simple class that represents an absolute instant of time.Classorg.apache.spark.streamingApache Spark
TimestampType The data type representing java.Classorg.apache.spark.sql.typesApache Spark
TimeTrackingOutputStreamIntercepts write calls and tracks total time spent writing in order to update shuffle write metrics.Classorg.apache.spark.storageApache Spark
Tokenizer A tokenizer that converts the input string to lowercase and then splits it by white Spark
TorrentBroadcastFactoryA Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors.Classorg.apache.spark.broadcastApache Spark
TrainValidationSplit Validation for hyper-parameter Spark
TrainValidationSplitModel Model from train validation Spark
Transformer Abstract class for transformers that transform one dataset into another.Classorg.apache.spark.mlApache Spark
TriangleCountCompute the number of triangles passing through each vertex.Classorg.apache.spark.graphx.libApache Spark
TripletFieldsRepresents a subset of the fields of an EdgeTriplet or EdgeContext.Classorg.apache.spark.graphxApache Spark
TwitterUtilsClassorg.apache.spark.streaming.twitterApache Spark
TypedColumnA Column where an Encoder has been given for the expected input and return type.Classorg.apache.spark.sqlApache Spark
UDF10A Spark SQL UDF that has 10 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF11A Spark SQL UDF that has 11 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF12A Spark SQL UDF that has 12 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF13A Spark SQL UDF that has 13 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF14A Spark SQL UDF that has 14 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF15A Spark SQL UDF that has 15 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF16A Spark SQL UDF that has 16 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF17A Spark SQL UDF that has 17 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF18A Spark SQL UDF that has 18 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF19A Spark SQL UDF that has 19 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF20A Spark SQL UDF that has 20 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF21A Spark SQL UDF that has 21 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF22A Spark SQL UDF that has 22 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF3A Spark SQL UDF that has 3 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF4A Spark SQL UDF that has 4 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF5A Spark SQL UDF that has 5 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF6A Spark SQL UDF that has 6 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF7A Spark SQL UDF that has 7 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF8A Spark SQL UDF that has 8 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDF9A Spark SQL UDF that has 9 arguments.Interfaceorg.apache.spark.sql.api.javaApache Spark
UDFRegistrationFunctions for registering user-defined functions.Classorg.apache.spark.sqlApache Spark
UnaryTransformer Abstract class for transformers that take one input column, apply transformation, and output the result as a new column.Classorg.apache.spark.mlApache Spark
UniformGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
UnionRDDClassorg.apache.spark.rddApache Spark
UnknownReason We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result.Classorg.apache.sparkApache Spark
UnresolvedAttribute An unresolved Spark
Updater Class used to perform steps (weight update) using Gradient Descent methods.Classorg.apache.spark.mllib.optimizationApache Spark
UserDefinedAggregateFunction The base class for implementing user-defined aggregate functions (UDAF).Classorg.apache.spark.sql.expressionsApache Spark
UserDefinedFunctionA user-defined function.Classorg.apache.spark.sqlApache Spark
UserDefinedType The data type for User Defined Types (UDTs).Classorg.apache.spark.sql.typesApache Spark
Variance Class for calculating variance during regression.Classorg.apache.spark.mllib.tree.impurityApache Spark
VectorRepresents a numeric vector, whose index type is Int and value type is Double.Interfaceorg.apache.spark.mllib.linalgApache Spark
VectorClassorg.apache.spark.utilApache Spark
VectorAssembler A feature transformer that merges multiple columns into a vector Spark
VectorAttributeRewriterUtility transformer that rewrites Vector attribute names via prefix Spark
VectorIndexer Class for indexing categorical feature columns in a dataset of Spark
VectorIndexerModel Transform categorical features to use 0-based indices instead of their original Spark
VectorsClassorg.apache.spark.mllib.linalgApache Spark
VectorSlicer This class takes a feature vector and outputs a new feature vector with a subarray of the The subset of features can be specified with either indices (setIndices()) Spark
VectorTransformerInterfaceorg.apache.spark.mllib.featureApache Spark
VectorUDT (AlphaComponent) User-defined type for Vector which allows easy interaction with SQL.Classorg.apache.spark.mllib.linalgApache Spark
VertexRDD pre-indexing the entries for fast, efficient joins.Classorg.apache.spark.graphxApache Spark
VertexRDDImplClassorg.apache.spark.graphx.implApache Spark
VocabWordClassorg.apache.spark.mllib.featureApache Spark
VoidFunction2A two-argument function that takes arguments of type T1 and T2 with no return Spark
WeibullGenerator Generates i.Classorg.apache.spark.mllib.randomApache Spark
Window Utility functions for defining window in DataFrames.Classorg.apache.spark.sql.expressionsApache Spark
WindowSpec A window specification that defines the partitioning, ordering, and frame boundaries.Classorg.apache.spark.sql.expressionsApache Spark
Word2Vec Word2Vec trains a model of Map(String, Vector), Spark
Word2VecClassorg.apache.spark.mllib.featureApache Spark
Word2VecModel Model fitted by Spark
Word2VecModelClassorg.apache.spark.mllib.featureApache Spark
WriteAheadLog This abstract class represents a write ahead log (aka journal) that is used by Spark Streaming to save the received data (by receivers) and associated metadata to a reliable storage, so thatClassorg.apache.spark.streaming.utilApache Spark
WriteAheadLogRecordHandle This abstract class represents a handle that refers to a record written in a It must contain all the information necessary for the record to be read and returned byClassorg.apache.spark.streaming.utilApache Spark
ZeroMQUtilsClassorg.apache.spark.streaming.zeromqApache Spark