Name | Description | Type | Package | Framework |
AbsoluteError | Class for absolute error loss calculation (for regression). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
Accumulable | A data type that can be accumulated, i.e., has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T. | Class | org.apache.spark | Apache Spark |
|
AccumulableInfo | Information about an Accumulable modified during a task or stage. | Class | org.apache.spark.scheduler | Apache Spark |
|
AccumulableInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
AccumulableParam | Helper object defining how to accumulate values of a particular type. | Interface | org.apache.spark | Apache Spark |
|
Accumulator | A simpler value of Accumulable where the result type being accumulated is the same as the types of elements being merged, i. | Class | org.apache.spark | Apache Spark |
|
AccumulatorParam | A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value. | Interface | org.apache.spark | Apache Spark |
|
AccumulatorParam .DoubleAccumulatorParam$ | | Class | org.apache.spark.AccumulatorParam | Apache Spark |
|
AccumulatorParam .FloatAccumulatorParam$ | | Class | org.apache.spark.AccumulatorParam | Apache Spark |
|
AccumulatorParam .IntAccumulatorParam$ | | Class | org.apache.spark.AccumulatorParam | Apache Spark |
|
AccumulatorParam .LongAccumulatorParam$ | | Class | org.apache.spark.AccumulatorParam | Apache Spark |
|
ActorHelper | A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming for processing. | Interface | org.apache.spark.streaming.receiver | Apache Spark |
|
ActorSupervisorStrategy | | Class | org.apache.spark.streaming.receiver | Apache Spark |
|
AFTAggregator | | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTCostFun | | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTSurvivalRegression | Fit a parametric survival regression model named accelerated failure time (AFT) model (https://en. | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTSurvivalRegressionModel | Model produced by AFTSurvivalRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
AggregatedDialect | AggregatedDialect can unify multiple dialects into one virtual Dialect. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
AggregatingEdgeContext | | Class | org.apache.spark.graphx.impl | Apache Spark |
|
Aggregator | A set of functions used to aggregate data. | Class | org.apache.spark | Apache Spark |
|
Aggregator | A base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
Algo | Enum to select the algorithm for the decision tree. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
AlphaComponent | A new component of Spark which may have unstable APIs. | Class | org.apache.spark.annotation | Apache Spark |
|
ALS | Alternating Least Squares (ALS) matrix factorization. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
ALS | | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
ALS .Rating | Rating class for better code readability. | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
ALS .Rating$ | | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
ALSModel | Model fitted by ALS. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
AnalysisException | Thrown when a query fails to analyze, usually because the query itself is invalid. | Class | org.apache.spark.sql | Apache Spark |
|
And | A filter that evaluates to true iff both left and right evaluate to true. | Class | org.apache.spark.sql.sources | Apache Spark |
|
ApplicationAttemptInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ApplicationInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ApplicationStatus | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ArrayType | | Class | org.apache.spark.sql.types | Apache Spark |
|
AskPermissionToCommitOutput | | Class | org.apache.spark.scheduler | Apache Spark |
|
AssociationRules | Generates association rules from an RDD[FreqItemset[Item]]. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
AssociationRules .Rule | An association rule between sets of items. | Class | org.apache.spark.mllib.fpm.AssociationRules | Apache Spark |
|
AsyncRDDActions | A set of asynchronous RDD actions available through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
Attribute | Abstract class for ML attributes. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
AttributeGroup | Attributes that describe a vector ML column. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
AttributeType | An enum-like type for attribute types: AttributeType$. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
BaseRelation | Represents a collection of tuples with a known schema. | Class | org.apache.spark.sql.sources | Apache Spark |
|
BaseRRDD | | Class | org.apache.spark.api.r | Apache Spark |
|
BatchInfo | Class having information on completed batches. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
BernoulliCellSampler | A sampler based on Bernoulli trials for partitioning a data sequence. | Class | org.apache.spark.util.random | Apache Spark |
|
BernoulliSampler | A sampler based on Bernoulli trials. | Class | org.apache.spark.util.random | Apache Spark |
|
Binarizer | Binarize a column of continuous features given a threshold. | Class | org.apache.spark.ml.feature | Apache Spark |
|
BinaryAttribute | A binary attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
BinaryClassificationEvaluator | Evaluator for binary classification, which expects two input columns: rawPrediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
BinaryClassificationMetrics | Evaluator for binary classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
BinaryLogisticRegressionSummary | Binary Logistic regression results for a given model. | Class | org.apache.spark.ml.classification | Apache Spark |
|
BinaryLogisticRegressionTrainingSummary | Logistic regression training results. | Class | org.apache.spark.ml.classification | Apache Spark |
|
BinarySample | Class that represents the group and value of a sample. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
BinaryType | The data type representing Array[Byte] values. | Class | org.apache.spark.sql.types | Apache Spark |
|
BisectingKMeans | A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
BisectingKMeansModel | Clustering model produced by BisectingKMeans. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
BlockId | Identifies a particular Block of data, usually associated with a single file. | Class | org.apache.spark.storage | Apache Spark |
|
BlockManagerId | This class represents a unique identifier for a BlockManager. | Class | org.apache.spark.storage | Apache Spark |
|
BlockMatrix | Represents a distributed matrix in blocks of local matrices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
BlockNotFoundException | | Class | org.apache.spark.storage | Apache Spark |
|
BlockStatus | | Class | org.apache.spark.storage | Apache Spark |
|
BlockUpdatedInfo | Stores information about a block status in a block manager. | Class | org.apache.spark.storage | Apache Spark |
|
BooleanParam | Specialized version of Param[Boolean] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
BooleanType | The data type representing Boolean values. | Class | org.apache.spark.sql.types | Apache Spark |
|
BoostingStrategy | Configuration options for GradientBoostedTrees. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
BoundedDouble | A Double value with error bars and associated confidence. | Class | org.apache.spark.partial | Apache Spark |
|
Broadcast | A broadcast variable. | Class | org.apache.spark.broadcast | Apache Spark |
|
BroadcastBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
BroadcastFactory | An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations). | Interface | org.apache.spark.broadcast | Apache Spark |
|
Broker | Represents the host and port info for a Kafka broker. | Class | org.apache.spark.streaming.kafka | Apache Spark |
|
Bucketizer | Bucketizer maps a column of continuous features to a column of feature buckets. | Class | org.apache.spark.ml.feature | Apache Spark |
|
BufferReleasingInputStream | Helper class that ensures a ManagedBuffer is released upon InputStream. | Class | org.apache.spark.storage | Apache Spark |
|
ByteType | The data type representing Byte values. | Class | org.apache.spark.sql.types | Apache Spark |
|
CalendarIntervalType | The data type representing calendar time intervals. | Class | org.apache.spark.sql.types | Apache Spark |
|
CatalystScan | An interface for experimenting with a more direct connection to the query planner. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
CategoricalSplit | Split which tests a categorical feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
ChiSqSelector | Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ChiSqSelector | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
ChiSqSelectorModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
ChiSqSelectorModel | Chi Squared selector model. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
ChiSqTestResult | Object containing the test results for the chi-squared hypothesis test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
ClassificationModel | Model produced by a Classifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
ClassificationModel | Represents a classification model that predicts to which of a set of categories an example belongs. | Interface | org.apache.spark.mllib.classification | Apache Spark |
|
Classifier | Single-label binary or multiclass classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
CleanAccum | | Class | org.apache.spark | Apache Spark |
|
CleanBroadcast | | Class | org.apache.spark | Apache Spark |
|
CleanCheckpoint | | Class | org.apache.spark | Apache Spark |
|
CleanRDD | | Class | org.apache.spark | Apache Spark |
|
CleanShuffle | | Class | org.apache.spark | Apache Spark |
|
CleanupTaskWeakReference | A WeakReference associated with a CleanupTask. | Class | org.apache.spark | Apache Spark |
|
CoGroupedRDD | An RDD that cogroups its parents. | Class | org.apache.spark.rdd | Apache Spark |
|
CoGroupFunction | | Interface | org.apache.spark.api.java.function | Apache Spark |
|
Column | A column that will be computed based on the data in a DataFrame. | Class | org.apache.spark.sql | Apache Spark |
|
ColumnName | A convenient class used for constructing schema. | Class | org.apache.spark.sql | Apache Spark |
|
ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ComplexFutureAction | A FutureAction for actions that could trigger multiple Spark jobs. | Class | org.apache.spark | Apache Spark |
|
CompressionCodec | CompressionCodec allows the customization of choosing different compression implementations to be used in block storage. | Interface | org.apache.spark.io | Apache Spark |
|
ConnectedComponents | Connected components algorithm. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
ConstantInputDStream | An input stream that always returns the same RDD on each timestep. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
ContinuousSplit | Split which tests a continuous feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
CoordinateMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
CountVectorizer | Extracts a vocabulary from document collections and generates a CountVectorizerModel. | Class | org.apache.spark.ml.feature | Apache Spark |
|
CountVectorizerModel | Converts a text document to a sparse vector of token counts. | Class | org.apache.spark.ml.feature | Apache Spark |
|
CreatableRelationProvider | | Interface | org.apache.spark.sql.sources | Apache Spark |
|
CrossValidator | K-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
CrossValidatorModel | Model from k-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
DataFrame | A distributed collection of data organized into named columns. | Class | org.apache.spark.sql | Apache Spark |
|
DataFrameHolder | A container for a DataFrame, used for implicit conversions. | Class | org.apache.spark.sql | Apache Spark |
|
DataFrameNaFunctions | Functionality for working with missing data in DataFrames. | Class | org.apache.spark.sql | Apache Spark |
|
DataFrameReader | Interface used to load a DataFrame from external storage systems (e. | Class | org.apache.spark.sql | Apache Spark |
|
DataFrameStatFunctions | Statistic functions for DataFrames. | Class | org.apache.spark.sql | Apache Spark |
|
DataFrameWriter | Interface used to write a DataFrame to external storage systems (e. | Class | org.apache.spark.sql | Apache Spark |
|
Dataset | A Dataset is a strongly typed collection of objects that can be transformed in parallel using functional or relational operations. | Class | org.apache.spark.sql | Apache Spark |
|
DatasetHolder | A container for a Dataset, used for implicit conversions. | Class | org.apache.spark.sql | Apache Spark |
|
DataSourceRegister | Data sources should implement this trait so that they can register an alias to their data source. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
DataType | The base type of all Spark SQL data types. | Class | org.apache.spark.sql.types | Apache Spark |
|
DataTypes | To get or create a specific data type, users should use the singleton objects and factory methods provided by this class. | Class | org.apache.spark.sql.types | Apache Spark |
|
DataValidators | A collection of methods used to validate data before applying ML algorithms. | Class | org.apache.spark.mllib.util | Apache Spark |
|
DateType | A date type, supporting "0001-01-01" through "9999-12-31". | Class | org.apache.spark.sql.types | Apache Spark |
|
DB2Dialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
DCT | A feature transformer that takes the 1D discrete cosine transform of a real vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Decimal | A mutable implementation of BigDecimal that can hold a Long if values are small enough. | Class | org.apache.spark.sql.types | Apache Spark |
|
DecimalType | | Class | org.apache.spark.sql.types | Apache Spark |
|
DecisionTree | A class which implements a decision tree learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
DecisionTreeClassificationModel | Decision tree model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
DecisionTreeClassifier | Decision tree learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
DecisionTreeModel | Decision tree model for classification or regression. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
DecisionTreeRegressionModel | Decision tree model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
DecisionTreeRegressor | Decision tree learning algorithm for regression. It supports both continuous and categorical features. | Class | org.apache.spark.ml.regression | Apache Spark |
|
DefaultSource | The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame. | Class | org.apache.spark.ml.source.libsvm | Apache Spark |
|
DenseMatrix | Column-major dense matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
DenseVector | A dense vector represented by a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
Dependency | Base class for dependencies. | Class | org.apache.spark | Apache Spark |
|
DerbyDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
DeserializationStream | A stream for reading serialized objects. | Class | org.apache.spark.serializer | Apache Spark |
|
DeveloperApi | A lower-level, unstable API intended for developers. | Class | org.apache.spark.annotation | Apache Spark |
|
DistributedLDAModel | Distributed model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
DistributedLDAModel | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
DistributedMatrix | Represents a distributively stored matrix backed by one or more RDDs. | Interface | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
DoubleArrayParam | Specialized version of Param[Array[Double]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
DoubleFlatMapFunction | A function that returns zero or more records of type Double from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
DoubleFunction | A function that returns Doubles, and can be used to construct DoubleRDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
DoubleParam | Specialized version of Param[Double] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
DoubleRDDFunctions | Extra functions available on RDDs of Doubles through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
DoubleType | The data type representing Double values. | Class | org.apache.spark.sql.types | Apache Spark |
|
DStream | A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
DummySerializerInstance | Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter. | Class | org.apache.spark.serializer | Apache Spark |
|
Duration | | Class | org.apache.spark.streaming | Apache Spark |
|
Durations | | Class | org.apache.spark.streaming | Apache Spark |
|
Edge | A single directed edge consisting of a source id, target id, and the data associated with the edge. | Class | org.apache.spark.graphx | Apache Spark |
|
EdgeActiveness | Criteria for filtering edges based on activeness. | Class | org.apache.spark.graphx.impl | Apache Spark |
|
EdgeContext | Represents an edge along with its neighboring vertices and allows sending messages along the edge. | Class | org.apache.spark.graphx | Apache Spark |
|
EdgeDirection | The direction of a directed edge relative to a vertex. | Class | org.apache.spark.graphx | Apache Spark |
|
EdgeRDD | EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance. | Class | org.apache.spark.graphx | Apache Spark |
|
EdgeRDDImpl | | Class | org.apache.spark.graphx.impl | Apache Spark |
|
EdgeTriplet | An edge triplet represents an edge along with the vertex attributes of its neighboring vertices. | Class | org.apache.spark.graphx | Apache Spark |
|
ElementwiseProduct | Outputs the Hadamard product (i. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ElementwiseProduct | Outputs the Hadamard product (i. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
EMLDAOptimizer | Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Encoder | Used to convert a JVM object of type T to and from the internal Spark SQL representation. | Interface | org.apache.spark.sql | Apache Spark |
|
Encoders | Methods for creating an Encoder. | Class | org.apache.spark.sql | Apache Spark |
|
Entropy | Class for calculating entropy during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
EnumUtil | | Class | org.apache.spark.util | Apache Spark |
|
EnvironmentListener | | Class | org.apache.spark.ui.env | Apache Spark |
|
EqualNullSafe | Performs equality comparison, similar to EqualTo. | Class | org.apache.spark.sql.sources | Apache Spark |
|
EqualTo | A filter that evaluates to true iff the attribute evaluates to a value equal to value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
Estimator | Abstract class for estimators that fit models to data. | Class | org.apache.spark.ml | Apache Spark |
|
Evaluator | Abstract class for evaluators that compute metrics from predictions. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
ExceptionFailure | Task failed due to a runtime exception. | Class | org.apache.spark | Apache Spark |
|
ExecutionListenerManager | Manager for QueryExecutionListener. | Class | org.apache.spark.sql.util | Apache Spark |
|
ExecutorInfo | Stores information about an executor to pass from the scheduler to SparkListeners. | Class | org.apache.spark.scheduler.cluster | Apache Spark |
|
ExecutorLostFailure | The task failed because the executor that it was running on was lost. | Class | org.apache.spark | Apache Spark |
|
ExecutorRegistered | | Class | org.apache.spark | Apache Spark |
|
ExecutorRemoved | | Class | org.apache.spark | Apache Spark |
|
ExecutorsListener | | Class | org.apache.spark.ui.exec | Apache Spark |
|
ExecutorStageSummary | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ExecutorSummary | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ExpectationSum | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Experimental | An experimental user-facing API. | Class | org.apache.spark.annotation | Apache Spark |
|
ExperimentalMethods | Holder for experimental methods for the bravest. | Class | org.apache.spark.sql | Apache Spark |
|
ExponentialGenerator | Generates i. | Class | org.apache.spark.mllib.random | Apache Spark |
|
FeatureType | Enum to describe whether a feature is "continuous" or "categorical". | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
FetchFailed | Task failed to fetch shuffle data from a remote node. | Class | org.apache.spark | Apache Spark |
|
Filter | A filter predicate for data sources. | Class | org.apache.spark.sql.sources | Apache Spark |
|
FilterFunction | Base interface for a function used in Dataset's filter function. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
FlatMapFunction | A function that returns zero or more output records from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
FlatMapFunction2 | A function that takes two inputs and returns zero or more output records. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
FlatMapGroupsFunction | A function that returns zero or more output records from each grouping key and its values. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
FloatParam | Specialized version of Param[Float] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
FloatType | The data type representing Float values. | Class | org.apache.spark.sql.types | Apache Spark |
|
FlumeUtils | | Class | org.apache.spark.streaming.flume | Apache Spark |
|
ForeachFunction | Base interface for a function used in Dataset's foreach function. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
ForeachPartitionFunction | Base interface for a function used in Dataset's foreachPartition function. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
FPGrowth | A parallel FP-growth algorithm to mine frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
FPGrowth .FreqItemset | param: items items in this itemset. | Class | org.apache.spark.mllib.fpm.FPGrowth | Apache Spark |
|
FPGrowthModel | Model trained by FPGrowth, which holds frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
Function | Base interface for functions whose return types do not create special RDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
Function2 | A two-argument function that takes arguments of type T1 and T2 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
Function3 | A three-argument function that takes arguments of type T1, T2 and T3 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
Function4 | A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
functions | | Class | org.apache.spark.sql | Apache Spark |
|
FutureAction | A future for the result of an action to support cancellation. | Interface | org.apache.spark | Apache Spark |
|
GammaGenerator | Generates i. | Class | org.apache.spark.mllib.random | Apache Spark |
|
GaussianMixture | This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
GaussianMixtureModel | Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i=1. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
GBTClassificationModel | Gradient-Boosted Trees (GBTs) model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
GBTClassifier | Gradient-Boosted Trees (GBTs) learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
GBTRegressionModel | Gradient-Boosted Trees (GBTs) model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
GBTRegressor | Gradient-Boosted Trees (GBTs) learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
GeneralizedLinearAlgorithm | GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM). | Class | org.apache.spark.mllib.regression | Apache Spark |
|
GeneralizedLinearModel | GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
Gini | Class for calculating the Gini impurity during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
Gradient | Class used to compute the gradient for a loss function, given a single data point. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
GradientBoostedTrees | A class that implements Stochastic Gradient Boosting. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
GradientBoostedTreesModel | Represents a gradient boosted trees model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
GradientDescent | Class used to solve an optimization problem using Gradient Descent. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
Graph | The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges. | Class | org.apache.spark.graphx | Apache Spark |
|
GraphGenerators | A collection of graph generating functions. | Class | org.apache.spark.graphx.util | Apache Spark |
|
GraphImpl | An implementation of Graph to support computation on graphs. | Class | org.apache.spark.graphx.impl | Apache Spark |
|
GraphKryoRegistrator | Registers GraphX classes with Kryo for improved performance. | Class | org.apache.spark.graphx | Apache Spark |
|
GraphLoader | Provides utilities for loading Graphs from files. | Class | org.apache.spark.graphx | Apache Spark |
|
GraphOps | Contains additional functionality for Graph. | Class | org.apache.spark.graphx | Apache Spark |
|
GraphXUtils | | Class | org.apache.spark.graphx | Apache Spark |
|
GreaterThan | A filter that evaluates to true iff the attribute evaluates to a value greater than value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
GreaterThanOrEqual | A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
GroupedData | A set of methods for aggregations on a DataFrame, created by DataFrame. | Class | org.apache.spark.sql | Apache Spark |
|
GroupedDataset | A Dataset that has been logically grouped by a user-specified grouping key. | Class | org.apache.spark.sql | Apache Spark |
|
HadoopFsRelation | A BaseRelation that provides much of the common code required for relations that store their data to an HDFS compatible filesystem. | Class | org.apache.spark.sql.sources | Apache Spark |
|
HadoopFsRelation .FakeFileStatus | | Class | org.apache.spark.sql.sources.HadoopFsRelation | Apache Spark |
|
HadoopFsRelation .FakeFileStatus$ | | Class | org.apache.spark.sql.sources.HadoopFsRelation | Apache Spark |
|
HadoopFsRelationProvider | Implemented by objects that produce relations for a specific kind of data source with a given schema and partitioned columns. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
HadoopRDD | An RDD that provides core functionality for reading data stored in Hadoop (e. | Class | org.apache.spark.rdd | Apache Spark |
|
HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.ml.feature | Apache Spark |
|
HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
HashPartitioner | A Partitioner that implements hash-based partitioning using Java's Object. | Class | org.apache.spark | Apache Spark |
|
HasOffsetRanges | Represents any object that has a collection of OffsetRanges. | Interface | org.apache.spark.streaming.kafka | Apache Spark |
|
HingeGradient | Compute gradient and loss for a Hinge loss function, as used in SVM binary classification. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
HiveContext | An instance of the Spark SQL execution engine that integrates with data stored in Hive. | Class | org.apache.spark.sql.hive | Apache Spark |
|
HttpBroadcastFactory | A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism. | Class | org.apache.spark.broadcast | Apache Spark |
|
Identifiable | Trait for an object with an immutable unique ID that identifies itself and its derivatives. | Interface | org.apache.spark.ml.util | Apache Spark |
|
IDF | Compute the Inverse Document Frequency (IDF) given a collection of documents. | Class | org.apache.spark.ml.feature | Apache Spark |
|
IDF | Inverse document frequency (IDF). | Class | org.apache.spark.mllib.feature | Apache Spark |
|
IDF .DocumentFrequencyAggregator | Document frequency aggregator. | Class | org.apache.spark.mllib.feature.IDF | Apache Spark |
|
IDFModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
IDFModel | Represents an IDF model that can transform term frequency vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Impurity | Trait for calculating information gain. | Interface | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
In | A filter that evaluates to true iff the attribute evaluates to one of the values in the array. | Class | org.apache.spark.sql.sources | Apache Spark |
|
IndexedRow | Represents a row of IndexedRowMatrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
IndexedRowMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
IndexToString | | Class | org.apache.spark.ml.feature | Apache Spark |
|
InformationGainStats | Information gain statistics for each split. Parameter gain: information gain value. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
InnerClosureFinder | | Class | org.apache.spark.util | Apache Spark |
|
InputDStream | This is the abstract base class for all input streams. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
InputFormatInfo | Parses and holds information about inputFormat (and files) specified as a parameter. | Class | org.apache.spark.scheduler | Apache Spark |
|
InputMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
InputMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
InsertableRelation | A BaseRelation that can be used to insert data into it through the insert method. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
IntArrayParam | Specialized version of Param[Array[Int]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
IntegerType | The data type representing Int values. | Class | org.apache.spark.sql.types | Apache Spark |
|
Interaction | Implements the feature interaction transform. | Class | org.apache.spark.ml.feature | Apache Spark |
|
InternalNode | Internal Decision Tree node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
InterruptibleIterator | An iterator that wraps around an existing iterator to provide task killing functionality. | Class | org.apache.spark | Apache Spark |
|
IntParam | Specialized version of Param[Int] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
IsNotNull | A filter that evaluates to true iff the attribute evaluates to a non-null value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
IsNull | A filter that evaluates to true iff the attribute evaluates to null. | Class | org.apache.spark.sql.sources | Apache Spark |
|
IsotonicRegression | | Class | org.apache.spark.ml.regression | Apache Spark |
|
IsotonicRegression | | Class | org.apache.spark.mllib.regression | Apache Spark |
|
IsotonicRegressionModel | Model fitted by IsotonicRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
IsotonicRegressionModel | Regression model for isotonic regression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
JavaDoubleRDD | | Class | org.apache.spark.api.java | Apache Spark |
|
JavaDStream | A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaDStreamLike | | Interface | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaFutureAction | | Interface | org.apache.spark.api.java | Apache Spark |
|
JavaHadoopRDD | | Class | org.apache.spark.api.java | Apache Spark |
|
JavaInputDStream | A Java-friendly interface to InputDStream. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaIterableWrapperSerializer | A Kryo serializer for serializing results returned by asJavaIterable. | Class | org.apache.spark.serializer | Apache Spark |
|
JavaMapWithStateDStream | DStream representing the stream of data generated by mapWithState operation on a JavaPairDStream. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaNewHadoopRDD | | Class | org.apache.spark.api.java | Apache Spark |
|
JavaPairDStream | A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaPairInputDStream | A Java-friendly interface to InputDStream of key-value pairs. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaPairRDD | | Class | org.apache.spark.api.java | Apache Spark |
|
JavaPairReceiverInputDStream | A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaParams | Java-friendly wrapper for Params. | Class | org.apache.spark.ml.param | Apache Spark |
|
JavaRDD | | Class | org.apache.spark.api.java | Apache Spark |
|
JavaRDDLike | Defines operations common to several Java RDD implementations. | Interface | org.apache.spark.api.java | Apache Spark |
|
JavaReceiverInputDStream | A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JavaSerializer | A Spark serializer that uses Java's built-in serialization. | Class | org.apache.spark.serializer | Apache Spark |
|
JavaSparkContext | A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones. | Class | org.apache.spark.api.java | Apache Spark |
|
JavaSparkListener | Java clients should extend this class instead of implementing SparkListener directly. | Class | org.apache.spark | Apache Spark |
|
JavaSparkStatusTracker | Low-level status reporting APIs for monitoring job and stage progress. | Class | org.apache.spark.api.java | Apache Spark |
|
JavaStreamingContext | A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality. | Class | org.apache.spark.streaming.api.java | Apache Spark |
|
JdbcDialect | Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
JdbcDialects | Registry of dialects that apply to every new jdbc DataFrame. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
JdbcRDD | An RDD that executes an SQL query on a JDBC connection and reads results. | Class | org.apache.spark.rdd | Apache Spark |
|
JdbcType | A database type definition coupled with the jdbc type needed to send null values to the database. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
JobData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
JobExecutionStatus | Enum of job execution statuses. | Class | org.apache.spark | Apache Spark |
|
JobLogger | A logger class to record runtime information for jobs in Spark. | Class | org.apache.spark.scheduler | Apache Spark |
|
JobProgressListener | Tracks task-level information to be displayed in the UI. | Class | org.apache.spark.ui.jobs | Apache Spark |
|
JobSucceeded | | Class | org.apache.spark.scheduler | Apache Spark |
|
KafkaUtils | | Class | org.apache.spark.streaming.kafka | Apache Spark |
|
KernelDensity | Kernel density estimation. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
KillTask | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
KinesisUtils | | Class | org.apache.spark.streaming.kinesis | Apache Spark |
|
KinesisUtilsPythonHelper | This is a helper class that wraps the methods in KinesisUtils into more Python-friendly class and function so that it can be easily instantiated and called from Python's KinesisUtils. | Class | org.apache.spark.streaming.kinesis | Apache Spark |
|
KMeans | | Class | org.apache.spark.ml.clustering | Apache Spark |
|
KMeans | K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means parallel algorithm by Bahmani et al.). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
KMeansDataGenerator | Generate test data for KMeans. | Class | org.apache.spark.mllib.util | Apache Spark |
|
KMeansModel | Model fitted by KMeans. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
KMeansModel | A clustering model for K-means. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
KolmogorovSmirnovTestResult | Object containing the test results for the Kolmogorov-Smirnov test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
KryoRegistrator | | Interface | org.apache.spark.serializer | Apache Spark |
|
KryoSerializer | A Spark serializer that uses the Kryo serialization library. | Class | org.apache.spark.serializer | Apache Spark |
|
L1Updater | Updater for L1 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LabelConverter | Label to vector converter. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LabeledPoint | Class that represents the features and labels of a data point. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LabelPropagation | Label Propagation algorithm. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
LassoModel | Regression model trained using Lasso. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LassoWithSGD | Train a regression model with L1-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LBFGS | Class used to solve an optimization problem using Limited-memory BFGS. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
LDAModel | Model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LDAModel | Latent Dirichlet Allocation (LDA) model. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
LDAOptimizer | An LDAOptimizer specifies which optimization/learning/inference algorithm to use, and it can hold optimizer-specific parameters for users to set. | Interface | org.apache.spark.mllib.clustering | Apache Spark |
|
LeafNode | Decision tree leaf node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
LeastSquaresAggregator | LeastSquaresAggregator computes the gradient and loss for a least-squares loss function, as used in linear regression, for samples in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LeastSquaresCostFun | LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LeastSquaresGradient | Compute gradient and loss for a Least-squared loss function, as used in linear regression. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LessThan | A filter that evaluates to true iff the attribute evaluates to a value less than value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
LessThanOrEqual | A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
LinearDataGenerator | Generate sample data used for Linear Data. | Class | org.apache.spark.mllib.util | Apache Spark |
|
LinearRegression | The learning objective is to minimize the squared error, with regularization. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionModel | Model produced by LinearRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionModel | Regression model trained using LinearRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LinearRegressionSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionTrainingSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionWithSGD | Train a linear regression model with no regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
Loader | Trait for classes which can load models and transformers from files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
LocalLDAModel | Local (non-distributed) model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LocalLDAModel | This model stores only the inferred topics. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Logging | Utility trait for classes that want to log data. | Interface | org.apache.spark | Apache Spark |
|
LogisticAggregator | LogisticAggregator computes the gradient and loss for a binary logistic loss function, as used in binary classification, for instances in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticCostFun | LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticGradient | Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LogisticRegression | Logistic regression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionDataGenerator | Generate test data for LogisticRegression. | Class | org.apache.spark.mllib.util | Apache Spark |
|
LogisticRegressionModel | Model produced by LogisticRegression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionModel | Classification model trained using Multinomial/Binary Logistic Regression. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogisticRegressionSummary | Abstraction for Logistic Regression Results for a given model. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionTrainingSummary | Abstraction for multinomial Logistic Regression Training results. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionWithLBFGS | Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogisticRegressionWithSGD | Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogLoss | Class for log loss calculation (for classification). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
LogNormalGenerator | Generates i.i.d. samples from the log normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
LongParam | Specialized version of Param[Long] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
LongType | The data type representing Long values. | Class | org.apache.spark.sql.types | Apache Spark |
|
Loss | Trait for adding "pluggable" loss functions for the gradient boosting algorithm. | Interface | org.apache.spark.mllib.tree.loss | Apache Spark |
|
Losses | | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
LZ4CompressionCodec | LZ4 implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
|
LZFCompressionCodec | LZF implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
|
MapFunction | Base interface for a map function used in Dataset's map function. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
MapGroupsFunction | Base interface for a map function used in GroupedDataset's mapGroup function. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
MapPartitionsFunction | Base interface for function used in Dataset's mapPartitions. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
MapType | The data type for Maps. | Class | org.apache.spark.sql.types | Apache Spark |
|
MapWithStateDStream | DStream representing the stream of data generated by the mapWithState operation on a pair DStream. It also gives access to the stream of state snapshots. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
Matrices | Factory methods for Matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
Matrix | Trait for a local matrix. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
MatrixEntry | Represents an entry in a distributed matrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
MatrixFactorizationModel | Model representing the result of matrix factorization. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
MemoryEntry | | Class | org.apache.spark.storage | Apache Spark |
|
Metadata | Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean, Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and Array[Metadata]. | Class | org.apache.spark.sql.types | Apache Spark |
|
MetadataBuilder | Builder for Metadata. | Class | org.apache.spark.sql.types | Apache Spark |
|
MethodIdentifier | Helper class to identify a method. | Class | org.apache.spark.util | Apache Spark |
|
MFDataGenerator | Generate RDD(s) containing data for Matrix Factorization. | Class | org.apache.spark.mllib.util | Apache Spark |
|
Milliseconds | Helper object that creates an instance of Duration representing a given number of milliseconds. | Class | org.apache.spark.streaming | Apache Spark |
|
MinMaxScaler | Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling. | Class | org.apache.spark.ml.feature | Apache Spark |
|
MinMaxScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
Minutes | Helper object that creates an instance of Duration representing a given number of minutes. | Class | org.apache.spark.streaming | Apache Spark |
|
MLPairRDDFunctions | Machine learning specific Pair RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
MLReadable | Trait for objects that provide MLReader. | Interface | org.apache.spark.ml.util | Apache Spark |
|
MLReader | Abstract class for utility classes that can load ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
MLUtils | Helper methods to load, save and pre-process data used in MLlib. | Class | org.apache.spark.mllib.util | Apache Spark |
|
MLWritable | Trait for classes that provide MLWriter. | Interface | org.apache.spark.ml.util | Apache Spark |
|
MLWriter | Abstract class for utility classes that can save ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
Model | A fitted model, i.e., a Transformer produced by an Estimator. | Class | org.apache.spark.ml | Apache Spark |
|
MQTTUtils | | Class | org.apache.spark.streaming.mqtt | Apache Spark |
|
MsSqlServerDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
MulticlassClassificationEvaluator | Evaluator for multiclass classification, which expects two input columns: score and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
MulticlassMetrics | Evaluator for multiclass classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
MultilabelMetrics | Evaluator for multilabel classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
MultilayerPerceptronClassificationModel | Classification model based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
MultilayerPerceptronClassifier | Classifier trainer based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
MultivariateGaussian | This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. | Class | org.apache.spark.mllib.stat.distribution | Apache Spark |
|
MultivariateOnlineSummarizer | MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector format in an online fashion. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
MultivariateStatisticalSummary | Trait for multivariate statistical summary of a data matrix. | Interface | org.apache.spark.mllib.stat | Apache Spark |
|
MutableAggregationBuffer | A Row representing a mutable aggregation buffer. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
MutablePair | A tuple of 2 elements. | Class | org.apache.spark.util | Apache Spark |
|
MySQLDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
NaiveBayes | Naive Bayes Classifiers. | Class | org.apache.spark.ml.classification | Apache Spark |
|
NaiveBayes | | Class | org.apache.spark.mllib.classification | Apache Spark |
|
NaiveBayesModel | Model produced by NaiveBayes. Parameter pi: log of class priors, whose dimension is C (number of classes). | Class | org.apache.spark.ml.classification | Apache Spark |
|
NaiveBayesModel | Model for Naive Bayes Classifiers. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
NarrowDependency | Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD. | Class | org.apache.spark | Apache Spark |
|
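NewHadoopRDD | An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the new MapReduce API (org.apache.hadoop.mapreduce). | Class | org.apache.spark.rdd | Apache Spark |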
|
NGram | A feature transformer that converts the input array of strings into an array of n-grams. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Node | Decision tree node interface. | Class | org.apache.spark.ml.tree | Apache Spark |
|
Node | Node in a decision tree. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
NominalAttribute | A nominal attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
NoopDialect | NOOP dialect object, always returning the neutral element. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
Normalizer | Normalize a vector to have unit norm using the given p-norm. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Normalizer | Normalizes samples individually to unit L^p^ norm. For any 1 <= p < Double.PositiveInfinity, normalizes samples using sum(abs(vector)^p^)^(1/p)^ as the norm. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Not | A filter that evaluates to true iff child is evaluated to false. | Class | org.apache.spark.sql.sources | Apache Spark |
|
NullType | The data type representing NULL values. | Class | org.apache.spark.sql.types | Apache Spark |
|
NumericAttribute | A numeric attribute with optional summary statistics. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
NumericType | Numeric data types. | Class | org.apache.spark.sql.types | Apache Spark |
|
OffsetRange | Represents a range of offsets from a single Kafka TopicAndPartition. | Class | org.apache.spark.streaming.kafka | Apache Spark |
|
OneHotEncoder | A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. | Class | org.apache.spark.ml.feature | Apache Spark |
|
OneToOneDependency | Represents a one-to-one dependency between partitions of the parent and child RDDs. | Class | org.apache.spark | Apache Spark |
|
OneVsRest | Reduction of Multiclass Classification to Binary Classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
OneVsRestModel | Model produced by OneVsRest. | Class | org.apache.spark.ml.classification | Apache Spark |
|
OnlineLDAOptimizer | An online optimizer for LDA. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Optimizer | Trait for optimization problem solvers. | Interface | org.apache.spark.mllib.optimization | Apache Spark |
|
Or | A filter that evaluates to true iff at least one of left or right evaluates to true. | Class | org.apache.spark.sql.sources | Apache Spark |
|
OracleDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
OrderedRDDFunctions | Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
OutputMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
OutputMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
OutputOperationInfo | Class having information on output operations. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
OutputWriter | OutputWriter is used together with HadoopFsRelation for persisting rows to the underlying file system. | Class | org.apache.spark.sql.sources | Apache Spark |
|
OutputWriterFactory | A factory that produces OutputWriters. | Class | org.apache.spark.sql.sources | Apache Spark |
|
PageRank | PageRank algorithm implementation. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
PairDStreamFunctions | Extra functions available on DStream of (key, value) pairs through an implicit conversion. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
PairFlatMapFunction | A function that returns zero or more key-value pair records from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
PairFunction | A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
PairRDDFunctions | Extra functions available on RDDs of (key, value) pairs through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
PairwiseRRDD | Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R. | Class | org.apache.spark.api.r | Apache Spark |
|
Param | A param with self-contained documentation and optionally default value. | Class | org.apache.spark.ml.param | Apache Spark |
|
ParamGridBuilder | Builder for a param grid used in grid search-based model selection. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
ParamMap | A param to value map. | Class | org.apache.spark.ml.param | Apache Spark |
|
ParamPair | A param and its value. | Class | org.apache.spark.ml.param | Apache Spark |
|
Params | | Interface | org.apache.spark.ml.param | Apache Spark |
|
ParamValidators | Factory methods for common validation functions for Param. | Class | org.apache.spark.ml.param | Apache Spark |
|
PartialResult | | Class | org.apache.spark.partial | Apache Spark |
|
Partition | An identifier for a partition in an RDD. | Interface | org.apache.spark | Apache Spark |
|
PartitionCoalescer | Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones. | Class | org.apache.spark.rdd | Apache Spark |
|
Partitioner | An object that defines how the elements in a key-value pair RDD are partitioned by key. | Class | org.apache.spark | Apache Spark |
|
PartitionGroup | | Class | org.apache.spark.rdd | Apache Spark |
|
PartitionPruningRDD | An RDD used to prune RDD partitions so we can avoid launching tasks on all partitions. | Class | org.apache.spark.rdd | Apache Spark |
|
PartitionStrategy | | Interface | org.apache.spark.graphx | Apache Spark |
|
PartitionStrategy .CanonicalRandomVertexCut$ | Assigns edges to partitions by hashing the source and destination vertex IDs in a canonical direction, resulting in a random vertex cut that colocates all edges between two vertices, regardless of direction. | Class | org.apache.spark.graphx.PartitionStrategy | Apache Spark |
|
PartitionStrategy .EdgePartition1D$ | Assigns edges to partitions using only the source vertex ID, colocating edges with the same source. | Class | org.apache.spark.graphx.PartitionStrategy | Apache Spark |
|
PartitionStrategy .EdgePartition2D$ | Assigns edges to partitions using a 2D partitioning of the sparse edge adjacency matrix, guaranteeing a 2 * sqrt(numParts) bound on vertex replication. | Class | org.apache.spark.graphx.PartitionStrategy | Apache Spark |
|
PartitionStrategy .RandomVertexCut$ | Assigns edges to partitions by hashing the source and destination vertex IDs, resulting in a random vertex cut that colocates all same-direction edges between two vertices. | Class | org.apache.spark.graphx.PartitionStrategy | Apache Spark |
|
PCA | PCA trains a model to project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.ml.feature | Apache Spark |
|
PCA | A feature transformer that projects vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
PCAModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
PCAModel | Model fitted by PCA that can project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Pipeline | A simple pipeline, which acts as an estimator. | Class | org.apache.spark.ml | Apache Spark |
|
PipelineModel | Represents a fitted pipeline. | Class | org.apache.spark.ml | Apache Spark |
|
PipelineStage | A stage in a pipeline, either an Estimator or a Transformer. | Class | org.apache.spark.ml | Apache Spark |
|
PMMLExportable | Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format. | Interface | org.apache.spark.mllib.pmml | Apache Spark |
|
PoissonGenerator | Generates i.i.d. samples from the Poisson distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
PoissonSampler | A sampler for sampling with replacement, based on values drawn from Poisson distribution. | Class | org.apache.spark.util.random | Apache Spark |
|
PolynomialExpansion | Perform feature expansion in a polynomial space. | Class | org.apache.spark.ml.feature | Apache Spark |
|
PortableDataStream | A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read. | Class | org.apache.spark.input | Apache Spark |
|
PostgresDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
PowerIterationClustering | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
PowerIterationClustering .Assignment | Cluster assignment. Parameter cluster: assigned cluster id. | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
PowerIterationClustering .Assignment$ | | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
PowerIterationClusteringModel | Model produced by PowerIterationClustering. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
PrecisionInfo | Precision parameters for a Decimal. | Class | org.apache.spark.sql.types | Apache Spark |
|
Predict | Predicted value for a node. Parameter predict: predicted value. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
PredictionModel | Abstraction for a model for prediction tasks (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
Predictor | Abstraction for prediction problems (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
PrefixSpan | A parallel PrefixSpan algorithm to mine frequent sequential patterns. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
PrefixSpan .FreqSequence | Represents a frequent sequence. | Class | org.apache.spark.mllib.fpm.PrefixSpan | Apache Spark |
|
PrefixSpanModel | Model fitted by PrefixSpan. Parameter freqSequences: frequent sequences. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
Pregel | Unlike the original Pregel API, the GraphX Pregel API factors the sendMessage computation over edges, enables the message sending computation to read both vertex attributes, and constrains messages to the graph structure. | Class | org.apache.spark.graphx | Apache Spark |
|
Private | A class that is considered private to the internals of Spark; there is a high likelihood it will be changed in future versions of Spark. | Class | org.apache.spark.annotation | Apache Spark |
|
ProbabilisticClassificationModel | Model produced by a ProbabilisticClassifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
ProbabilisticClassifier | Single-label binary or multiclass classifier which can output class conditional probabilities. | Class | org.apache.spark.ml.classification | Apache Spark |
|
PrunedFilteredScan | A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
PrunedScan | A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
QRDecomposition | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
QuantileDiscretizer | QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features. | Class | org.apache.spark.ml.feature | Apache Spark |
|
QuantileStrategy | Enum for selecting the quantile calculation strategy. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
QueryExecutionListener | The interface of query execution listener that can be used to analyze execution metrics. | Interface | org.apache.spark.sql.util | Apache Spark |
|
RandomDataGenerator | Trait for random data generators that generate i.i.d. data. | Interface | org.apache.spark.mllib.random | Apache Spark |
|
RandomForest | A class that implements a Random Forest learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
RandomForestClassificationModel | Random Forest model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
RandomForestClassifier | Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. | Class | org.apache.spark.ml.classification | Apache Spark |
|
RandomForestModel | Represents a random forest model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
RandomForestRegressionModel | Random Forest model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RandomForestRegressor | Random Forest learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RandomRDDs | Generator methods for creating RDDs comprised of i.i.d. samples from some distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
RandomSampler | A pseudorandom sampler. | Interface | org.apache.spark.util.random | Apache Spark |
|
RangeDependency | Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs. | Class | org.apache.spark | Apache Spark |
|
RangePartitioner | A Partitioner that partitions sortable records by range into roughly equal ranges. | Class | org.apache.spark | Apache Spark |
|
RankingMetrics | Evaluator for ranking algorithms. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
Rating | A more compact class to represent a rating than Tuple3[Int, Int, Double]. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
RDD | A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. | Class | org.apache.spark.rdd | Apache Spark |
|
RDDBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
RDDDataDistribution | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
RDDFunctions | Machine learning specific RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
RDDInfo | | Class | org.apache.spark.storage | Apache Spark |
|
RDDPartitionInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
RDDStorageInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
Receiver | Abstract class of a receiver that can be run on worker nodes to receive external data. | Class | org.apache.spark.streaming.receiver | Apache Spark |
|
ReceiverInfo | Class having information about a receiver. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
ReceiverInputDStream | Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
ReduceFunction | Base interface for a function used in Dataset's reduce. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
RegexTokenizer | A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false). | Class | org.apache.spark.ml.feature | Apache Spark |
|
RegressionEvaluator | Evaluator for regression, which expects two input columns: prediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
RegressionMetrics | Evaluator for regression. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
RegressionModel | Model produced by a Regressor. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RegressionModel | | Interface | org.apache.spark.mllib.regression | Apache Spark |
|
RelationProvider | Implemented by objects that produce relations for a specific kind of data source. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
Resubmitted | A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed. | Class | org.apache.spark | Apache Spark |
|
ReturnStatementFinder | | Class | org.apache.spark.util | Apache Spark |
|
ReviveOffers | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
RFormula | Implements the transforms required for fitting a dataset against an R model formula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
RFormulaModel | A fitted RFormula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
RidgeRegressionModel | Regression model trained using RidgeRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
RidgeRegressionWithSGD | Train a regression model with L2-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
Row | Represents one row of output from a relational operator. | Interface | org.apache.spark.sql | Apache Spark |
|
RowFactory | A factory class used to construct Row objects. | Class | org.apache.spark.sql | Apache Spark |
|
RowMatrix | Represents a row-oriented distributed Matrix with no meaningful row indices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
RpcUtils | | Class | org.apache.spark.util | Apache Spark |
|
RRDD | An RDD that stores serialized R objects as Array[Byte]. | Class | org.apache.spark.api.r | Apache Spark |
|
RuntimePercentage | | Class | org.apache.spark.scheduler | Apache Spark |
|
Saveable | Trait for models and transformers which may be saved as files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
SaveMode | SaveMode is used to specify the expected behavior of saving a DataFrame to a data source. | Class | org.apache.spark.sql | Apache Spark |
|
SchedulingMode | "FAIR" and "FIFO" determine which policy is used to order tasks among a Schedulable's sub-queues. | Class | org.apache.spark.scheduler | Apache Spark |
|
SchemaRelationProvider | Implemented by objects that produce relations for a specific kind of data source with a given schema. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
ScriptTransformationWriterThread | | Class | org.apache.spark.sql.hive.execution | Apache Spark |
|
Seconds | Helper object that creates an instance of Duration representing a given number of seconds. | Class | org.apache.spark.streaming | Apache Spark |
|
SequenceFileRDDFunctions | Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
SerializableWritable | | Class | org.apache.spark | Apache Spark |
|
SerializationStream | A stream for writing serialized objects. | Class | org.apache.spark.serializer | Apache Spark |
|
Serializer | A serializer. | Class | org.apache.spark.serializer | Apache Spark |
|
SerializerInstance | An instance of a serializer, for use by one thread at a time. | Class | org.apache.spark.serializer | Apache Spark |
|
ShortestPaths | Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
ShortType | The data type representing Short values. | Class | org.apache.spark.sql.types | Apache Spark |
|
ShuffleBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
ShuffleDataBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
ShuffleDependency | Represents a dependency on the output of a shuffle stage. | Class | org.apache.spark | Apache Spark |
|
ShuffledRDD | The resulting RDD from a shuffle (e.g. repartitioning of data). | Class | org.apache.spark.rdd | Apache Spark |
|
ShuffleIndexBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
ShuffleReadMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ShuffleReadMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ShuffleWriteMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
ShuffleWriteMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
SignalLoggerHandler | | Class | org.apache.spark.util | Apache Spark |
|
SimpleFutureAction | A FutureAction holding the result of an action that triggers a single job. | Class | org.apache.spark | Apache Spark |
|
SimpleUpdater | A simple updater for gradient descent *without* any regularization. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
SingularValueDecomposition | Represents singular value decomposition (SVD) factors. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
SizeEstimator | Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches. | Class | org.apache.spark.util | Apache Spark |
|
SnappyCompressionCodec | Snappy implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
|
SnappyOutputStreamWrapper | Wrapper over SnappyOutputStream which guards against write-after-close and double-close issues. | Class | org.apache.spark.io | Apache Spark |
|
SparkAppHandle | A handle to a running Spark application. | Interface | org.apache.spark.launcher | Apache Spark |
|
SparkAppHandle .Listener | Listener for updates to a handle's state. | Interface | org.apache.spark.launcher.SparkAppHandle | Apache Spark |
|
SparkAppHandle .State | Represents the application's state. | Class | org.apache.spark.launcher.SparkAppHandle | Apache Spark |
|
SparkConf | Configuration for a Spark application. | Class | org.apache.spark | Apache Spark |
|
SparkContext | Main entry point for Spark functionality. | Class | org.apache.spark | Apache Spark |
|
SparkContext .DoubleAccumulatorParam$ | | Class | org.apache.spark.SparkContext | Apache Spark |
|
SparkContext .FloatAccumulatorParam$ | | Class | org.apache.spark.SparkContext | Apache Spark |
|
SparkContext .IntAccumulatorParam$ | | Class | org.apache.spark.SparkContext | Apache Spark |
|
SparkContext .LongAccumulatorParam$ | | Class | org.apache.spark.SparkContext | Apache Spark |
|
SparkEnv | Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc. | Class | org.apache.spark | Apache Spark |
|
SparkException | | Class | org.apache.spark | Apache Spark |
|
SparkFiles | Resolves paths to files added through SparkContext.addFile(). | Class | org.apache.spark | Apache Spark |
|
SparkFirehoseListener | Class that allows users to receive all SparkListener events. | Class | org.apache.spark | Apache Spark |
|
SparkFlumeEvent | A wrapper class for AvroFlumeEvent's with a custom serialization format. | Class | org.apache.spark.streaming.flume | Apache Spark |
|
SparkJobInfo | Exposes information about Spark Jobs. | Interface | org.apache.spark | Apache Spark |
|
SparkJobInfoImpl | | Class | org.apache.spark | Apache Spark |
|
SparkLauncher | Launcher for Spark applications. | Class | org.apache.spark.launcher | Apache Spark |
|
SparkListener | Interface for listening to events from the Spark scheduler. | Interface | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerApplicationEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerApplicationStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerBlockManagerAdded | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerBlockManagerRemoved | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerBlockUpdated | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerEnvironmentUpdate | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerExecutorAdded | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerExecutorMetricsUpdate | Periodic updates from executors. | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerExecutorRemoved | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerJobEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerJobStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerStageCompleted | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerStageSubmitted | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerTaskEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerTaskGettingResult | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerTaskStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkListenerUnpersistRDD | | Class | org.apache.spark.scheduler | Apache Spark |
|
SparkMasterRegex | A collection of regexes for extracting information from the master string. | Class | org.apache.spark | Apache Spark |
|
SparkShutdownHook | | Class | org.apache.spark.util | Apache Spark |
|
SparkStageInfo | Exposes information about Spark Stages. | Interface | org.apache.spark | Apache Spark |
|
SparkStageInfoImpl | | Class | org.apache.spark | Apache Spark |
|
SparkStatusTracker | Low-level status reporting APIs for monitoring job and stage progress. | Class | org.apache.spark | Apache Spark |
|
SparseMatrix | Column-major sparse matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
SparseVector | A sparse vector represented by an index array and a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
SpecialLengths | | Class | org.apache.spark.api.r | Apache Spark |
|
SpillListener | A SparkListener that detects whether spills have occurred in Spark jobs. | Class | org.apache.spark | Apache Spark |
|
Split | Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path. | Interface | org.apache.spark.ml.tree | Apache Spark |
|
Split | Split applied to a feature. Parameter feature: the feature index. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
SplitInfo | | Class | org.apache.spark.scheduler | Apache Spark |
|
SQLContext | The entry point for working with structured data (rows and columns) in Spark. | Class | org.apache.spark.sql | Apache Spark |
|
SQLImplicits | A collection of implicit methods for converting common Scala objects into DataFrames. | Class | org.apache.spark.sql | Apache Spark |
|
SQLTransformer | Implements the transformations that are defined by a SQL statement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
SQLUserDefinedType | A user-defined type which can be automatically recognized by a SQLContext and registered. | Class | org.apache.spark.sql.types | Apache Spark |
|
SquaredError | Class for squared error loss calculation. | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
SquaredL2Updater | Updater for L2 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
StageData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
StageInfo | Stores information about a stage to pass from the scheduler to SparkListeners. | Class | org.apache.spark.scheduler | Apache Spark |
|
StageStatus | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
StandardNormalGenerator | Generates i.i.d. samples from the standard normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
StandardScaler | Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StandardScaler | Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
StandardScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
StandardScalerModel | Represents a StandardScaler model that can transform vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
StatCounter | A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way. | Class | org.apache.spark.util | Apache Spark |
|
State | Abstract class for getting and updating the state in the mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). | Class | org.apache.spark.streaming | Apache Spark |
|
StateSpec | Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). | Class | org.apache.spark.streaming | Apache Spark |
|
Statistics | | Class | org.apache.spark.mllib.stat | Apache Spark |
|
Statistics | Statistics for querying the supervisor about state of workers. | Class | org.apache.spark.streaming.receiver | Apache Spark |
|
StatsReportListener | | Class | org.apache.spark.scheduler | Apache Spark |
|
StatsReportListener | A simple StreamingListener that logs summary statistics across Spark Streaming batches. Parameter numBatchInfos: number of last batches to consider for generating statistics (default: 10). | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StatusUpdate | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
StopCoordinator | | Class | org.apache.spark.scheduler | Apache Spark |
|
StopExecutor | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
StopWordsRemover | A feature transformer that filters out stop words from input. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StorageLevel | Flags for controlling the storage of an RDD. | Class | org.apache.spark.storage | Apache Spark |
|
StorageLevels | Expose some commonly useful storage level constants. | Class | org.apache.spark.api.java | Apache Spark |
|
StorageListener | A SparkListener that prepares information to be displayed on the BlockManagerUI. | Class | org.apache.spark.ui.storage | Apache Spark |
|
StorageStatus | Storage information for each BlockManager. | Class | org.apache.spark.storage | Apache Spark |
|
StorageStatusListener | A SparkListener that maintains executor storage status. | Class | org.apache.spark.storage | Apache Spark |
|
Strategy | Stores all the configuration options for tree construction. Parameter algo: the learning goal. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
StreamBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
StreamingContext | Main entry point for Spark Streaming functionality. | Class | org.apache.spark.streaming | Apache Spark |
|
StreamingContextState | Represents the state of a StreamingContext. | Class | org.apache.spark.streaming | Apache Spark |
|
StreamingKMeans | StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
StreamingKMeansModel | StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
StreamingLinearAlgorithm | StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on (possibly different) streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
StreamingLinearRegressionWithSGD | Train or predict a linear regression model on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
StreamingListener | | Interface | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerBatchCompleted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerBatchStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerBatchSubmitted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerOutputOperationCompleted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerOutputOperationStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerReceiverError | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerReceiverStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingListenerReceiverStopped | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StreamingLogisticRegressionWithSGD | Train or predict a logistic regression model on streaming data. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
StreamingTest | | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
StreamInputInfo | Track the information of input stream at specified batch time. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
StringArrayParam | Specialized version of Param[Array[String]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
StringContains | A filter that evaluates to true iff the attribute evaluates to a string that contains the string value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
StringEndsWith | A filter that evaluates to true iff the attribute evaluates to a string that ends with value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
StringIndexer | A label indexer that maps a string column of labels to an ML column of label indices. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StringIndexerModel | Model fitted by StringIndexer. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StringRRDD | An RDD that stores R objects as Array[String]. | Class | org.apache.spark.api.r | Apache Spark |
|
StringStartsWith | A filter that evaluates to true iff the attribute evaluates to a string that starts with value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
StringType | The data type representing String values. | Class | org.apache.spark.sql.types | Apache Spark |
|
StronglyConnectedComponents | Strongly connected components algorithm implementation. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
StructField | A field inside a StructType. | Class | org.apache.spark.sql.types | Apache Spark |
|
StructType | A StructType object can be constructed by StructType(fields: Seq[StructField]) | Class | org.apache.spark.sql.types | Apache Spark |
|
Success | | Class | org.apache.spark | Apache Spark |
|
SVDPlusPlus | | Class | org.apache.spark.graphx.lib | Apache Spark |
|
SVDPlusPlus .Conf | Configuration parameters for SVDPlusPlus. | Class | org.apache.spark.graphx.lib.SVDPlusPlus | Apache Spark |
|
SVMDataGenerator | Generate sample data used for SVM. | Class | org.apache.spark.mllib.util | Apache Spark |
|
SVMModel | Model for Support Vector Machines (SVMs). | Class | org.apache.spark.mllib.classification | Apache Spark |
|
SVMWithSGD | Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
TaskCommitDenied | Task requested the driver to commit, but was denied. | Class | org.apache.spark | Apache Spark |
|
TaskCompletionListener | Listener providing a callback function to invoke when a task's execution completes. | Interface | org.apache.spark.util | Apache Spark |
|
TaskContext | Contextual information about a task which can be read or mutated during execution. | Class | org.apache.spark | Apache Spark |
|
TaskData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
TaskFailedReason | Various possible reasons why a task failed. | Interface | org.apache.spark | Apache Spark |
|
TaskInfo | Information about a running task attempt inside a TaskSet. | Class | org.apache.spark.scheduler | Apache Spark |
|
TaskKilled | Task was killed intentionally and needs to be rescheduled. | Class | org.apache.spark | Apache Spark |
|
TaskKilledException | Exception thrown when a task is explicitly killed (i.e., task failure is expected). | Class | org.apache.spark | Apache Spark |
|
TaskLocality | | Class | org.apache.spark.scheduler | Apache Spark |
|
TaskMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
TaskMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
TaskResultBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
TaskResultLost | The task finished successfully, but the result was lost from the executor's block manager before it was fetched. | Class | org.apache.spark | Apache Spark |
|
TaskSorting | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
TestResult | Trait for hypothesis test results. | Interface | org.apache.spark.mllib.stat.test | Apache Spark |
|
Time | This is a simple class that represents an absolute instant of time. | Class | org.apache.spark.streaming | Apache Spark |
|
TimestampType | The data type representing java.sql.Timestamp values. | Class | org.apache.spark.sql.types | Apache Spark |
|
TimeTrackingOutputStream | Intercepts write calls and tracks total time spent writing in order to update shuffle write metrics. | Class | org.apache.spark.storage | Apache Spark |
|
Tokenizer | A tokenizer that converts the input string to lowercase and then splits it by white spaces. | Class | org.apache.spark.ml.feature | Apache Spark |
|
TorrentBroadcastFactory | A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors. | Class | org.apache.spark.broadcast | Apache Spark |
|
TrainValidationSplit | Validation for hyper-parameter tuning. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
TrainValidationSplitModel | Model from train validation split. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
Transformer | Abstract class for transformers that transform one dataset into another. | Class | org.apache.spark.ml | Apache Spark |
|
TriangleCount | Compute the number of triangles passing through each vertex. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
TripletFields | Represents a subset of the fields of an EdgeTriplet or EdgeContext. | Class | org.apache.spark.graphx | Apache Spark |
|
TwitterUtils | | Class | org.apache.spark.streaming.twitter | Apache Spark |
|
TypedColumn | A Column where an Encoder has been given for the expected input and return type. | Class | org.apache.spark.sql | Apache Spark |
|
UDF10 | A Spark SQL UDF that has 10 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF11 | A Spark SQL UDF that has 11 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF12 | A Spark SQL UDF that has 12 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF13 | A Spark SQL UDF that has 13 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF14 | A Spark SQL UDF that has 14 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF15 | A Spark SQL UDF that has 15 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF16 | A Spark SQL UDF that has 16 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF17 | A Spark SQL UDF that has 17 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF18 | A Spark SQL UDF that has 18 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF19 | A Spark SQL UDF that has 19 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF20 | A Spark SQL UDF that has 20 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF21 | A Spark SQL UDF that has 21 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF22 | A Spark SQL UDF that has 22 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF3 | A Spark SQL UDF that has 3 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF4 | A Spark SQL UDF that has 4 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF5 | A Spark SQL UDF that has 5 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF6 | A Spark SQL UDF that has 6 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF7 | A Spark SQL UDF that has 7 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF8 | A Spark SQL UDF that has 8 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDF9 | A Spark SQL UDF that has 9 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
UDFRegistration | Functions for registering user-defined functions. | Class | org.apache.spark.sql | Apache Spark |
|
UnaryTransformer | Abstract class for transformers that take one input column, apply transformation, and output the result as a new column. | Class | org.apache.spark.ml | Apache Spark |
|
UniformGenerator | Generates i.i.d. samples from U[0.0, 1.0]. | Class | org.apache.spark.mllib.random | Apache Spark |
|
UnionRDD | | Class | org.apache.spark.rdd | Apache Spark |
|
UnknownReason | We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result. | Class | org.apache.spark | Apache Spark |
|
UnresolvedAttribute | An unresolved attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
Updater | Class used to perform steps (weight update) using Gradient Descent methods. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
UserDefinedAggregateFunction | The base class for implementing user-defined aggregate functions (UDAF). | Class | org.apache.spark.sql.expressions | Apache Spark |
|
UserDefinedFunction | A user-defined function. | Class | org.apache.spark.sql | Apache Spark |
|
UserDefinedType | The data type for User Defined Types (UDTs). | Class | org.apache.spark.sql.types | Apache Spark |
|
Variance | Class for calculating variance during regression. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
Vector | Represents a numeric vector, whose index type is Int and value type is Double. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
Vector | | Class | org.apache.spark.util | Apache Spark |
|
Vector .Multiplier | | Class | org.apache.spark.util.Vector | Apache Spark |
|
Vector .VectorAccumParam$ | | Class | org.apache.spark.util.Vector | Apache Spark |
|
VectorAssembler | A feature transformer that merges multiple columns into a vector column. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorAttributeRewriter | Utility transformer that rewrites Vector attribute names via prefix replacement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorIndexer | Class for indexing categorical feature columns in a dataset of Vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorIndexerModel | Transform categorical features to use 0-based indices instead of their original values. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Vectors | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
VectorSlicer | This class takes a feature vector and outputs a new feature vector with a subarray of the original features. The subset of features can be specified with either indices (setIndices()) or names (setNames()). | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorTransformer | | Interface | org.apache.spark.mllib.feature | Apache Spark |
|
VectorUDT | User-defined type for Vector which allows easy interaction with SQL (AlphaComponent). | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
VertexRDD | Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins. | Class | org.apache.spark.graphx | Apache Spark |
|
VertexRDDImpl | | Class | org.apache.spark.graphx.impl | Apache Spark |
|
VocabWord | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
VoidFunction2 | A two-argument function that takes arguments of type T1 and T2 with no return value. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
WeibullGenerator | Generates i.i.d. samples from the Weibull distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
Window | Utility functions for defining windows in DataFrames. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
WindowSpec | A window specification that defines the partitioning, ordering, and frame boundaries. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
Word2Vec | Word2Vec trains a model of Map(String, Vector), i.e. transforms a word into a code for further natural language processing or machine learning. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Word2Vec | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Word2VecModel | Model fitted by Word2Vec. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Word2VecModel | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
WriteAheadLog | This abstract class represents a write ahead log (aka journal) that is used by Spark Streaming to save the received data (by receivers) and associated metadata to reliable storage, so that they can be recovered after driver failures. | Class | org.apache.spark.streaming.util | Apache Spark |
|
WriteAheadLogRecordHandle | This abstract class represents a handle that refers to a record written in a WriteAheadLog. It must contain all the information necessary for the record to be read and returned by an implementation of the WriteAheadLog class. | Class | org.apache.spark.streaming.util | Apache Spark |
|
ZeroMQUtils | | Class | org.apache.spark.streaming.zeromq | Apache Spark |
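
The StatCounter row above describes tracking count, mean and variance "in a numerically robust way". One common way to do that is a running (Welford-style) update plus a merge step for combining partial results from different partitions. The sketch below is plain Python and only illustrates that idea: the class name `RunningStats` and its methods are hypothetical, and nothing here is taken from Spark's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Hypothetical StatCounter-like accumulator (illustration only, not Spark's API)."""

    n: int = 0        # number of values seen
    mu: float = 0.0   # running mean
    m2: float = 0.0   # sum of squared deviations from the running mean

    def add(self, x: float) -> "RunningStats":
        # Welford update: avoids summing large squares directly.
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)
        return self

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two partial summaries, e.g. from two partitions.
        if other.n == 0:
            return self
        if self.n == 0:
            self.n, self.mu, self.m2 = other.n, other.mu, other.m2
            return self
        total = self.n + other.n
        delta = other.mu - self.mu
        self.m2 += other.m2 + delta * delta * self.n * other.n / total
        self.mu += delta * other.n / total
        self.n = total
        return self

    @property
    def variance(self) -> float:
        # Population variance; NaN when no values have been added.
        return self.m2 / self.n if self.n else float("nan")


left = RunningStats()
for value in [1.0, 2.0, 3.0]:
    left.add(value)

right = RunningStats()
for value in [4.0, 5.0]:
    right.add(value)

left.merge(right)
print(left.n, left.mu, left.variance)
```

Because `merge` needs only the three summary fields of each side, partial results can be computed independently and combined in any order, which is the property a distributed reduction relies on.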