| Name | Description | Type | Package | Framework |
| AbsoluteError | Class for absolute error loss calculation (for regression). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
| Accumulable | A data type that can be accumulated, i.e., it has a commutative and associative "add" operation, but where the result type, R, may be different from the element type being added, T. | Class | org.apache.spark | Apache Spark |
| AccumulableInfo | Information about an Accumulable modified during a task or stage. | Class | org.apache.spark.scheduler | Apache Spark |
| AccumulableInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| AccumulableParam | Helper object defining how to accumulate values of a particular type. | Interface | org.apache.spark | Apache Spark |
| Accumulator | A simpler value of Accumulable where the result type being accumulated is the same as the type of the elements being merged. | Class | org.apache.spark | Apache Spark |
| AccumulatorParam | A simpler version of AccumulableParam where the only data type you can add in is the same type as the accumulated value. | Interface | org.apache.spark | Apache Spark |
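The Accumulator entries above describe Spark's driver-side aggregation variables. A minimal sketch of their use, assuming a hypothetical input path and a "blank line" condition:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the Accumulator API: count blank lines while processing a file.
// The input path and the counting condition are hypothetical.
object AccumulatorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("acc-sketch").setMaster("local[*]"))
    val blankLines = sc.accumulator(0, "blank lines") // Accumulator[Int] with a display name
    sc.textFile("data/input.txt")
      .foreach(line => if (line.trim.isEmpty) blankLines += 1) // updated on executors
    println(s"blank lines: ${blankLines.value}") // value is only readable on the driver
    sc.stop()
  }
}
```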
| ActorHelper | A receiver trait to be mixed in with your Actor to gain access to the API for pushing received data into Spark Streaming to be processed. | Interface | org.apache.spark.streaming.receiver | Apache Spark |
| ActorSupervisorStrategy | | Class | org.apache.spark.streaming.receiver | Apache Spark |
| AFTAggregator | | Class | org.apache.spark.ml.regression | Apache Spark |
| AFTCostFun | | Class | org.apache.spark.ml.regression | Apache Spark |
| AFTSurvivalRegression | Fit a parametric survival regression model named accelerated failure time (AFT) model. | Class | org.apache.spark.ml.regression | Apache Spark |
| AFTSurvivalRegressionModel | Model produced by AFTSurvivalRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
| AggregatedDialect | AggregatedDialect can unify multiple dialects into one virtual Dialect. | Class | org.apache.spark.sql.jdbc | Apache Spark |
| AggregatingEdgeContext | | Class | org.apache.spark.graphx.impl | Apache Spark |
| Aggregator | A set of functions used to aggregate data. | Class | org.apache.spark | Apache Spark |
| Aggregator | A base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value. | Class | org.apache.spark.sql.expressions | Apache Spark |
| Algo | Enum to select the algorithm for the decision tree. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
| AlphaComponent | A new component of Spark which may have unstable APIs. | Class | org.apache.spark.annotation | Apache Spark |
| ALS | Alternating Least Squares (ALS) matrix factorization. | Class | org.apache.spark.ml.recommendation | Apache Spark |
| ALS | | Class | org.apache.spark.mllib.recommendation | Apache Spark |
| ALSModel | Model fitted by ALS. | Class | org.apache.spark.ml.recommendation | Apache Spark |
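A minimal sketch of fitting the ml ALS estimator listed above; `ratings` and its userId/movieId/rating columns are hypothetical:

```scala
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.DataFrame

// Sketch: fit an ALS matrix factorization model on explicit-feedback ratings.
def trainRecommender(ratings: DataFrame) = {
  val als = new ALS()
    .setUserCol("userId")   // integer user ids (hypothetical column)
    .setItemCol("movieId")  // integer item ids (hypothetical column)
    .setRatingCol("rating") // explicit feedback values
    .setRank(10)            // latent factor dimension
    .setMaxIter(5)
    .setRegParam(0.1)
  als.fit(ratings)          // returns an ALSModel
}
```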
| AnalysisException | Thrown when a query fails to analyze, usually because the query itself is invalid. | Class | org.apache.spark.sql | Apache Spark |
| And | A filter that evaluates to true iff both left and right evaluate to true. | Class | org.apache.spark.sql.sources | Apache Spark |
| ApplicationAttemptInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| ApplicationInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| ApplicationStatus | Enum of application statuses. | Class | org.apache.spark.status.api.v1 | Apache Spark |
| ArrayType | | Class | org.apache.spark.sql.types | Apache Spark |
| AskPermissionToCommitOutput | | Class | org.apache.spark.scheduler | Apache Spark |
| AssociationRules | Generates association rules from an RDD[FreqItemset[Item]]. | Class | org.apache.spark.mllib.fpm | Apache Spark |
| AsyncRDDActions | A set of asynchronous RDD actions available through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
| Attribute | Abstract class for ML attributes. | Class | org.apache.spark.ml.attribute | Apache Spark |
| AttributeGroup | Attributes that describe a vector ML column. | Class | org.apache.spark.ml.attribute | Apache Spark |
| AttributeType | An enum-like type for attribute types: numeric, nominal, and binary. | Class | org.apache.spark.ml.attribute | Apache Spark |
| BaseRelation | Represents a collection of tuples with a known schema. | Class | org.apache.spark.sql.sources | Apache Spark |
| BaseRRDD | | Class | org.apache.spark.api.r | Apache Spark |
| BatchInfo | Class having information on completed batches. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
| BernoulliCellSampler | A sampler based on Bernoulli trials for partitioning a data sequence. | Class | org.apache.spark.util.random | Apache Spark |
| BernoulliSampler | A sampler based on Bernoulli trials. | Class | org.apache.spark.util.random | Apache Spark |
| Binarizer | Binarize a column of continuous features given a threshold. | Class | org.apache.spark.ml.feature | Apache Spark |
| BinaryAttribute | A binary attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
| BinaryClassificationEvaluator | Evaluator for binary classification, which expects two input columns: rawPrediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
| BinaryClassificationMetrics | Evaluator for binary classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
| BinaryLogisticRegressionSummary | Binary logistic regression results for a given model. | Class | org.apache.spark.ml.classification | Apache Spark |
| BinaryLogisticRegressionTrainingSummary | Logistic regression training results. | Class | org.apache.spark.ml.classification | Apache Spark |
| BinarySample | Class that represents the group and value of a sample. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
| BinaryType | The data type representing Array[Byte] values. | Class | org.apache.spark.sql.types | Apache Spark |
| BisectingKMeans | A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| BisectingKMeansModel | Clustering model produced by BisectingKMeans. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| BlockId | Identifies a particular Block of data, usually associated with a single file. | Class | org.apache.spark.storage | Apache Spark |
| BlockManagerId | This class represents a unique identifier for a BlockManager. | Class | org.apache.spark.storage | Apache Spark |
| BlockMatrix | Represents a distributed matrix in blocks of local matrices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| BlockNotFoundException | | Class | org.apache.spark.storage | Apache Spark |
| BlockStatus | | Class | org.apache.spark.storage | Apache Spark |
| BlockUpdatedInfo | Stores information about a block status in a block manager. | Class | org.apache.spark.storage | Apache Spark |
| BooleanParam | Specialized version of Param[Boolean] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| BooleanType | The data type representing Boolean values. | Class | org.apache.spark.sql.types | Apache Spark |
| BoostingStrategy | Configuration options for GradientBoostedTrees. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
| BoundedDouble | A Double value with error bars and associated confidence. | Class | org.apache.spark.partial | Apache Spark |
| Broadcast | A broadcast variable. | Class | org.apache.spark.broadcast | Apache Spark |
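A minimal sketch of the Broadcast variable listed above; the lookup table and the RDD contents are hypothetical:

```scala
import org.apache.spark.SparkContext

// Sketch: ship a read-only lookup table to executors once, instead of per task.
def enrich(sc: SparkContext): Unit = {
  val countryNames = sc.broadcast(Map("DE" -> "Germany", "FR" -> "France"))
  val codes = sc.parallelize(Seq("DE", "FR", "DE"))
  // Executors read the broadcast value; the map is not re-serialized per task.
  val names = codes.map(code => countryNames.value.getOrElse(code, "unknown"))
  names.collect().foreach(println)
}
```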
| BroadcastBlockId | | Class | org.apache.spark.storage | Apache Spark |
| BroadcastFactory | An interface for all the broadcast implementations in Spark (to allow multiple broadcast implementations). | Interface | org.apache.spark.broadcast | Apache Spark |
| Broker | Represents the host and port info for a Kafka broker. | Class | org.apache.spark.streaming.kafka | Apache Spark |
| Bucketizer | Bucketizer maps a column of continuous features to a column of feature buckets. | Class | org.apache.spark.ml.feature | Apache Spark |
| BufferReleasingInputStream | Helper class that ensures a ManagedBuffer is released when the underlying InputStream is closed. | Class | org.apache.spark.storage | Apache Spark |
| ByteType | The data type representing Byte values. | Class | org.apache.spark.sql.types | Apache Spark |
| CalendarIntervalType | The data type representing calendar time intervals. | Class | org.apache.spark.sql.types | Apache Spark |
| CatalystScan | An interface for experimenting with a more direct connection to the query planner. | Interface | org.apache.spark.sql.sources | Apache Spark |
| CategoricalSplit | Split which tests a categorical feature. | Class | org.apache.spark.ml.tree | Apache Spark |
| ChiSqSelector | Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label. | Class | org.apache.spark.ml.feature | Apache Spark |
| ChiSqSelector | | Class | org.apache.spark.mllib.feature | Apache Spark |
| ChiSqSelectorModel | | Class | org.apache.spark.ml.feature | Apache Spark |
| ChiSqSelectorModel | Chi-Squared selector model. | Class | org.apache.spark.mllib.feature | Apache Spark |
| ChiSqTestResult | Object containing the test results for the chi-squared hypothesis test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
| ClassificationModel | Model produced by a Classifier. | Class | org.apache.spark.ml.classification | Apache Spark |
| ClassificationModel | Represents a classification model that predicts to which of a set of categories an example belongs. | Interface | org.apache.spark.mllib.classification | Apache Spark |
| Classifier | Single-label binary or multiclass classification. | Class | org.apache.spark.ml.classification | Apache Spark |
| CleanAccum | | Class | org.apache.spark | Apache Spark |
| CleanBroadcast | | Class | org.apache.spark | Apache Spark |
| CleanCheckpoint | | Class | org.apache.spark | Apache Spark |
| CleanRDD | | Class | org.apache.spark | Apache Spark |
| CleanShuffle | | Class | org.apache.spark | Apache Spark |
| CleanupTaskWeakReference | A WeakReference associated with a CleanupTask. | Class | org.apache.spark | Apache Spark |
| CoGroupedRDD | An RDD that cogroups its parents. | Class | org.apache.spark.rdd | Apache Spark |
| CoGroupFunction | | Interface | org.apache.spark.api.java.function | Apache Spark |
| Column | A column that will be computed based on the data in a DataFrame. | Class | org.apache.spark.sql | Apache Spark |
| ColumnName | A convenient class used for constructing schemas. | Class | org.apache.spark.sql | Apache Spark |
| ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. | Class | org.apache.spark.ml.feature | Apache Spark |
| ComplexFutureAction | A FutureAction for actions that could trigger multiple Spark jobs. | Class | org.apache.spark | Apache Spark |
| CompressionCodec | CompressionCodec allows the customization of choosing different compression implementations to be used in block storage. | Interface | org.apache.spark.io | Apache Spark |
| ConnectedComponents | Connected components algorithm. | Class | org.apache.spark.graphx.lib | Apache Spark |
| ConstantInputDStream | An input stream that always returns the same RDD on each timestep. | Class | org.apache.spark.streaming.dstream | Apache Spark |
| ContinuousSplit | Split which tests a continuous feature. | Class | org.apache.spark.ml.tree | Apache Spark |
| CoordinateMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| CountVectorizer | Extracts a vocabulary from document collections and generates a CountVectorizerModel. | Class | org.apache.spark.ml.feature | Apache Spark |
| CountVectorizerModel | Converts a text document to a sparse vector of token counts. | Class | org.apache.spark.ml.feature | Apache Spark |
| CreatableRelationProvider | | Interface | org.apache.spark.sql.sources | Apache Spark |
| CrossValidator | K-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
| CrossValidatorModel | Model from k-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
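A minimal sketch of CrossValidator together with ParamGridBuilder (listed further below); `training` is a hypothetical DataFrame with "features" and "label" columns:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.DataFrame

// Sketch: k-fold cross validation over a small regularization grid.
def tune(training: DataFrame) = {
  val lr = new LogisticRegression()
  val grid = new ParamGridBuilder()
    .addGrid(lr.regParam, Array(0.1, 0.01)) // candidate regularization strengths
    .build()
  val cv = new CrossValidator()
    .setEstimator(lr)
    .setEvaluator(new BinaryClassificationEvaluator()) // uses rawPrediction/label
    .setEstimatorParamMaps(grid)
    .setNumFolds(3)
  cv.fit(training) // returns a CrossValidatorModel wrapping the best model
}
```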
| DataFrame | A distributed collection of data organized into named columns. | Class | org.apache.spark.sql | Apache Spark |
| DataFrameHolder | A container for a DataFrame, used for implicit conversions. | Class | org.apache.spark.sql | Apache Spark |
| DataFrameNaFunctions | Functionality for working with missing data in DataFrames. | Class | org.apache.spark.sql | Apache Spark |
| DataFrameReader | Interface used to load a DataFrame from external storage systems (e.g., file systems, key-value stores). | Class | org.apache.spark.sql | Apache Spark |
| DataFrameStatFunctions | Statistic functions for DataFrames. | Class | org.apache.spark.sql | Apache Spark |
| DataFrameWriter | Interface used to write a DataFrame to external storage systems (e.g., file systems, key-value stores). | Class | org.apache.spark.sql | Apache Spark |
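A minimal sketch of DataFrameReader and DataFrameWriter round-tripping data; the paths are hypothetical:

```scala
import org.apache.spark.sql.SQLContext

// Sketch: read JSON into a DataFrame, then write it back out as Parquet.
def roundTrip(sqlContext: SQLContext): Unit = {
  val df = sqlContext.read
    .format("json")
    .load("data/events.json")       // DataFrameReader: schema inferred from JSON
  df.write
    .mode("overwrite")              // replace any existing output
    .parquet("data/events.parquet") // DataFrameWriter: columnar output
}
```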
| Dataset | A Dataset is a strongly typed collection of objects that can be transformed in parallel using functional or relational operations. | Class | org.apache.spark.sql | Apache Spark |
| DatasetHolder | A container for a Dataset, used for implicit conversions. | Class | org.apache.spark.sql | Apache Spark |
| DataSourceRegister | Data sources should implement this trait so that they can register an alias to their data source. | Interface | org.apache.spark.sql.sources | Apache Spark |
| DataType | The base type of all Spark SQL data types. | Class | org.apache.spark.sql.types | Apache Spark |
| DataTypes | To get/create specific data type, users should use singleton objects and factory methods provided by this class. | Class | org.apache.spark.sql.types | Apache Spark |
| DataValidators | A collection of methods used to validate data before applying ML algorithms. | Class | org.apache.spark.mllib.util | Apache Spark |
| DateType | A date type, supporting "0001-01-01" through "9999-12-31". | Class | org.apache.spark.sql.types | Apache Spark |
| DB2Dialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
| DCT | A feature transformer that takes the 1D discrete cosine transform of a real vector. | Class | org.apache.spark.ml.feature | Apache Spark |
| Decimal | A mutable implementation of BigDecimal that can hold a Long if values are small enough. | Class | org.apache.spark.sql.types | Apache Spark |
| DecimalType | | Class | org.apache.spark.sql.types | Apache Spark |
| DecisionTree | A class which implements a decision tree learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
| DecisionTreeClassificationModel | Decision tree model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
| DecisionTreeClassifier | Decision tree learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
| DecisionTreeModel | Decision tree model for classification or regression. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
| DecisionTreeRegressionModel | Decision tree model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
| DecisionTreeRegressor | Decision tree learning algorithm for regression. It supports both continuous and categorical features. | Class | org.apache.spark.ml.regression | Apache Spark |
| DefaultSource | The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame. | Class | org.apache.spark.ml.source.libsvm | Apache Spark |
| DenseMatrix | Column-major dense matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
| DenseVector | A dense vector represented by a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
| Dependency | Base class for dependencies. | Class | org.apache.spark | Apache Spark |
| DerbyDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
| DeserializationStream | A stream for reading serialized objects. | Class | org.apache.spark.serializer | Apache Spark |
| DeveloperApi | A lower-level, unstable API intended for developers. | Class | org.apache.spark.annotation | Apache Spark |
| DistributedLDAModel | Distributed model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
| DistributedLDAModel | | Class | org.apache.spark.mllib.clustering | Apache Spark |
| DistributedMatrix | Represents a distributively stored matrix backed by one or more RDDs. | Interface | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| DoubleArrayParam | Specialized version of Param[Array[Double]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| DoubleFlatMapFunction | A function that returns zero or more records of type Double from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
| DoubleFunction | A function that returns Doubles, and can be used to construct DoubleRDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
| DoubleParam | Specialized version of Param[Double] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| DoubleRDDFunctions | Extra functions available on RDDs of Doubles through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
| DoubleType | The data type representing Double values. | Class | org.apache.spark.sql.types | Apache Spark |
| DStream | A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous sequence of RDDs (of the same type) representing a continuous stream of data. | Class | org.apache.spark.streaming.dstream | Apache Spark |
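A minimal sketch of the DStream abstraction: a streaming word count over a socket source; the host and port are hypothetical:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: each 1-second batch of lines becomes one RDD in the DStream.
object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dstream-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999) // hypothetical source
    lines.flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _) // per-batch counts
      .print()
    ssc.start()
    ssc.awaitTermination()
  }
}
```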
| DummySerializerInstance | Unfortunately, we need a serializer instance in order to construct a DiskBlockObjectWriter. | Class | org.apache.spark.serializer | Apache Spark |
| Duration | | Class | org.apache.spark.streaming | Apache Spark |
| Durations | | Class | org.apache.spark.streaming | Apache Spark |
| Edge | A single directed edge consisting of a source id, target id, and the data associated with the edge. | Class | org.apache.spark.graphx | Apache Spark |
| EdgeActiveness | Criteria for filtering edges based on activeness. | Class | org.apache.spark.graphx.impl | Apache Spark |
| EdgeContext | Represents an edge along with its neighboring vertices and allows sending messages along the edge. | Class | org.apache.spark.graphx | Apache Spark |
| EdgeDirection | The direction of a directed edge relative to a vertex. | Class | org.apache.spark.graphx | Apache Spark |
| EdgeRDD | EdgeRDD[ED, VD] extends RDD[Edge[ED]] by storing the edges in columnar format on each partition for performance. | Class | org.apache.spark.graphx | Apache Spark |
| EdgeRDDImpl | | Class | org.apache.spark.graphx.impl | Apache Spark |
| EdgeTriplet | An edge triplet represents an edge along with the vertex attributes of its neighboring vertices. | Class | org.apache.spark.graphx | Apache Spark |
| ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided weight vector. | Class | org.apache.spark.ml.feature | Apache Spark |
| ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided weight vector. | Class | org.apache.spark.mllib.feature | Apache Spark |
| EMLDAOptimizer | Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| Encoder | Used to convert a JVM object of type T to and from the internal Spark SQL representation. | Interface | org.apache.spark.sql | Apache Spark |
| Encoders | Methods for creating an Encoder. | Class | org.apache.spark.sql | Apache Spark |
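A minimal sketch of the Dataset/Encoder pairing from this listing; the Person case class and sample data are hypothetical:

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical record type; encoders for case classes come from the implicits.
case class Person(name: String, age: Int)

// Sketch: build a typed Dataset and filter it with ordinary Scala code.
def datasetDemo(sqlContext: SQLContext): Unit = {
  import sqlContext.implicits._ // brings in encoders and toDS()
  val people = Seq(Person("Ada", 36), Person("Linus", 12)).toDS()
  val adults = people.filter(_.age >= 18) // typed: compiles against Person
  adults.show()
}
```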
| Entropy | Class for calculating entropy during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
| EnumUtil | | Class | org.apache.spark.util | Apache Spark |
| EnvironmentListener | | Class | org.apache.spark.ui.env | Apache Spark |
| EqualNullSafe | Performs equality comparison, similar to EqualTo. | Class | org.apache.spark.sql.sources | Apache Spark |
| EqualTo | A filter that evaluates to true iff the attribute evaluates to a value equal to the given value. | Class | org.apache.spark.sql.sources | Apache Spark |
| Estimator | Abstract class for estimators that fit models to data. | Class | org.apache.spark.ml | Apache Spark |
| Evaluator | Abstract class for evaluators that compute metrics from predictions. | Class | org.apache.spark.ml.evaluation | Apache Spark |
| ExceptionFailure | Task failed due to a runtime exception. | Class | org.apache.spark | Apache Spark |
| ExecutionListenerManager | Manager for QueryExecutionListener. | Class | org.apache.spark.sql.util | Apache Spark |
| ExecutorInfo | Stores information about an executor to pass from the scheduler to SparkListeners. | Class | org.apache.spark.scheduler.cluster | Apache Spark |
| ExecutorLostFailure | The task failed because the executor that it was running on was lost. | Class | org.apache.spark | Apache Spark |
| ExecutorRegistered | | Class | org.apache.spark | Apache Spark |
| ExecutorRemoved | | Class | org.apache.spark | Apache Spark |
| ExecutorsListener | | Class | org.apache.spark.ui.exec | Apache Spark |
| ExecutorStageSummary | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| ExecutorSummary | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| ExpectationSum | | Class | org.apache.spark.mllib.clustering | Apache Spark |
| Experimental | An experimental user-facing API. | Class | org.apache.spark.annotation | Apache Spark |
| ExperimentalMethods | Holder for experimental methods for the bravest. | Class | org.apache.spark.sql | Apache Spark |
| ExponentialGenerator | Generates i.i.d. samples from the exponential distribution with the given mean. | Class | org.apache.spark.mllib.random | Apache Spark |
| FeatureType | Enum to describe whether a feature is "continuous" or "categorical". | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
| FetchFailed | Task failed to fetch shuffle data from a remote node. | Class | org.apache.spark | Apache Spark |
| Filter | A filter predicate for data sources. | Class | org.apache.spark.sql.sources | Apache Spark |
| FilterFunction | Base interface for a function used in Dataset's filter function. | Interface | org.apache.spark.api.java.function | Apache Spark |
| FlatMapFunction | A function that returns zero or more output records from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
| FlatMapFunction2 | A function that takes two inputs and returns zero or more output records. | Interface | org.apache.spark.api.java.function | Apache Spark |
| FlatMapGroupsFunction | A function that returns zero or more output records from each grouping key and its values. | Interface | org.apache.spark.api.java.function | Apache Spark |
| FloatParam | Specialized version of Param[Float] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| FloatType | The data type representing Float values. | Class | org.apache.spark.sql.types | Apache Spark |
| FlumeUtils | | Class | org.apache.spark.streaming.flume | Apache Spark |
| ForeachFunction | Base interface for a function used in Dataset's foreach function. | Interface | org.apache.spark.api.java.function | Apache Spark |
| ForeachPartitionFunction | Base interface for a function used in Dataset's foreachPartition function. | Interface | org.apache.spark.api.java.function | Apache Spark |
| FPGrowth | A parallel FP-growth algorithm to mine frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
| FPGrowthModel | Model trained by FPGrowth, which holds frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
| Function | Base interface for functions whose return types do not create special RDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
| Function2 | A two-argument function that takes arguments of type T1 and T2 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
| Function3 | A three-argument function that takes arguments of type T1, T2 and T3 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
| Function4 | A four-argument function that takes arguments of type T1, T2, T3 and T4 and returns an R. | Interface | org.apache.spark.api.java.function | Apache Spark |
| functions | | Class | org.apache.spark.sql | Apache Spark |
| FutureAction | A future for the result of an action to support cancellation. | Interface | org.apache.spark | Apache Spark |
| GammaGenerator | Generates i.i.d. samples from the gamma distribution with the given shape and scale. | Class | org.apache.spark.mllib.random | Apache Spark |
| GaussianMixture | This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). | Class | org.apache.spark.mllib.clustering | Apache Spark |
| GaussianMixtureModel | Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i = 1..k with probability w(i). | Class | org.apache.spark.mllib.clustering | Apache Spark |
| GBTClassificationModel | Gradient-Boosted Trees (GBTs) model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
| GBTClassifier | Gradient-Boosted Trees (GBTs) learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
| GBTRegressionModel | Gradient-Boosted Trees (GBTs) model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
| GBTRegressor | Gradient-Boosted Trees (GBTs) learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
| GeneralizedLinearAlgorithm | GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM). | Class | org.apache.spark.mllib.regression | Apache Spark |
| GeneralizedLinearModel | GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm. | Class | org.apache.spark.mllib.regression | Apache Spark |
| Gini | Class for calculating the Gini impurity during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
| Gradient | Class used to compute the gradient for a loss function, given a single data point. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| GradientBoostedTrees | A class that implements Stochastic Gradient Boosting for regression and binary classification. | Class | org.apache.spark.mllib.tree | Apache Spark |
| GradientBoostedTreesModel | Represents a gradient boosted trees model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
| GradientDescent | Class used to solve an optimization problem using Gradient Descent. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| Graph | The Graph abstractly represents a graph with arbitrary objects associated with vertices and edges. | Class | org.apache.spark.graphx | Apache Spark |
| GraphGenerators | A collection of graph generating functions. | Class | org.apache.spark.graphx.util | Apache Spark |
| GraphImpl | An implementation of Graph to support computation on graphs. | Class | org.apache.spark.graphx.impl | Apache Spark |
| GraphKryoRegistrator | Registers GraphX classes with Kryo for improved performance. | Class | org.apache.spark.graphx | Apache Spark |
| GraphLoader | Provides utilities for loading Graphs from files. | Class | org.apache.spark.graphx | Apache Spark |
| GraphOps | Contains additional functionality for Graph. | Class | org.apache.spark.graphx | Apache Spark |
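A minimal sketch of the GraphX types above (Graph, Edge, and the PageRank exposed through GraphOps); the tiny follower graph is hypothetical:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph}

// Sketch: build a three-vertex graph and rank vertices with PageRank.
def pageRankDemo(sc: SparkContext): Unit = {
  val vertices = sc.parallelize(Seq((1L, "alice"), (2L, "bob"), (3L, "carol")))
  val edges = sc.parallelize(Seq(Edge(1L, 2L, "follows"), Edge(3L, 2L, "follows")))
  val graph = Graph(vertices, edges)
  val ranks = graph.pageRank(tol = 0.001).vertices // iterate until convergence
  ranks.join(vertices).collect().foreach { case (_, (rank, name)) =>
    println(s"$name: $rank")
  }
}
```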
| GraphXUtils | | Class | org.apache.spark.graphx | Apache Spark |
| GreaterThan | A filter that evaluates to true iff the attribute evaluates to a value greater than value. | Class | org.apache.spark.sql.sources | Apache Spark |
| GreaterThanOrEqual | A filter that evaluates to true iff the attribute evaluates to a value greater than or equal to value. | Class | org.apache.spark.sql.sources | Apache Spark |
| GroupedData | A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy. | Class | org.apache.spark.sql | Apache Spark |
| GroupedDataset | A Dataset that has been logically grouped by a user-specified grouping key. | Class | org.apache.spark.sql | Apache Spark |
| HadoopFsRelation | A BaseRelation that provides much of the common code required for relations that store their data to an HDFS compatible filesystem. | Class | org.apache.spark.sql.sources | Apache Spark |
| HadoopFsRelationProvider | Implemented by objects that produce relations for a specific kind of data source with a given schema and partitioned columns. | Interface | org.apache.spark.sql.sources | Apache Spark |
| HadoopRDD | An RDD that provides core functionality for reading data stored in Hadoop (e.g., files in HDFS, sources in HBase, or S3), using the older MapReduce API. | Class | org.apache.spark.rdd | Apache Spark |
| HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.ml.feature | Apache Spark |
| HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.mllib.feature | Apache Spark |
| HashPartitioner | A Partitioner that implements hash-based partitioning using Java's Object.hashCode. | Class | org.apache.spark | Apache Spark |
| HasOffsetRanges | Represents any object that has a collection of OffsetRanges. | Interface | org.apache.spark.streaming.kafka | Apache Spark |
| HingeGradient | Compute gradient and loss for a Hinge loss function, as used in SVM binary classification. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| HiveContext | An instance of the Spark SQL execution engine that integrates with data stored in Hive. | Class | org.apache.spark.sql.hive | Apache Spark |
| HttpBroadcastFactory | A BroadcastFactory implementation that uses an HTTP server as the broadcast mechanism. | Class | org.apache.spark.broadcast | Apache Spark |
| Identifiable | Trait for an object with an immutable unique ID that identifies itself and its derivatives. | Interface | org.apache.spark.ml.util | Apache Spark |
| IDF | Compute the Inverse Document Frequency (IDF) given a collection of documents. | Class | org.apache.spark.ml.feature | Apache Spark |
| IDF | Inverse document frequency (IDF). | Class | org.apache.spark.mllib.feature | Apache Spark |
| IDFModel | | Class | org.apache.spark.ml.feature | Apache Spark |
| IDFModel | Represents an IDF model that can transform term frequency vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
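A minimal sketch chaining the ml HashingTF and IDF transformers from this listing (together with the standard Tokenizer from the same feature package); `docs` is a hypothetical DataFrame with a "text" column:

```scala
import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}
import org.apache.spark.sql.DataFrame

// Sketch: text -> tokens -> hashed term frequencies -> TF-IDF features.
def tfIdf(docs: DataFrame): DataFrame = {
  val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
  val words = tokenizer.transform(docs)
  val tf = new HashingTF()
    .setInputCol("words").setOutputCol("rawFeatures")
    .setNumFeatures(1 << 18) // hashing trick: fixed-width term space
  val featurized = tf.transform(words)
  val idfModel = new IDF().setInputCol("rawFeatures").setOutputCol("features").fit(featurized)
  idfModel.transform(featurized) // down-weights terms common across documents
}
```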
| Impurity | Trait for calculating information gain. | Interface | org.apache.spark.mllib.tree.impurity | Apache Spark |
| In | A filter that evaluates to true iff the attribute evaluates to one of the values in the array. | Class | org.apache.spark.sql.sources | Apache Spark |
| IndexedRow | Represents a row of IndexedRowMatrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| IndexedRowMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| IndexToString | | Class | org.apache.spark.ml.feature | Apache Spark |
| InformationGainStats | Information gain statistics for each split; its gain parameter holds the information gain value. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
| InnerClosureFinder | | Class | org.apache.spark.util | Apache Spark |
| InputDStream | This is the abstract base class for all input streams. | Class | org.apache.spark.streaming.dstream | Apache Spark |
| InputFormatInfo | Parses and holds information about inputFormat (and files) specified as a parameter. | Class | org.apache.spark.scheduler | Apache Spark |
| InputMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| InputMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| InsertableRelation | A BaseRelation that can be used to insert data into it through the insert method. | Interface | org.apache.spark.sql.sources | Apache Spark |
| IntArrayParam | Specialized version of Param[Array[Int]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| IntegerType | The data type representing Int values. | Class | org.apache.spark.sql.types | Apache Spark |
| Interaction | Implements the feature interaction transform. | Class | org.apache.spark.ml.feature | Apache Spark |
| InternalNode | Internal Decision Tree node. | Class | org.apache.spark.ml.tree | Apache Spark |
| InterruptibleIterator | An iterator that wraps around an existing iterator to provide task killing functionality. | Class | org.apache.spark | Apache Spark |
| IntParam | Specialized version of Param[Int] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| IsNotNull | A filter that evaluates to true iff the attribute evaluates to a non-null value. | Class | org.apache.spark.sql.sources | Apache Spark |
| IsNull | A filter that evaluates to true iff the attribute evaluates to null. | Class | org.apache.spark.sql.sources | Apache Spark |
| IsotonicRegression | | Class | org.apache.spark.ml.regression | Apache Spark |
| IsotonicRegression | | Class | org.apache.spark.mllib.regression | Apache Spark |
| IsotonicRegressionModel | Model fitted by IsotonicRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
| IsotonicRegressionModel | Regression model for isotonic regression. | Class | org.apache.spark.mllib.regression | Apache Spark |
| JavaDoubleRDD | | Class | org.apache.spark.api.java | Apache Spark |
| JavaDStream | A Java-friendly interface to DStream, the basic abstraction in Spark Streaming that represents a continuous stream of data. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaDStreamLike | | Interface | org.apache.spark.streaming.api.java | Apache Spark |
| JavaFutureAction | | Interface | org.apache.spark.api.java | Apache Spark |
| JavaHadoopRDD | | Class | org.apache.spark.api.java | Apache Spark |
| JavaInputDStream | A Java-friendly interface to InputDStream. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaIterableWrapperSerializer | A Kryo serializer for serializing results returned by asJavaIterable. | Class | org.apache.spark.serializer | Apache Spark |
| JavaMapWithStateDStream | DStream representing the stream of data generated by mapWithState operation on a JavaPairDStream. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaNewHadoopRDD | | Class | org.apache.spark.api.java | Apache Spark |
| JavaPairDStream | A Java-friendly interface to a DStream of key-value pairs, which provides extra methods like reduceByKey and join. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaPairInputDStream | A Java-friendly interface to InputDStream of key-value pairs. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaPairRDD | | Class | org.apache.spark.api.java | Apache Spark |
| JavaPairReceiverInputDStream | A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaParams | Java-friendly wrapper for Params. | Class | org.apache.spark.ml.param | Apache Spark |
| JavaRDD | | Class | org.apache.spark.api.java | Apache Spark |
| JavaRDDLike | Defines operations common to several Java RDD implementations. | Interface | org.apache.spark.api.java | Apache Spark |
| JavaReceiverInputDStream | A Java-friendly interface to ReceiverInputDStream, the abstract class for defining any input stream that receives data over the network. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JavaSerializer | A Spark serializer that uses Java's built-in serialization. | Class | org.apache.spark.serializer | Apache Spark |
| JavaSparkContext | A Java-friendly version of SparkContext that returns JavaRDDs and works with Java collections instead of Scala ones. | Class | org.apache.spark.api.java | Apache Spark |
| JavaSparkListener | Java clients should extend this class instead of implementing SparkListener directly. | Class | org.apache.spark | Apache Spark |
| JavaSparkStatusTracker | Low-level status reporting APIs for monitoring job and stage progress. | Class | org.apache.spark.api.java | Apache Spark |
| JavaStreamingContext | A Java-friendly version of StreamingContext which is the main entry point for Spark Streaming functionality. | Class | org.apache.spark.streaming.api.java | Apache Spark |
| JdbcDialect | Encapsulates everything (extensions, workarounds, quirks) to handle the SQL dialect of a certain database or jdbc driver. | Class | org.apache.spark.sql.jdbc | Apache Spark |
| JdbcDialects | Registry of dialects that apply to every new jdbc DataFrame. | Class | org.apache.spark.sql.jdbc | Apache Spark |
| JdbcRDD | An RDD that executes an SQL query on a JDBC connection and reads results. | Class | org.apache.spark.rdd | Apache Spark |
| JdbcType | A database type definition coupled with the jdbc type needed to send null values to the database. | Class | org.apache.spark.sql.jdbc | Apache Spark |
| JobData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
| JobExecutionStatus | Enum of job execution statuses. | Class | org.apache.spark | Apache Spark |
| JobLogger | A logger class to record runtime information for jobs in Spark. | Class | org.apache.spark.scheduler | Apache Spark |
| JobProgressListener | Tracks task-level information to be displayed in the UI. | Class | org.apache.spark.ui.jobs | Apache Spark |
| JobSucceeded | | Class | org.apache.spark.scheduler | Apache Spark |
| KafkaUtils | | Class | org.apache.spark.streaming.kafka | Apache Spark |
| KernelDensity | Kernel density estimation. | Class | org.apache.spark.mllib.stat | Apache Spark |
| KillTask | | Class | org.apache.spark.scheduler.local | Apache Spark |
| KinesisUtils | | Class | org.apache.spark.streaming.kinesis | Apache Spark |
| KinesisUtilsPythonHelper | This is a helper class that wraps the methods in KinesisUtils into more Python-friendly class and function so that it can be easily instantiated and called from Python's KinesisUtils. | Class | org.apache.spark.streaming.kinesis | Apache Spark |
| KMeans | | Class | org.apache.spark.ml.clustering | Apache Spark |
| KMeans | K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means|| algorithm by Bahmani et al.). | Class | org.apache.spark.mllib.clustering | Apache Spark |
| KMeansDataGenerator | Generate test data for KMeans. | Class | org.apache.spark.mllib.util | Apache Spark |
| KMeansModel | Model fitted by KMeans. | Class | org.apache.spark.ml.clustering | Apache Spark |
| KMeansModel | A clustering model for K-means. | Class | org.apache.spark.mllib.clustering | Apache Spark |
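A minimal sketch of the mllib KMeans entry above; the toy points are hypothetical:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Sketch: cluster four 2-D points into two groups.
def cluster(sc: SparkContext): Unit = {
  val points = sc.parallelize(Seq(
    Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1), // cluster near the origin
    Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.2)  // cluster far away
  )).cache() // k-means iterates over the data, so caching helps
  val model = KMeans.train(points, k = 2, maxIterations = 20)
  model.clusterCenters.foreach(println)
  println(s"cost: ${model.computeCost(points)}") // sum of squared distances
}
```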
| KolmogorovSmirnovTestResult | Object containing the test results for the Kolmogorov-Smirnov test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
| KryoRegistrator | | Interface | org.apache.spark.serializer | Apache Spark |
| KryoSerializer | A Spark serializer that uses the Kryo serialization library. | Class | org.apache.spark.serializer | Apache Spark |
| L1Updater | Updater for L1 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| LabelConverter | Label to vector converter. | Class | org.apache.spark.ml.classification | Apache Spark |
| LabeledPoint | Class that represents the features and labels of a data point. | Class | org.apache.spark.mllib.regression | Apache Spark |
| LabelPropagation | Label Propagation algorithm. | Class | org.apache.spark.graphx.lib | Apache Spark |
| LassoModel | Regression model trained using Lasso. | Class | org.apache.spark.mllib.regression | Apache Spark |
| LassoWithSGD | Train a regression model with L1-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
| LBFGS | Class used to solve an optimization problem using Limited-memory BFGS. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.ml.clustering | Apache Spark |
| LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| LDAModel | Model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
| LDAModel | Latent Dirichlet Allocation (LDA) model. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| LDAOptimizer | An LDAOptimizer specifies which optimization/learning/inference algorithm to use, and it can hold optimizer-specific parameters for users to set. | Interface | org.apache.spark.mllib.clustering | Apache Spark |
| LeafNode | Decision tree leaf node. | Class | org.apache.spark.ml.tree | Apache Spark |
| LeastSquaresAggregator | LeastSquaresAggregator computes the gradient and loss for a Least-squared loss function, as used in linear regression for samples in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.regression | Apache Spark |
| LeastSquaresCostFun | LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost. | Class | org.apache.spark.ml.regression | Apache Spark |
| LeastSquaresGradient | Compute gradient and loss for a Least-squared loss function, as used in linear regression. | Class | org.apache.spark.mllib.optimization | Apache Spark |
| LessThan | A filter that evaluates to true iff the attribute evaluates to a value less than the given value. | Class | org.apache.spark.sql.sources | Apache Spark |
| LessThanOrEqual | A filter that evaluates to true iff the attribute evaluates to a value less than or equal to value. | Class | org.apache.spark.sql.sources | Apache Spark |
| LinearDataGenerator | Generate sample data used for Linear Data. | Class | org.apache.spark.mllib.util | Apache Spark |
| LinearRegression | Linear regression. The learning objective is to minimize the squared error, with regularization. | Class | org.apache.spark.ml.regression | Apache Spark |
| LinearRegressionModel | Model produced by LinearRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
| LinearRegressionModel | Regression model trained using LinearRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
| LinearRegressionSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
| LinearRegressionTrainingSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
| LinearRegressionWithSGD | Train a linear regression model with no regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
| Loader | Trait for classes which can load models and transformers from files. | Interface | org.apache.spark.mllib.util | Apache Spark |
| LocalLDAModel | Local (non-distributed) model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
| LocalLDAModel | This model stores only the inferred topics. | Class | org.apache.spark.mllib.clustering | Apache Spark |
| Logging | Utility trait for classes that want to log data. | Interface | org.apache.spark | Apache Spark |
| LogisticAggregator | LogisticAggregator computes the gradient and loss for binary logistic loss function, as used in binary classification for instances in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.classification | Apache Spark |
| LogisticCostFun | LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.ml.classification | Apache Spark |
| LogisticGradient | Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.mllib.optimization | Apache Spark |
| LogisticRegression | Logistic regression. | Class | org.apache.spark.ml.classification | Apache Spark |
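A minimal sketch of the ml LogisticRegression estimator listed above; `training` is a hypothetical DataFrame with "label" and "features" columns:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.sql.DataFrame

// Sketch: fit a regularized binary logistic regression model.
def trainClassifier(training: DataFrame) = {
  val lr = new LogisticRegression()
    .setMaxIter(10)    // optimizer iterations
    .setRegParam(0.01) // regularization strength
  val model = lr.fit(training)
  println(s"coefficients: ${model.coefficients}, intercept: ${model.intercept}")
  model
}
```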
| LogisticRegressionDataGenerator | Generate test data for LogisticRegression. | Class | org.apache.spark.mllib.util | Apache Spark |
| LogisticRegressionModel | Model produced by LogisticRegression. | Class | org.apache.spark.ml.classification | Apache Spark |
| LogisticRegressionModel | Classification model trained using Multinomial/Binary Logistic Regression. | Class | org.apache.spark.mllib.classification | Apache Spark |
| LogisticRegressionSummary | Abstraction for Logistic Regression Results for a given model. | Interface | org.apache.spark.ml.classification | Apache Spark |
| LogisticRegressionTrainingSummary | Abstraction for multinomial Logistic Regression Training results. | Interface | org.apache.spark.ml.classification | Apache Spark |
| LogisticRegressionWithLBFGS | Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS. | Class | org.apache.spark.mllib.classification | Apache Spark |
| LogisticRegressionWithSGD | Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
| LogLoss | Class for log loss calculation (for classification). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
| LogNormalGenerator | Generates i.i.d. samples from the log normal distribution with the given mean and standard deviation. | Class | org.apache.spark.mllib.random | Apache Spark |
| LongParam | Specialized version of Param[Long] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
| LongType | The data type representing Long values. | Class | org.apache.spark.sql.types | Apache Spark |
| Loss | Trait for adding "pluggable" loss functions for the gradient boosting algorithm. | Interface | org.apache.spark.mllib.tree.loss | Apache Spark |
| Losses | | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
| LZ4CompressionCodec | LZ4 implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
| LZFCompressionCodec | LZF implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
| MapFunction | Base interface for a map function used in Dataset's map function. | Interface | org.apache.spark.api.java.function | Apache Spark |
| MapGroupsFunction | Base interface for a map function used in GroupedDataset's mapGroup function. | Interface | org.apache.spark.api.java.function | Apache Spark |
| MapPartitionsFunction | Base interface for function used in Dataset's mapPartitions. | Interface | org.apache.spark.api.java.function | Apache Spark |
| MapType | The data type for Maps. | Class | org.apache.spark.sql.types | Apache Spark |
| MapWithStateDStream | DStream representing the stream of data generated by the mapWithState operation on a pair DStream. Additionally, it gives access to the stream of state snapshots, that is, the state data of all keys after each batch update. | Class | org.apache.spark.streaming.dstream | Apache Spark |
| Matrices | Factory methods for Matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
| Matrix | Trait for a local matrix. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
| MatrixEntry | Represents an entry in a distributed matrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
| MatrixFactorizationModel | Model representing the result of matrix factorization. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
| MemoryEntry | | Class | org.apache.spark.storage | Apache Spark |
| Metadata | Metadata is a wrapper over Map[String, Any] that limits the value type to simple ones: Boolean, Long, Double, String, Metadata, Array[Boolean], Array[Long], Array[Double], Array[String], and Array[Metadata]. | Class | org.apache.spark.sql.types | Apache Spark |
| MetadataBuilder | Builder for Metadata. | Class | org.apache.spark.sql.types | Apache Spark |
| MethodIdentifier | Helper class to identify a method. | Class | org.apache.spark.util | Apache Spark |
| MFDataGenerator | Generate RDD(s) containing data for Matrix Factorization. | Class | org.apache.spark.mllib.util | Apache Spark |
| Milliseconds | Helper object that creates instance of Duration representing a given number of milliseconds. | Class | org.apache.spark.streaming | Apache Spark |
| MinMaxScaler | Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling. | Class | org.apache.spark.ml.feature | Apache Spark |
| MinMaxScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
| Minutes | Helper object that creates instance of Duration representing a given number of minutes. | Class | org.apache.spark.streaming | Apache Spark |
| MLPairRDDFunctions | Machine learning specific Pair RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
| MLReadable | Trait for objects that provide MLReader. | Interface | org.apache.spark.ml.util | Apache Spark |
| MLReader | Abstract class for utility classes that can load ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
| MLUtils | Helper methods to load, save and pre-process data used in MLlib. | Class | org.apache.spark.mllib.util | Apache Spark |
| MLWritable | Trait for classes that provide MLWriter. | Interface | org.apache.spark.ml.util | Apache Spark |
| MLWriter | Abstract class for utility classes that can save ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
| Model | A fitted model, i.e., a Transformer produced by an Estimator. | Class | org.apache.spark.ml | Apache Spark |
| MQTTUtils | | Class | org.apache.spark.streaming.mqtt | Apache Spark |
| MsSqlServerDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
| MulticlassClassificationEvaluator | Evaluator for multiclass classification, which expects two input columns: score and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
| MulticlassMetrics | Evaluator for multiclass classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
| MultilabelMetrics | Evaluator for multilabel classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
| MultilayerPerceptronClassificationModel | Classification model based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
| MultilayerPerceptronClassifier | Classifier trainer based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
| MultivariateGaussian | This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. | Class | org.apache.spark.mllib.stat.distribution | Apache Spark |
| MultivariateOnlineSummarizer | MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector format in an online fashion. | Class | org.apache.spark.mllib.stat | Apache Spark |
| MultivariateStatisticalSummary | Trait for multivariate statistical summary of a data matrix. | Interface | org.apache.spark.mllib.stat | Apache Spark |
| MutableAggregationBuffer | A Row representing a mutable aggregation buffer. | Class | org.apache.spark.sql.expressions | Apache Spark |
| MutablePair | A tuple of 2 elements. | Class | org.apache.spark.util | Apache Spark |
| MySQLDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
| NaiveBayes | Naive Bayes Classifiers. | Class | org.apache.spark.ml.classification | Apache Spark |
| NaiveBayes | | Class | org.apache.spark.mllib.classification | Apache Spark |
| NaiveBayesModel | Model produced by NaiveBayes; its pi parameter holds the log of the class priors, whose dimension is C (the number of classes). | Class | org.apache.spark.ml.classification | Apache Spark |
| NaiveBayesModel | Model for Naive Bayes Classifiers. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| NarrowDependency | Base class for dependencies where each partition of the child RDD depends on a small number of partitions of the parent RDD. | Class | org.apache.spark | Apache Spark |
|
| NewHadoopRDD | An RDD that provides core functionality for reading data stored in Hadoop (e. | Class | org.apache.spark.rdd | Apache Spark |
|
| NGram | A feature transformer that converts the input array of strings into an array of n-grams. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Node | Decision tree node interface. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| Node | Node in a decision tree. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| NominalAttribute | A nominal attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| NoopDialect | NOOP dialect object, always returning the neutral element. | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
| Normalizer | Normalize a vector to have unit norm using the given p-norm. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Normalizer | Normalizes samples individually to unit L^p^ norm For any 1 <= p < Double. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Not | A filter that evaluates to true iff child is evaluated to false. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| NullType | The data type representing NULL values. | Class | org.apache.spark.sql.types | Apache Spark |
|
| NumericAttribute | A numeric attribute with optional summary statistics. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| NumericType | Numeric data types. | Class | org.apache.spark.sql.types | Apache Spark |
|
| OffsetRange | Represents a range of offsets from a single Kafka TopicAndPartition. | Class | org.apache.spark.streaming.kafka | Apache Spark |
|
| OneHotEncoder | A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| OneToOneDependency | Represents a one-to-one dependency between partitions of the parent and child RDDs. | Class | org.apache.spark | Apache Spark |
|
| OneVsRest | Reduction of Multiclass Classification to Binary Classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| OneVsRestModel | Model produced by OneVsRest. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| OnlineLDAOptimizer | An online optimizer for LDA. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| Optimizer | Trait for optimization problem solvers. | Interface | org.apache.spark.mllib.optimization | Apache Spark |
|
| Or | A filter that evaluates to true iff at least one of left or right evaluates to true. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| OracleDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
| OrderedRDDFunctions | Extra functions available on RDDs of (key, value) pairs where the key is sortable through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
| OutputMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| OutputMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| OutputOperationInfo | Class having information on output operations. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| OutputWriter | OutputWriter is used together with HadoopFsRelation for persisting rows to the underlying file system. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| OutputWriterFactory | A factory that produces OutputWriters. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| PageRank | PageRank algorithm implementation. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
| PairDStreamFunctions | Extra functions available on DStream of (key, value) pairs through an implicit conversion. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
| PairFlatMapFunction | A function that returns zero or more key-value pair records from each input record. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
| PairFunction | A function that returns key-value pairs (Tuple2), and can be used to construct PairRDDs. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
| PairRDDFunctions | Extra functions available on RDDs of (key, value) pairs through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
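These functions become available on any RDD of pairs via an implicit conversion. A short sketch (`sc` is an existing SparkContext, as in spark-shell):

```scala
val pairs = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 2)))
val counts  = pairs.reduceByKey(_ + _)   // ("a", 3), ("b", 1)
val grouped = pairs.groupByKey()         // (String, Iterable[Int]) pairs
val joined  = counts.join(grouped)       // inner join on the key
```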
| PairwiseRRDD | Form an RDD[(Int, Array[Byte])] from key-value pairs returned from R. | Class | org.apache.spark.api.r | Apache Spark |
|
| Param | A param with self-contained documentation and optionally a default value. | Class | org.apache.spark.ml.param | Apache Spark |
|
| ParamGridBuilder | Builder for a param grid used in grid search-based model selection. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
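A sketch of building a grid over two LogisticRegression params; the builder returns the Cartesian product of all values as an Array[ParamMap]:

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.tuning.ParamGridBuilder

val lr = new LogisticRegression()
// Cartesian product: 2 x 3 = 6 ParamMaps to evaluate.
val paramGrid = new ParamGridBuilder()
  .addGrid(lr.regParam, Array(0.1, 0.01))
  .addGrid(lr.elasticNetParam, Array(0.0, 0.5, 1.0))
  .build()
```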
| ParamMap | A param to value map. | Class | org.apache.spark.ml.param | Apache Spark |
|
| ParamPair | A param and its value. | Class | org.apache.spark.ml.param | Apache Spark |
|
| Params | | Interface | org.apache.spark.ml.param | Apache Spark |
|
| ParamValidators | Factory methods for common validation functions for Param. | Class | org.apache.spark.ml.param | Apache Spark |
|
| PartialResult | | Class | org.apache.spark.partial | Apache Spark |
|
| Partition | An identifier for a partition in an RDD. | Interface | org.apache.spark | Apache Spark |
|
| PartitionCoalescer | Coalesce the partitions of a parent RDD (prev) into fewer partitions, so that each partition of this RDD computes one or more of the parent ones. | Class | org.apache.spark.rdd | Apache Spark |
|
| Partitioner | An object that defines how the elements in a key-value pair RDD are partitioned by key. | Class | org.apache.spark | Apache Spark |
|
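Custom partitioners subclass Partitioner and implement numPartitions and getPartition. A minimal sketch (the e-mail-domain keying is an illustrative assumption):

```scala
import org.apache.spark.Partitioner

// Route e-mail addresses to partitions by domain, so all keys from one
// domain land in the same partition.
class DomainPartitioner(override val numPartitions: Int) extends Partitioner {
  def getPartition(key: Any): Int = {
    val h = key.asInstanceOf[String].split("@").last.hashCode % numPartitions
    if (h < 0) h + numPartitions else h   // keep the index non-negative
  }
}
// usage: pairs.partitionBy(new DomainPartitioner(8))
```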
| PartitionGroup | | Class | org.apache.spark.rdd | Apache Spark |
|
| PartitionPruningRDD | An RDD used to prune RDD partitions so that we can avoid launching tasks on all partitions. | Class | org.apache.spark.rdd | Apache Spark |
|
| PartitionStrategy | | Interface | org.apache.spark.graphx | Apache Spark |
|
| PCA | PCA trains a model to project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PCA | A feature transformer that projects vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| PCAModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PCAModel | Model fitted by PCA that can project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Pipeline | A simple pipeline, which acts as an estimator. | Class | org.apache.spark.ml | Apache Spark |
|
| PipelineModel | Represents a fitted pipeline. | Class | org.apache.spark.ml | Apache Spark |
|
| PipelineStage | A stage in a pipeline, either an Estimator or a Transformer. | Class | org.apache.spark.ml | Apache Spark |
|
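A minimal sketch of the three entries above working together: each stage is a PipelineStage, and fit() returns a PipelineModel (the `training`/`test` DataFrames and their "text" column are hypothetical):

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer}

val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF().setInputCol("words").setOutputCol("features")
val lr = new LogisticRegression().setMaxIter(10)

// Estimator stages are fit, transformer stages are applied, in order.
val pipeline = new Pipeline().setStages(Array(tokenizer, hashingTF, lr))
val model = pipeline.fit(training)     // PipelineModel
val predictions = model.transform(test)
```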
| PMMLExportable | Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format. | Interface | org.apache.spark.mllib.pmml | Apache Spark |
|
| PoissonGenerator | Generates i.i.d. samples from the Poisson distribution with the given mean. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| PoissonSampler | A sampler for sampling with replacement, based on values drawn from Poisson distribution. | Class | org.apache.spark.util.random | Apache Spark |
|
| PolynomialExpansion | Perform feature expansion in a polynomial space. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PortableDataStream | A class that allows DataStreams to be serialized and moved around by not creating them until they need to be read. | Class | org.apache.spark.input | Apache Spark |
|
| PostgresDialect | | Class | org.apache.spark.sql.jdbc | Apache Spark |
|
| PowerIterationClustering | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| PowerIterationClusteringModel | Model produced by PowerIterationClustering. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| PrecisionInfo | Precision parameters for a Decimal. | Class | org.apache.spark.sql.types | Apache Spark |
|
| Predict | Predicted value for a decision tree node. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| PredictionModel | Abstraction for a model for prediction tasks (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
| Predictor | Abstraction for prediction problems (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
| PrefixSpan | A parallel PrefixSpan algorithm to mine frequent sequential patterns. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| PrefixSpanModel | Model fitted by PrefixSpan. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| Pregel | Unlike the original Pregel API, the GraphX Pregel API factors the sendMessage computation over edges, enables the message sending computation to read both vertex attributes, and constrains messages to the graph structure. | Class | org.apache.spark.graphx | Apache Spark |
|
| Private | A class that is considered private to the internals of Spark -- there is a high likelihood that it will be changed in future versions of Spark. | Class | org.apache.spark.annotation | Apache Spark |
|
| ProbabilisticClassificationModel | Model produced by a ProbabilisticClassifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| ProbabilisticClassifier | Single-label binary or multiclass classifier which can output class conditional probabilities. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| PrunedFilteredScan | A BaseRelation that can eliminate unneeded columns and filter using selected predicates before producing an RDD containing all matching tuples as Row objects. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
| PrunedScan | A BaseRelation that can eliminate unneeded columns before producing an RDD containing all of its tuples as Row objects. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
| QRDecomposition | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| QuantileDiscretizer | QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features. | Class | org.apache.spark.ml.feature | Apache Spark |
|
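A quick sketch of binning a continuous column into quantile-based buckets (the DataFrame `df` and its "hour" column are hypothetical):

```scala
import org.apache.spark.ml.feature.QuantileDiscretizer

// Bin the continuous "hour" column into 3 approximately equal-frequency buckets.
val discretizer = new QuantileDiscretizer()
  .setInputCol("hour")
  .setOutputCol("hourBucket")
  .setNumBuckets(3)
val bucketed = discretizer.fit(df).transform(df)
```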
| QuantileStrategy | Enum for selecting the quantile calculation strategy. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| QueryExecutionListener | The interface of query execution listener that can be used to analyze execution metrics. | Interface | org.apache.spark.sql.util | Apache Spark |
|
| RandomDataGenerator | Trait for random data generators that generate i.i.d. data. | Interface | org.apache.spark.mllib.random | Apache Spark |
|
| RandomForest | A class that implements a Random Forest learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
| RandomForestClassificationModel | Random Forest model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| RandomForestClassifier | Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| RandomForestModel | Represents a random forest model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| RandomForestRegressionModel | Random Forest model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RandomForestRegressor | Random Forest learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RandomRDDs | Generator methods for creating RDDs comprised of i.i.d. samples from some distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| RandomSampler | A pseudorandom sampler. | Interface | org.apache.spark.util.random | Apache Spark |
|
| RangeDependency | Represents a one-to-one dependency between ranges of partitions in the parent and child RDDs. | Class | org.apache.spark | Apache Spark |
|
| RangePartitioner | A Partitioner that partitions sortable records by range into roughly equal ranges. | Class | org.apache.spark | Apache Spark |
|
| RankingMetrics | Evaluator for ranking algorithms. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| Rating | A more compact class to represent a rating than Tuple3[Int, Int, Double]. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
| RDD | A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. | Class | org.apache.spark.rdd | Apache Spark |
|
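The core pattern: transformations are lazy and only an action triggers a job. A minimal sketch (`sc` is an existing SparkContext):

```scala
val nums  = sc.parallelize(1 to 100, 4)   // RDD[Int] in 4 partitions
val evens = nums.filter(_ % 2 == 0)       // lazy transformation
val total = evens.map(_ * 2).reduce(_ + _) // action: triggers the computation
```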
| RDDBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| RDDDataDistribution | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| RDDFunctions | Machine learning specific RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
| RDDInfo | | Class | org.apache.spark.storage | Apache Spark |
|
| RDDPartitionInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| RDDStorageInfo | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| Receiver | Abstract class of a receiver that can be run on worker nodes to receive external data. | Class | org.apache.spark.streaming.receiver | Apache Spark |
|
| ReceiverInfo | Class having information about a receiver. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| ReceiverInputDStream | Abstract class for defining any InputDStream that has to start a receiver on worker nodes to receive external data. | Class | org.apache.spark.streaming.dstream | Apache Spark |
|
| ReduceFunction | Base interface for function used in Dataset's reduce. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
| RegexTokenizer | A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false). | Class | org.apache.spark.ml.feature | Apache Spark |
|
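A sketch of the split-by-delimiter mode (the DataFrame `df` and its "sentence" column are hypothetical):

```scala
import org.apache.spark.ml.feature.RegexTokenizer

// With the default gaps=true, the pattern is treated as a delimiter:
// here, split on runs of non-word characters.
val regexTokenizer = new RegexTokenizer()
  .setInputCol("sentence")
  .setOutputCol("words")
  .setPattern("\\W")
val tokenized = regexTokenizer.transform(df)
```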
| RegressionEvaluator | Evaluator for regression, which expects two input columns: prediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
| RegressionMetrics | Evaluator for regression. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| RegressionModel | Model produced by a Regressor. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RegressionModel | | Interface | org.apache.spark.mllib.regression | Apache Spark |
|
| RelationProvider | Implemented by objects that produce relations for a specific kind of data source. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
| Resubmitted | A ShuffleMapTask that completed successfully earlier, but we lost the executor before the stage completed. | Class | org.apache.spark | Apache Spark |
|
| ReturnStatementFinder | | Class | org.apache.spark.util | Apache Spark |
|
| ReviveOffers | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
| RFormula | Implements the transforms required for fitting a dataset against an R model formula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| RFormulaModel | A fitted RFormula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| RidgeRegressionModel | Regression model trained using RidgeRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| RidgeRegressionWithSGD | Train a regression model with L2-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| Row | Represents one row of output from a relational operator. | Interface | org.apache.spark.sql | Apache Spark |
|
| RowFactory | A factory class used to construct Row objects. | Class | org.apache.spark.sql | Apache Spark |
|
| RowMatrix | Represents a row-oriented distributed Matrix with no meaningful row indices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| RpcUtils | | Class | org.apache.spark.util | Apache Spark |
|
| RRDD | An RDD that stores serialized R objects as Array[Byte]. | Class | org.apache.spark.api.r | Apache Spark |
|
| RuntimePercentage | | Class | org.apache.spark.scheduler | Apache Spark |
|
| Saveable | Trait for models and transformers which may be saved as files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
| SaveMode | SaveMode is used to specify the expected behavior of saving a DataFrame to a data source. | Class | org.apache.spark.sql | Apache Spark |
|
| SchedulingMode | "FAIR" and "FIFO" determine which policy is used to order tasks among a Schedulable's sub-queues. | Class | org.apache.spark.scheduler | Apache Spark |
|
| SchemaRelationProvider | Implemented by objects that produce relations for a specific kind of data source with a given schema. | Interface | org.apache.spark.sql.sources | Apache Spark |
|
| ScriptTransformationWriterThread | | Class | org.apache.spark.sql.hive.execution | Apache Spark |
|
| Seconds | Helper object that creates instance of Duration representing a given number of seconds. | Class | org.apache.spark.streaming | Apache Spark |
|
| SequenceFileRDDFunctions | Extra functions available on RDDs of (key, value) pairs to create a Hadoop SequenceFile, through an implicit conversion. | Class | org.apache.spark.rdd | Apache Spark |
|
| SerializableWritable | | Class | org.apache.spark | Apache Spark |
|
| SerializationStream | A stream for writing serialized objects. | Class | org.apache.spark.serializer | Apache Spark |
|
| Serializer | A serializer. | Class | org.apache.spark.serializer | Apache Spark |
|
| SerializerInstance | An instance of a serializer, for use by one thread at a time. | Class | org.apache.spark.serializer | Apache Spark |
|
| ShortestPaths | Computes shortest paths to the given set of landmark vertices, returning a graph where each vertex attribute is a map containing the shortest-path distance to each reachable landmark. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
| ShortType | The data type representing Short values. | Class | org.apache.spark.sql.types | Apache Spark |
|
| ShuffleBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| ShuffleDataBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| ShuffleDependency | Represents a dependency on the output of a shuffle stage. | Class | org.apache.spark | Apache Spark |
|
| ShuffledRDD | The resulting RDD from a shuffle (e.g. repartitioning of data). | Class | org.apache.spark.rdd | Apache Spark |
|
| ShuffleIndexBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| ShuffleReadMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| ShuffleReadMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| ShuffleWriteMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| ShuffleWriteMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| SignalLoggerHandler | | Class | org.apache.spark.util | Apache Spark |
|
| SimpleFutureAction | A FutureAction holding the result of an action that triggers a single job. | Class | org.apache.spark | Apache Spark |
|
| SimpleUpdater | A simple updater for gradient descent *without* any regularization. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| SingularValueDecomposition | Represents singular value decomposition (SVD) factors. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| SizeEstimator | Estimates the sizes of Java objects (number of bytes of memory they occupy), for use in memory-aware caches. | Class | org.apache.spark.util | Apache Spark |
|
| SnappyCompressionCodec | Snappy implementation of CompressionCodec. | Class | org.apache.spark.io | Apache Spark |
|
| SnappyOutputStreamWrapper | Wrapper over SnappyOutputStream which guards against write-after-close and double-close issues. | Class | org.apache.spark.io | Apache Spark |
|
| SparkAppHandle | A handle to a running Spark application. | Interface | org.apache.spark.launcher | Apache Spark |
|
| SparkConf | Configuration for a Spark application. | Class | org.apache.spark | Apache Spark |
|
| SparkContext | Main entry point for Spark functionality. | Class | org.apache.spark | Apache Spark |
|
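SparkConf and SparkContext are typically created together at application startup. A minimal sketch (app name, master, and memory setting are illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("MyApp")
  .setMaster("local[4]")                // for local testing; normally set by spark-submit
  .set("spark.executor.memory", "2g")
val sc = new SparkContext(conf)
```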
| SparkEnv | Holds all the runtime environment objects for a running Spark instance (either master or worker), including the serializer, Akka actor system, block manager, map output tracker, etc. | Class | org.apache.spark | Apache Spark |
|
| SparkException | | Class | org.apache.spark | Apache Spark |
|
| SparkFiles | Resolves paths to files added through SparkContext. | Class | org.apache.spark | Apache Spark |
|
| SparkFirehoseListener | Class that allows users to receive all SparkListener events. | Class | org.apache.spark | Apache Spark |
|
| SparkFlumeEvent | A wrapper class for AvroFlumeEvent's with a custom serialization format. | Class | org.apache.spark.streaming.flume | Apache Spark |
|
| SparkJobInfo | Exposes information about Spark Jobs. | Interface | org.apache.spark | Apache Spark |
|
| SparkJobInfoImpl | | Class | org.apache.spark | Apache Spark |
|
| SparkLauncher | Launcher for Spark applications. | Class | org.apache.spark.launcher | Apache Spark |
|
| SparkListener | Interface for listening to events from the Spark scheduler. | Interface | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerApplicationEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerApplicationStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerBlockManagerAdded | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerBlockManagerRemoved | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerBlockUpdated | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerEnvironmentUpdate | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerExecutorAdded | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerExecutorMetricsUpdate | Periodic updates from executors. | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerExecutorRemoved | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerJobEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerJobStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerStageCompleted | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerStageSubmitted | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerTaskEnd | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerTaskGettingResult | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerTaskStart | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkListenerUnpersistRDD | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SparkMasterRegex | A collection of regexes for extracting information from the master string. | Class | org.apache.spark | Apache Spark |
|
| SparkShutdownHook | | Class | org.apache.spark.util | Apache Spark |
|
| SparkStageInfo | Exposes information about Spark Stages. | Interface | org.apache.spark | Apache Spark |
|
| SparkStageInfoImpl | | Class | org.apache.spark | Apache Spark |
|
| SparkStatusTracker | Low-level status reporting APIs for monitoring job and stage progress. | Class | org.apache.spark | Apache Spark |
|
| SparseMatrix | Column-major sparse matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| SparseVector | A sparse vector represented by an index array and a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| SpecialLengths | | Class | org.apache.spark.api.r | Apache Spark |
|
| SpillListener | A SparkListener that detects whether spills have occurred in Spark jobs. | Class | org.apache.spark | Apache Spark |
|
| Split | Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path. | Interface | org.apache.spark.ml.tree | Apache Spark |
|
| Split | Split applied to a feature. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| SplitInfo | | Class | org.apache.spark.scheduler | Apache Spark |
|
| SQLContext | The entry point for working with structured data (rows and columns) in Spark. | Class | org.apache.spark.sql | Apache Spark |
|
| SQLImplicits | A collection of implicit methods for converting common Scala objects into DataFrames. | Class | org.apache.spark.sql | Apache Spark |
|
| SQLTransformer | Implements transformations that are defined by a SQL statement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| SQLUserDefinedType | A user-defined type which can be automatically recognized by a SQLContext and registered. | Class | org.apache.spark.sql.types | Apache Spark |
|
| SquaredError | Class for squared error loss calculation. | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| SquaredL2Updater | Updater for L2 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| StageData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| StageInfo | Stores information about a stage to pass from the scheduler to SparkListeners. | Class | org.apache.spark.scheduler | Apache Spark |
|
| StageStatus | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| StandardNormalGenerator | Generates i.i.d. samples from the standard normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| StandardScaler | Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StandardScaler | Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| StandardScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StandardScalerModel | Represents a StandardScaler model that can transform vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| StatCounter | A class for tracking the statistics of a set of numbers (count, mean and variance) in a numerically robust way. | Class | org.apache.spark.util | Apache Spark |
|
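On an RDD of doubles, stats() computes a StatCounter in a single pass. A brief sketch:

```scala
// count, mean, variance, min and max in one pass over the data.
val data = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
val st = data.stats()   // org.apache.spark.util.StatCounter
println(s"mean=${st.mean} stdev=${st.stdev} max=${st.max}")
```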
| State | Abstract class for getting and updating the state in the mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). | Class | org.apache.spark.streaming | Apache Spark |
|
| StateSpec | Abstract class representing all the specifications of the DStream transformation mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java). | Class | org.apache.spark.streaming | Apache Spark |
|
| Statistics | | Class | org.apache.spark.mllib.stat | Apache Spark |
|
| Statistics | Statistics for querying the supervisor about state of workers. | Class | org.apache.spark.streaming.receiver | Apache Spark |
|
| StatsReportListener | | Class | org.apache.spark.scheduler | Apache Spark |
|
| StatsReportListener | A simple StreamingListener that logs summary statistics across Spark Streaming batches; numBatchInfos is the number of recent batches to consider when generating statistics (default: 10). | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StatusUpdate | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
| StopCoordinator | | Class | org.apache.spark.scheduler | Apache Spark |
|
| StopExecutor | | Class | org.apache.spark.scheduler.local | Apache Spark |
|
| StopWordsRemover | A feature transformer that filters out stop words from input. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StorageLevel | Flags for controlling the storage of an RDD. | Class | org.apache.spark.storage | Apache Spark |
|
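A short sketch of choosing a storage level explicitly (`lines` is a hypothetical RDD assumed in scope):

```scala
import org.apache.spark.storage.StorageLevel

// Keep partitions serialized in memory and spill to disk when memory is short.
lines.persist(StorageLevel.MEMORY_AND_DISK_SER)
// rdd.cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
```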
| StorageLevels | Expose some commonly useful storage level constants. | Class | org.apache.spark.api.java | Apache Spark |
|
| StorageListener | A SparkListener that prepares information to be displayed on the BlockManagerUI. | Class | org.apache.spark.ui.storage | Apache Spark |
|
| StorageStatus | Storage information for each BlockManager. | Class | org.apache.spark.storage | Apache Spark |
|
| StorageStatusListener | A SparkListener that maintains executor storage status. | Class | org.apache.spark.storage | Apache Spark |
|
| Strategy | Stores all the configuration options for tree construction, e.g. algo (the learning goal). | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| StreamBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| StreamingContext | Main entry point for Spark Streaming functionality. | Class | org.apache.spark.streaming | Apache Spark |
|
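The canonical streaming word count, as a minimal sketch (host and port are illustrative):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("NetworkWordCount")
val ssc = new StreamingContext(conf, Seconds(10))    // 10-second batches
val lines = ssc.socketTextStream("localhost", 9999)
val counts = lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
counts.print()
ssc.start()              // start receiving and processing
ssc.awaitTermination()   // block until stopped
```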
| StreamingContextState | Represents the state of a StreamingContext. | Class | org.apache.spark.streaming | Apache Spark |
|
| StreamingKMeans | StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| StreamingKMeansModel | StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| StreamingLinearAlgorithm | StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data, and using it for prediction on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| StreamingLinearRegressionWithSGD | Train or predict a linear regression model on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| StreamingListener | | Interface | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerBatchCompleted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerBatchStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerBatchSubmitted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerOutputOperationCompleted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerOutputOperationStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerReceiverError | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerReceiverStarted | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingListenerReceiverStopped | | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StreamingLogisticRegressionWithSGD | Train or predict a logistic regression model on streaming data. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| StreamingTest | | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
| StreamInputInfo | Track the information of input stream at specified batch time. | Class | org.apache.spark.streaming.scheduler | Apache Spark |
|
| StringArrayParam | Specialized version of Param[Array[String]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| StringContains | A filter that evaluates to true iff the attribute evaluates to a string that contains the string value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| StringEndsWith | A filter that evaluates to true iff the attribute evaluates to a string that ends with value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| StringIndexer | A label indexer that maps a string column of labels to an ML column of label indices. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StringIndexerModel | Model fitted by StringIndexer. | Class | org.apache.spark.ml.feature | Apache Spark |
|
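A sketch of StringIndexer and the StringIndexerModel its fit() returns (the DataFrame `df` and its "category" column are hypothetical):

```scala
import org.apache.spark.ml.feature.StringIndexer

// The most frequent label receives index 0.0.
val indexer = new StringIndexer()
  .setInputCol("category")
  .setOutputCol("categoryIndex")
val indexed = indexer.fit(df).transform(df)   // fit() returns a StringIndexerModel
```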
| StringRRDD | An RDD that stores R objects as Array[String]. | Class | org.apache.spark.api.r | Apache Spark |
|
| StringStartsWith | A filter that evaluates to true iff the attribute evaluates to a string that starts with value. | Class | org.apache.spark.sql.sources | Apache Spark |
|
| StringType | The data type representing String values. | Class | org.apache.spark.sql.types | Apache Spark |
|
| StronglyConnectedComponents | Strongly connected components algorithm implementation. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
| StructField | A field inside a StructType. | Class | org.apache.spark.sql.types | Apache Spark |
|
| StructType | A StructType object can be constructed by StructType(fields: Seq[StructField]). | Class | org.apache.spark.sql.types | Apache Spark |
|
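A minimal sketch tying StructField, StructType and Row together to build a DataFrame with an explicit schema (`sc` and `sqlContext` assumed in scope, as in spark-shell; column names are illustrative):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Declare the schema explicitly instead of inferring it.
val schema = StructType(Seq(
  StructField("name", StringType, nullable = false),
  StructField("age", IntegerType, nullable = true)))
val rows = sc.parallelize(Seq(Row("alice", 34), Row("bob", 28)))
val people = sqlContext.createDataFrame(rows, schema)
```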
| Success | | Class | org.apache.spark | Apache Spark |
|
| SVDPlusPlus | | Class | org.apache.spark.graphx.lib | Apache Spark |
|
| SVMDataGenerator | Generate sample data used for SVM. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| SVMModel | Model for Support Vector Machines (SVMs). | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| SVMWithSGD | Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| TaskCommitDenied | Task requested the driver to commit, but was denied. | Class | org.apache.spark | Apache Spark |
|
| TaskCompletionListener | Listener providing a callback function to invoke when a task's execution completes. | Interface | org.apache.spark.util | Apache Spark |
|
| TaskContext | Contextual information about a task which can be read or mutated during execution. | Class | org.apache.spark | Apache Spark |
|
| TaskData | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| TaskFailedReason | Various possible reasons why a task failed. | Interface | org.apache.spark | Apache Spark |
|
| TaskInfo | Information about a running task attempt inside a TaskSet. | Class | org.apache.spark.scheduler | Apache Spark |
|
| TaskKilled | Task was killed intentionally and needs to be rescheduled. | Class | org.apache.spark | Apache Spark |
|
| TaskKilledException | Exception thrown when a task is explicitly killed (i.e., task failure is expected). | Class | org.apache.spark | Apache Spark |
|
| TaskLocality | | Class | org.apache.spark.scheduler | Apache Spark |
|
| TaskMetricDistributions | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| TaskMetrics | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| TaskResultBlockId | | Class | org.apache.spark.storage | Apache Spark |
|
| TaskResultLost | The task finished successfully, but the result was lost from the executor's block manager before it could be fetched. | Class | org.apache.spark | Apache Spark |
|
| TaskSorting | | Class | org.apache.spark.status.api.v1 | Apache Spark |
|
| TestResult | Trait for hypothesis test results. | Interface | org.apache.spark.mllib.stat.test | Apache Spark |
|
| Time | This is a simple class that represents an absolute instant of time. | Class | org.apache.spark.streaming | Apache Spark |
|
| TimestampType | The data type representing java.sql.Timestamp values. | Class | org.apache.spark.sql.types | Apache Spark |
|
| TimeTrackingOutputStream | Intercepts write calls and tracks total time spent writing in order to update shuffle write metrics. | Class | org.apache.spark.storage | Apache Spark |
|
| Tokenizer | A tokenizer that converts the input string to lowercase and then splits it by white spaces. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| TorrentBroadcastFactory | A Broadcast implementation that uses a BitTorrent-like protocol to do a distributed transfer of the broadcasted data to the executors. | Class | org.apache.spark.broadcast | Apache Spark |
|
| TrainValidationSplit | Validation for hyper-parameter tuning. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| TrainValidationSplitModel | Model from train validation split. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| Transformer | Abstract class for transformers that transform one dataset into another. | Class | org.apache.spark.ml | Apache Spark |
|
| TriangleCount | Compute the number of triangles passing through each vertex. | Class | org.apache.spark.graphx.lib | Apache Spark |
|
| TripletFields | Represents a subset of the fields of an EdgeTriplet or EdgeContext. | Class | org.apache.spark.graphx | Apache Spark |
|
| TwitterUtils | | Class | org.apache.spark.streaming.twitter | Apache Spark |
|
| TypedColumn | A Column where an Encoder has been given for the expected input and return type. | Class | org.apache.spark.sql | Apache Spark |
|
| UDF10 | A Spark SQL UDF that has 10 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF11 | A Spark SQL UDF that has 11 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF12 | A Spark SQL UDF that has 12 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF13 | A Spark SQL UDF that has 13 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF14 | A Spark SQL UDF that has 14 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF15 | A Spark SQL UDF that has 15 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF16 | A Spark SQL UDF that has 16 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF17 | A Spark SQL UDF that has 17 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF18 | A Spark SQL UDF that has 18 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF19 | A Spark SQL UDF that has 19 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF20 | A Spark SQL UDF that has 20 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF21 | A Spark SQL UDF that has 21 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF22 | A Spark SQL UDF that has 22 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF3 | A Spark SQL UDF that has 3 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF4 | A Spark SQL UDF that has 4 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF5 | A Spark SQL UDF that has 5 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF6 | A Spark SQL UDF that has 6 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF7 | A Spark SQL UDF that has 7 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF8 | A Spark SQL UDF that has 8 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDF9 | A Spark SQL UDF that has 9 arguments. | Interface | org.apache.spark.sql.api.java | Apache Spark |
|
| UDFRegistration | Functions for registering user-defined functions. | Class | org.apache.spark.sql | Apache Spark |
|
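A brief sketch of registering a Scala function for use in SQL (`sqlContext` assumed in scope; the "people" temp table is hypothetical):

```scala
// Register a UDF by name, then call it from a SQL statement.
sqlContext.udf.register("strLen", (s: String) => s.length)
sqlContext.sql("SELECT name, strLen(name) AS len FROM people")
```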
| UnaryTransformer | Abstract class for transformers that take one input column, apply transformation, and output the result as a new column. | Class | org.apache.spark.ml | Apache Spark |
|
| UniformGenerator | Generates i.i.d. samples from U[0.0, 1.0]. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| UnionRDD | | Class | org.apache.spark.rdd | Apache Spark |
|
| UnknownReason | We don't know why the task ended -- for example, because of a ClassNotFound exception when deserializing the task result. | Class | org.apache.spark | Apache Spark |
|
| UnresolvedAttribute | An unresolved attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| Updater | Class used to perform steps (weight update) using Gradient Descent methods. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| UserDefinedAggregateFunction | The base class for implementing user-defined aggregate functions (UDAF). | Class | org.apache.spark.sql.expressions | Apache Spark |
|
| UserDefinedFunction | A user-defined function. | Class | org.apache.spark.sql | Apache Spark |
|
| UserDefinedType | The data type for User Defined Types (UDTs). | Class | org.apache.spark.sql.types | Apache Spark |
|
| Variance | Class for calculating variance during regression. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
| Vector | Represents a numeric vector, whose index type is Int and value type is Double. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
| Vector | | Class | org.apache.spark.util | Apache Spark |
|
| VectorAssembler | A feature transformer that merges multiple columns into a vector column. | Class | org.apache.spark.ml.feature | Apache Spark |
|
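A sketch of assembling several columns into a single feature vector (the DataFrame `dataset` and its column names are hypothetical):

```scala
import org.apache.spark.ml.feature.VectorAssembler

// Combine numeric columns and an existing vector column into one "features" column.
val assembler = new VectorAssembler()
  .setInputCols(Array("hour", "mobile", "userFeatures"))
  .setOutputCol("features")
val assembled = assembler.transform(dataset)
```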
| VectorAttributeRewriter | Utility transformer that rewrites Vector attribute names via prefix replacement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorIndexer | Class for indexing categorical feature columns in a dataset of Vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorIndexerModel | Transform categorical features to use 0-based indices instead of their original values. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Vectors | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| VectorSlicer | Takes a feature vector and outputs a new feature vector with a subarray of the original features. The subset of features can be specified with either indices (setIndices()) or names (setNames()). | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorTransformer | | Interface | org.apache.spark.mllib.feature | Apache Spark |
|
| VectorUDT | User-defined type for Vector which allows easy interaction with SQL via DataFrame. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| VertexRDD | Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by pre-indexing the entries for fast, efficient joins. | Class | org.apache.spark.graphx | Apache Spark |
|
| VertexRDDImpl | | Class | org.apache.spark.graphx.impl | Apache Spark |
|
| VocabWord | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| VoidFunction2 | A two-argument function that takes arguments of type T1 and T2 with no return value. | Interface | org.apache.spark.api.java.function | Apache Spark |
|
| WeibullGenerator | Generates i.i.d. samples from the Weibull distribution with the given shape and scale parameters. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| Window | Utility functions for defining window in DataFrames. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
| WindowSpec | A window specification that defines the partitioning, ordering, and frame boundaries. | Class | org.apache.spark.sql.expressions | Apache Spark |
|
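A sketch of the two entries above: Window builds a WindowSpec, which a window function is then applied over (the DataFrame `df` and its "dept"/"salary" columns are hypothetical):

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, rank}

// Rank rows by salary, highest first, within each department.
val byDeptSalary = Window.partitionBy("dept").orderBy(col("salary").desc)
val ranked = df.withColumn("rank", rank().over(byDeptSalary))
```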
| Word2Vec | Word2Vec trains a model of Map(String, Vector), i.e. transforms a word into a code for further natural language processing or machine learning. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Word2Vec | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Word2VecModel | Model fitted by Word2Vec. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Word2VecModel | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
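A brief sketch of the ml.feature Word2Vec estimator listed above (`documentDF` is a hypothetical DataFrame whose "text" column holds Seq[String] token arrays):

```scala
import org.apache.spark.ml.feature.Word2Vec

val word2Vec = new Word2Vec()
  .setInputCol("text")
  .setOutputCol("result")
  .setVectorSize(3)
  .setMinCount(0)
val model = word2Vec.fit(documentDF)       // Word2VecModel
val vectors = model.transform(documentDF)  // averages the word vectors per document
```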
| WriteAheadLog | This abstract class represents a write-ahead log (aka journal) that is used by Spark Streaming to save the received data (by receivers) and associated metadata to reliable storage, so that they can be recovered after driver failures. | Class | org.apache.spark.streaming.util | Apache Spark |
|
| WriteAheadLogRecordHandle | This abstract class represents a handle that refers to a record written in a WriteAheadLog. It must contain all the information necessary for the record to be read and returned by the log. | Class | org.apache.spark.streaming.util | Apache Spark |
|
| ZeroMQUtils | | Class | org.apache.spark.streaming.zeromq | Apache Spark |