Name | Description | Type | Package | Framework |
AbsoluteError | Class for absolute error loss calculation (for regression). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
AFTAggregator | | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTCostFun | | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTSurvivalRegression | Fit a parametric survival regression model named accelerated failure time (AFT) model (https://en. | Class | org.apache.spark.ml.regression | Apache Spark |
|
AFTSurvivalRegressionModel | Model produced by AFTSurvivalRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
Algo | Enum to select the algorithm for the decision tree. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
ALS | Alternating Least Squares (ALS) matrix factorization. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
ALS | | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
ALS.Rating | Rating class for better code readability. | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
ALS.Rating$ | | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
ALSModel | Model fitted by ALS. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
AssociationRules | Generates association rules from an RDD[FreqItemset[Item]]. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
AssociationRules.Rule | An association rule between sets of items. | Class | org.apache.spark.mllib.fpm.AssociationRules | Apache Spark |
|
Attribute | Abstract class for ML attributes. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
AttributeGroup | Attributes that describe a vector ML column. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
AttributeType | An enum-like type for attribute types: AttributeType$.Numeric, AttributeType$.Nominal, and AttributeType$.Binary. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
Binarizer | Binarize a column of continuous features given a threshold. | Class | org.apache.spark.ml.feature | Apache Spark |
|
BinaryAttribute | A binary attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
BinaryClassificationEvaluator | Evaluator for binary classification, which expects two input columns: rawPrediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
BinaryClassificationMetrics | Evaluator for binary classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
BinaryLogisticRegressionSummary | Binary Logistic regression results for a given model. | Class | org.apache.spark.ml.classification | Apache Spark |
|
BinaryLogisticRegressionTrainingSummary | Logistic regression training results. | Class | org.apache.spark.ml.classification | Apache Spark |
|
BinarySample | Class that represents the group and value of a sample. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
BisectingKMeans | A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
BisectingKMeansModel | Clustering model produced by BisectingKMeans. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
BlockMatrix | Represents a distributed matrix in blocks of local matrices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
BooleanParam | Specialized version of Param[Boolean] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
BoostingStrategy | Configuration options for GradientBoostedTrees. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
Bucketizer | Bucketizer maps a column of continuous features to a column of feature buckets. | Class | org.apache.spark.ml.feature | Apache Spark |
|
CategoricalSplit | Split which tests a categorical feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
ChiSqSelector | Chi-Squared feature selection, which selects categorical features to use for predicting a categorical label. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ChiSqSelector | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
ChiSqSelectorModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
ChiSqSelectorModel | Chi Squared selector model. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
ChiSqTestResult | Object containing the test results for the chi-squared hypothesis test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
ClassificationModel | Model produced by a Classifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
ClassificationModel | Represents a classification model that predicts to which of a set of categories an example belongs. | Interface | org.apache.spark.mllib.classification | Apache Spark |
|
Classifier | Single-label binary or multiclass classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ContinuousSplit | Split which tests a continuous feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
CoordinateMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
CountVectorizer | Extracts a vocabulary from document collections and generates a CountVectorizerModel. | Class | org.apache.spark.ml.feature | Apache Spark |
|
CountVectorizerModel | Converts a text document to a sparse vector of token counts. | Class | org.apache.spark.ml.feature | Apache Spark |
|
CrossValidator | K-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
CrossValidatorModel | Model from k-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
DataValidators | A collection of methods used to validate data before applying ML algorithms. | Class | org.apache.spark.mllib.util | Apache Spark |
|
DCT | A feature transformer that takes the 1D discrete cosine transform of a real vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
DecisionTree | A class which implements a decision tree learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
DecisionTreeClassificationModel | Decision tree model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
DecisionTreeClassifier | Decision tree learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
DecisionTreeModel | Decision tree model for classification or regression. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
DecisionTreeRegressionModel | Decision tree model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
DecisionTreeRegressor | Decision tree learning algorithm for regression. It supports both continuous and categorical features. | Class | org.apache.spark.ml.regression | Apache Spark |
|
DefaultSource | The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame. | Class | org.apache.spark.ml.source.libsvm | Apache Spark |
|
DenseMatrix | Column-major dense matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
DenseVector | A dense vector represented by a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
DistributedLDAModel | Distributed model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
DistributedLDAModel | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
DistributedMatrix | Represents a distributively stored matrix backed by one or more RDDs. | Interface | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
DoubleArrayParam | Specialized version of Param[Array[Double]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
DoubleParam | Specialized version of Param[Double] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
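ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided weight vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product) of each input vector with a provided weight vector. | Class | org.apache.spark.mllib.feature | Apache Spark |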
|
EMLDAOptimizer | Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Entropy | Class for calculating entropy during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
Estimator | Abstract class for estimators that fit models to data. | Class | org.apache.spark.ml | Apache Spark |
|
Evaluator | Abstract class for evaluators that compute metrics from predictions. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
ExpectationSum | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
ExponentialGenerator | Generates i.i.d. samples from the exponential distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
FeatureType | Enum to describe whether a feature is "continuous" or "categorical". | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
FloatParam | Specialized version of Param[Float] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
FPGrowth | A parallel FP-growth algorithm to mine frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
FPGrowth.FreqItemset | param: items items in this itemset. | Class | org.apache.spark.mllib.fpm.FPGrowth | Apache Spark |
|
FPGrowthModel | Model trained by FPGrowth, which holds frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
GammaGenerator | Generates i.i.d. samples from the gamma distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
GaussianMixture | This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
GaussianMixtureModel | Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i = 1..k with probability w(i). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
GBTClassificationModel | Gradient-Boosted Trees (GBTs) model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
GBTClassifier | Gradient-Boosted Trees (GBTs) learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
GBTRegressionModel | Gradient-Boosted Trees (GBTs) model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
GBTRegressor | Gradient-Boosted Trees (GBTs) learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
GeneralizedLinearAlgorithm | GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM). | Class | org.apache.spark.mllib.regression | Apache Spark |
|
GeneralizedLinearModel | GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
Gini | Class for calculating the Gini impurity during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
Gradient | Class used to compute the gradient for a loss function, given a single data point. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
GradientBoostedTrees | A class that implements Stochastic Gradient Boosting. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
GradientBoostedTreesModel | Represents a gradient boosted trees model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
GradientDescent | Class used to solve an optimization problem using Gradient Descent. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.ml.feature | Apache Spark |
|
HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
HingeGradient | Compute gradient and loss for a Hinge loss function, as used in SVM binary classification. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
Identifiable | Trait for an object with an immutable unique ID that identifies itself and its derivatives. | Interface | org.apache.spark.ml.util | Apache Spark |
|
IDF | Compute the Inverse Document Frequency (IDF) given a collection of documents. | Class | org.apache.spark.ml.feature | Apache Spark |
|
IDF | Inverse document frequency (IDF). | Class | org.apache.spark.mllib.feature | Apache Spark |
|
IDF.DocumentFrequencyAggregator | Document frequency aggregator. | Class | org.apache.spark.mllib.feature.IDF | Apache Spark |
|
IDFModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
IDFModel | Represents an IDF model that can transform term frequency vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Impurity | Trait for calculating information gain. | Interface | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
IndexedRow | Represents a row of IndexedRowMatrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
IndexedRowMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
IndexToString | | Class | org.apache.spark.ml.feature | Apache Spark |
|
InformationGainStats | Information gain statistics for each split. param: gain information gain value | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
IntArrayParam | Specialized version of Param[Array[Int]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
Interaction | Implements the feature interaction transform. | Class | org.apache.spark.ml.feature | Apache Spark |
|
InternalNode | Internal Decision Tree node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
IntParam | Specialized version of Param[Int] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
IsotonicRegression | | Class | org.apache.spark.ml.regression | Apache Spark |
|
IsotonicRegression | | Class | org.apache.spark.mllib.regression | Apache Spark |
|
IsotonicRegressionModel | Model fitted by IsotonicRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
IsotonicRegressionModel | Regression model for isotonic regression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
JavaParams | Java-friendly wrapper for Params. | Class | org.apache.spark.ml.param | Apache Spark |
|
KernelDensity | Kernel density estimation. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
KMeans | | Class | org.apache.spark.ml.clustering | Apache Spark |
|
KMeans | K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means\|\| algorithm by Bahmani et al.). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
KMeansDataGenerator | Generate test data for KMeans. | Class | org.apache.spark.mllib.util | Apache Spark |
|
KMeansModel | Model fitted by KMeans. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
KMeansModel | A clustering model for K-means. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
KolmogorovSmirnovTestResult | Object containing the test results for the Kolmogorov-Smirnov test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
L1Updater | Updater for L1 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LabelConverter | Label to vector converter. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LabeledPoint | Class that represents the features and labels of a data point. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LassoModel | Regression model trained using Lasso. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LassoWithSGD | Train a regression model with L1-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LBFGS | Class used to solve an optimization problem using Limited-memory BFGS. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
LDAModel | Model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LDAModel | Latent Dirichlet Allocation (LDA) model. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
LDAOptimizer | An LDAOptimizer specifies which optimization/learning/inference algorithm to use, and it can hold optimizer-specific parameters for users to set. | Interface | org.apache.spark.mllib.clustering | Apache Spark |
|
LeafNode | Decision tree leaf node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
LeastSquaresAggregator | LeastSquaresAggregator computes the gradient and loss for a least-squares loss function, as used in linear regression, for samples in sparse or dense vector format in an online fashion. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LeastSquaresCostFun | LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LeastSquaresGradient | Compute gradient and loss for a Least-squared loss function, as used in linear regression. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LinearDataGenerator | Generate sample data used for Linear Data. | Class | org.apache.spark.mllib.util | Apache Spark |
|
LinearRegression | The learning objective is to minimize the squared error, with regularization. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionModel | Model produced by LinearRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionModel | Regression model trained using LinearRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
LinearRegressionSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionTrainingSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
LinearRegressionWithSGD | Train a linear regression model with no regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
Loader | Trait for classes which can load models and transformers from files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
LocalLDAModel | Local (non-distributed) model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
LocalLDAModel | This model stores only the inferred topics. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
LogisticAggregator | LogisticAggregator computes the gradient and loss for a binary logistic loss function, as used in binary classification, for instances in sparse or dense vector format in an online fashion. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticCostFun | LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticGradient | Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
LogisticRegression | Logistic regression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionDataGenerator | Generate test data for LogisticRegression. | Class | org.apache.spark.mllib.util | Apache Spark |
|
LogisticRegressionModel | Model produced by LogisticRegression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionModel | Classification model trained using Multinomial/Binary Logistic Regression. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogisticRegressionSummary | Abstraction for Logistic Regression Results for a given model. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionTrainingSummary | Abstraction for multinomial Logistic Regression Training results. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
LogisticRegressionWithLBFGS | Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogisticRegressionWithSGD | Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
LogLoss | Class for log loss calculation (for classification). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
LogNormalGenerator | Generates i.i.d. samples from the log-normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
LongParam | Specialized version of Param[Long] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
Loss | Trait for adding "pluggable" loss functions for the gradient boosting algorithm. | Interface | org.apache.spark.mllib.tree.loss | Apache Spark |
|
Losses | | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
Matrices | Factory methods for Matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
Matrix | Trait for a local matrix. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
MatrixEntry | Represents an entry in a distributed matrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
MatrixFactorizationModel | Model representing the result of matrix factorization. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
MFDataGenerator | Generate RDD(s) containing data for Matrix Factorization. | Class | org.apache.spark.mllib.util | Apache Spark |
|
MinMaxScaler | Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling. | Class | org.apache.spark.ml.feature | Apache Spark |
|
MinMaxScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
MLPairRDDFunctions | Machine learning specific Pair RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
MLReadable | Trait for objects that provide MLReader. | Interface | org.apache.spark.ml.util | Apache Spark |
|
MLReader | Abstract class for utility classes that can load ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
MLUtils | Helper methods to load, save, and pre-process data used in MLlib. | Class | org.apache.spark.mllib.util | Apache Spark |
|
MLWritable | Trait for classes that provide MLWriter. | Interface | org.apache.spark.ml.util | Apache Spark |
|
MLWriter | Abstract class for utility classes that can save ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
Model | A fitted model, i.e., a Transformer produced by an Estimator. | Class | org.apache.spark.ml | Apache Spark |
|
MulticlassClassificationEvaluator | Evaluator for multiclass classification, which expects two input columns: score and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
MulticlassMetrics | Evaluator for multiclass classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
MultilabelMetrics | Evaluator for multilabel classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
MultilayerPerceptronClassificationModel | Classification model based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
MultilayerPerceptronClassifier | Classifier trainer based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
MultivariateGaussian | This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. | Class | org.apache.spark.mllib.stat.distribution | Apache Spark |
|
MultivariateOnlineSummarizer | MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances in sparse or dense vector format. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
MultivariateStatisticalSummary | Trait for multivariate statistical summary of a data matrix. | Interface | org.apache.spark.mllib.stat | Apache Spark |
|
NaiveBayes | Naive Bayes Classifiers. | Class | org.apache.spark.ml.classification | Apache Spark |
|
NaiveBayes | | Class | org.apache.spark.mllib.classification | Apache Spark |
|
NaiveBayesModel | Model produced by NaiveBayes. param: pi log of class priors, whose dimension is C (number of classes) | Class | org.apache.spark.ml.classification | Apache Spark |
|
NaiveBayesModel | Model for Naive Bayes Classifiers. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
NGram | A feature transformer that converts the input array of strings into an array of n-grams. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Node | Decision tree node interface. | Class | org.apache.spark.ml.tree | Apache Spark |
|
Node | Node in a decision tree. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
NominalAttribute | A nominal attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
Normalizer | Normalize a vector to have unit norm using the given p-norm. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Normalizer | Normalizes samples individually to unit L^p^ norm. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
NumericAttribute | A numeric attribute with optional summary statistics. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
OneHotEncoder | A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. | Class | org.apache.spark.ml.feature | Apache Spark |
|
OneVsRest | Reduction of Multiclass Classification to Binary Classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
OneVsRestModel | Model produced by OneVsRest. | Class | org.apache.spark.ml.classification | Apache Spark |
|
OnlineLDAOptimizer | An online optimizer for LDA. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Optimizer | Trait for optimization problem solvers. | Interface | org.apache.spark.mllib.optimization | Apache Spark |
|
Param | A param with self-contained documentation and optionally default value. | Class | org.apache.spark.ml.param | Apache Spark |
|
ParamGridBuilder | Builder for a param grid used in grid search-based model selection. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
ParamMap | A param to value map. | Class | org.apache.spark.ml.param | Apache Spark |
|
ParamPair | A param and its value. | Class | org.apache.spark.ml.param | Apache Spark |
|
Params | | Interface | org.apache.spark.ml.param | Apache Spark |
|
ParamValidators | Factory methods for common validation functions for Param. | Class | org.apache.spark.ml.param | Apache Spark |
|
PCA | PCA trains a model to project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.ml.feature | Apache Spark |
|
PCA | A feature transformer that projects vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
PCAModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
PCAModel | Model fitted by PCA that can project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Pipeline | A simple pipeline, which acts as an estimator. | Class | org.apache.spark.ml | Apache Spark |
|
PipelineModel | Represents a fitted pipeline. | Class | org.apache.spark.ml | Apache Spark |
|
PipelineStage | A stage in a pipeline, either an Estimator or a Transformer. | Class | org.apache.spark.ml | Apache Spark |
|
PMMLExportable | Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format. | Interface | org.apache.spark.mllib.pmml | Apache Spark |
|
PoissonGenerator | Generates i.i.d. samples from the Poisson distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
PolynomialExpansion | Perform feature expansion in a polynomial space. | Class | org.apache.spark.ml.feature | Apache Spark |
|
PowerIterationClustering | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
PowerIterationClustering.Assignment | param: cluster assigned cluster id | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
PowerIterationClustering.Assignment$ | | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
PowerIterationClusteringModel | Model produced by PowerIterationClustering. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
Predict | Predicted value for a node. param: predict predicted value | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
PredictionModel | Abstraction for a model for prediction tasks (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
Predictor | Abstraction for prediction problems (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
PrefixSpan | A parallel PrefixSpan algorithm to mine frequent sequential patterns. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
PrefixSpan.FreqSequence | Represents a frequent sequence. | Class | org.apache.spark.mllib.fpm.PrefixSpan | Apache Spark |
|
PrefixSpanModel | Model fitted by PrefixSpan. param: freqSequences frequent sequences | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
ProbabilisticClassificationModel | Model produced by a ProbabilisticClassifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
ProbabilisticClassifier | Single-label binary or multiclass classifier which can output class conditional probabilities. | Class | org.apache.spark.ml.classification | Apache Spark |
|
QRDecomposition | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
QuantileDiscretizer | QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features. | Class | org.apache.spark.ml.feature | Apache Spark |
|
QuantileStrategy | Enum for selecting the quantile calculation strategy. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
RandomDataGenerator | Trait for random data generators that generate i.i.d. data. | Interface | org.apache.spark.mllib.random | Apache Spark |
|
RandomForest | A class that implements a Random Forest learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
RandomForestClassificationModel | Random Forest model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
RandomForestClassifier | Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. | Class | org.apache.spark.ml.classification | Apache Spark |
|
RandomForestModel | Represents a random forest model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
RandomForestRegressionModel | Random Forest model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RandomForestRegressor | Random Forest learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RandomRDDs | Generator methods for creating RDDs comprised of i.i.d. samples from some distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
RankingMetrics | Evaluator for ranking algorithms. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
Rating | A more compact class to represent a rating than Tuple3[Int, Int, Double]. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
RDDFunctions | Machine learning specific RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
RegexTokenizer | A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false). | Class | org.apache.spark.ml.feature | Apache Spark |
|
RegressionEvaluator | Evaluator for regression, which expects two input columns: prediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
RegressionMetrics | Evaluator for regression. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
RegressionModel | Model produced by a Regressor. | Class | org.apache.spark.ml.regression | Apache Spark |
|
RegressionModel | | Interface | org.apache.spark.mllib.regression | Apache Spark |
|
RFormula | Implements the transforms required for fitting a dataset against an R model formula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
RFormulaModel | A fitted RFormula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
RidgeRegressionModel | Regression model trained using RidgeRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
RidgeRegressionWithSGD | Train a regression model with L2-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
RowMatrix | Represents a row-oriented distributed Matrix with no meaningful row indices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
Saveable | Trait for models and transformers which may be saved as files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
SimpleUpdater | A simple updater for gradient descent *without* any regularization. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
SingularValueDecomposition | Represents singular value decomposition (SVD) factors. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
SparseMatrix | Column-major sparse matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
SparseVector | A sparse vector represented by an index array and a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
Split | Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path. | Interface | org.apache.spark.ml.tree | Apache Spark |
|
Split | Split applied to a feature. Parameter feature: the feature index. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
SQLTransformer | Implements the transformations defined by a SQL statement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
SquaredError | Class for squared error loss calculation. | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
SquaredL2Updater | Updater for L2 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
StandardNormalGenerator | Generates i.i.d. samples from the standard normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
StandardScaler | Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StandardScaler | Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
StandardScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
StandardScalerModel | Represents a StandardScaler model that can transform vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Statistics | | Class | org.apache.spark.mllib.stat | Apache Spark |
|
StopWordsRemover | A feature transformer that filters out stop words from input. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Strategy | Stores all the configuration options for tree construction. Parameter algo: the learning goal. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
StreamingKMeans | StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
StreamingKMeansModel | StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
StreamingLinearAlgorithm | StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
StreamingLinearRegressionWithSGD | Train or predict a linear regression model on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
StreamingLogisticRegressionWithSGD | Train or predict a logistic regression model on streaming data. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
StreamingTest | | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
StringArrayParam | Specialized version of Param[Array[String]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
StringIndexer | A label indexer that maps a string column of labels to an ML column of label indices. | Class | org.apache.spark.ml.feature | Apache Spark |
|
StringIndexerModel | Model fitted by StringIndexer. | Class | org.apache.spark.ml.feature | Apache Spark |
|
SVMDataGenerator | Generate sample data used for SVM. | Class | org.apache.spark.mllib.util | Apache Spark |
|
SVMModel | Model for Support Vector Machines (SVMs). | Class | org.apache.spark.mllib.classification | Apache Spark |
|
SVMWithSGD | Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
TestResult | Trait for hypothesis test results. | Interface | org.apache.spark.mllib.stat.test | Apache Spark |
|
Tokenizer | A tokenizer that converts the input string to lowercase and then splits it by white spaces. | Class | org.apache.spark.ml.feature | Apache Spark |
|
TrainValidationSplit | Validation for hyper-parameter tuning. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
TrainValidationSplitModel | Model from train validation split. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
Transformer | Abstract class for transformers that transform one dataset into another. | Class | org.apache.spark.ml | Apache Spark |
|
UnaryTransformer | Abstract class for transformers that take one input column, apply transformation, and output the result as a new column. | Class | org.apache.spark.ml | Apache Spark |
|
UniformGenerator | Generates i.i.d. samples from U[0.0, 1.0]. | Class | org.apache.spark.mllib.random | Apache Spark |
|
UnresolvedAttribute | An unresolved attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
Updater | Class used to perform steps (weight update) using Gradient Descent methods. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
Variance | Class for calculating variance during regression. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
Vector | Represents a numeric vector, whose index type is Int and value type is Double. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
VectorAssembler | A feature transformer that merges multiple columns into a vector column. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorAttributeRewriter | Utility transformer that rewrites Vector attribute names via prefix replacement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorIndexer | Class for indexing categorical feature columns in a dataset of Vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorIndexerModel | Transform categorical features to use 0-based indices instead of their original values. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Vectors | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
VectorSlicer | This class takes a feature vector and outputs a new feature vector with a subarray of the original features. The subset of features can be specified with either indices (setIndices()) or names (setNames()). | Class | org.apache.spark.ml.feature | Apache Spark |
|
VectorTransformer | | Interface | org.apache.spark.mllib.feature | Apache Spark |
|
VectorUDT | (AlphaComponent) User-defined type for Vector that allows easy interaction with SQL. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
VocabWord | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
WeibullGenerator | Generates i.i.d. samples from the Weibull distribution with the given shape and scale. | Class | org.apache.spark.mllib.random | Apache Spark |
|
Word2Vec | Word2Vec trains a model of Map(String, Vector), i.e., it transforms a word into a code for further natural language processing or machine learning. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Word2Vec | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
Word2VecModel | Model fitted by Word2Vec. | Class | org.apache.spark.ml.feature | Apache Spark |
|
Word2VecModel | | Class | org.apache.spark.mllib.feature | Apache Spark |