| Name | Description | Type | Package | Framework |
| --- | --- | --- | --- | --- |
| AbsoluteError | Class for absolute error loss calculation (for regression). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| AFTAggregator | | Class | org.apache.spark.ml.regression | Apache Spark |
|
| AFTCostFun | | Class | org.apache.spark.ml.regression | Apache Spark |
|
| AFTSurvivalRegression | Fit a parametric survival regression model named accelerated failure time (AFT) model. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| AFTSurvivalRegressionModel | Model produced by AFTSurvivalRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| Algo | Enum to select the algorithm for the decision tree. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| ALS | Alternating Least Squares (ALS) matrix factorization. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
| ALS | | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
| ALS.Rating | Rating class for better code readability. | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
| ALS.Rating$ | | Class | org.apache.spark.ml.recommendation.ALS | Apache Spark |
|
| ALSModel | Model fitted by ALS. | Class | org.apache.spark.ml.recommendation | Apache Spark |
|
| AssociationRules | Generates association rules from an RDD[FreqItemset[Item]]. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| AssociationRules.Rule | An association rule between sets of items. | Class | org.apache.spark.mllib.fpm.AssociationRules | Apache Spark |
|
| Attribute | Abstract class for ML attributes. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| AttributeGroup | Attributes that describe a vector ML column. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| AttributeType | An enum-like type for attribute types. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| Binarizer | Binarize a column of continuous features given a threshold. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| BinaryAttribute | A binary attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| BinaryClassificationEvaluator | Evaluator for binary classification, which expects two input columns: rawPrediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
| BinaryClassificationMetrics | Evaluator for binary classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| BinaryLogisticRegressionSummary | Binary Logistic regression results for a given model. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| BinaryLogisticRegressionTrainingSummary | Logistic regression training results. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| BinarySample | Class that represents the group and value of a sample. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
| BisectingKMeans | A bisecting k-means algorithm based on the paper "A comparison of document clustering techniques" by Steinbach, Karypis, and Kumar, with modification to fit Spark. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| BisectingKMeansModel | Clustering model produced by BisectingKMeans. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| BlockMatrix | Represents a distributed matrix in blocks of local matrices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| BooleanParam | Specialized version of Param[Boolean] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| BoostingStrategy | Configuration options for GradientBoostedTrees. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| Bucketizer | Bucketizer maps a column of continuous features to a column of feature buckets. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| CategoricalSplit | Split which tests a categorical feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| ChiSqSelector | Chi-squared feature selection, which selects categorical features to use for prediction. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| ChiSqSelector | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| ChiSqSelectorModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| ChiSqSelectorModel | Chi Squared selector model. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| ChiSqTestResult | Object containing the test results for the chi-squared hypothesis test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
| ClassificationModel | Model produced by a Classifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| ClassificationModel | Represents a classification model that predicts to which of a set of categories an example belongs. | Interface | org.apache.spark.mllib.classification | Apache Spark |
|
| Classifier | Single-label binary or multiclass classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| ColumnPruner | Utility transformer for removing temporary columns from a DataFrame. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| ContinuousSplit | Split which tests a continuous feature. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| CoordinateMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| CountVectorizer | Extracts a vocabulary from document collections and generates a CountVectorizerModel. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| CountVectorizerModel | Converts a text document to a sparse vector of token counts. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| CrossValidator | K-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| CrossValidatorModel | Model from k-fold cross validation. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| DataValidators | A collection of methods used to validate data before applying ML algorithms. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| DCT | A feature transformer that takes the 1D discrete cosine transform of a real vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| DecisionTree | A class which implements a decision tree learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
| DecisionTreeClassificationModel | Decision tree model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| DecisionTreeClassifier | Decision tree learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| DecisionTreeModel | Decision tree model for classification or regression. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| DecisionTreeRegressionModel | Decision tree model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| DecisionTreeRegressor | Decision tree learning algorithm for regression. It supports both continuous and categorical features. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| DefaultSource | The libsvm package implements the Spark SQL data source API for loading LIBSVM data as a DataFrame. | Class | org.apache.spark.ml.source.libsvm | Apache Spark |
|
| DenseMatrix | Column-major dense matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| DenseVector | A dense vector represented by a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| DistributedLDAModel | Distributed model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| DistributedLDAModel | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| DistributedMatrix | Represents a distributively stored matrix backed by one or more RDDs. | Interface | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| DoubleArrayParam | Specialized version of Param[Array[Double]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| DoubleParam | Specialized version of Param[Double] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product). | Class | org.apache.spark.ml.feature | Apache Spark |
|
| ElementwiseProduct | Outputs the Hadamard product (i.e., the element-wise product). | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| EMLDAOptimizer | Optimizer for EM algorithm which stores data + parameter graph, plus algorithm parameters. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| Entropy | Class for calculating entropy during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
| Estimator | Abstract class for estimators that fit models to data. | Class | org.apache.spark.ml | Apache Spark |
|
| Evaluator | Abstract class for evaluators that compute metrics from predictions. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
| ExpectationSum | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| ExponentialGenerator | Generates i.i.d. samples from the exponential distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| FeatureType | Enum to describe whether a feature is "continuous" or "categorical". | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| FloatParam | Specialized version of Param[Float] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| FPGrowth | A parallel FP-growth algorithm to mine frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| FPGrowth.FreqItemset | Frequent itemset. Parameter items: the items in this itemset. | Class | org.apache.spark.mllib.fpm.FPGrowth | Apache Spark |
|
| FPGrowthModel | Model trained by FPGrowth, which holds frequent itemsets. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| GammaGenerator | Generates i.i.d. samples from the gamma distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| GaussianMixture | This class performs expectation maximization for multivariate Gaussian Mixture Models (GMMs). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| GaussianMixtureModel | Multivariate Gaussian Mixture Model (GMM) consisting of k Gaussians, where points are drawn from each Gaussian i = 1, ..., k. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| GBTClassificationModel | Gradient-Boosted Trees (GBTs) model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| GBTClassifier | Gradient-Boosted Trees (GBTs) learning algorithm for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| GBTRegressionModel | Gradient-Boosted Trees (GBTs) model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| GBTRegressor | Gradient-Boosted Trees (GBTs) learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| GeneralizedLinearAlgorithm | GeneralizedLinearAlgorithm implements methods to train a Generalized Linear Model (GLM). | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| GeneralizedLinearModel | GeneralizedLinearModel (GLM) represents a model trained using GeneralizedLinearAlgorithm. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| Gini | Class for calculating the Gini impurity during binary classification. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
| Gradient | Class used to compute the gradient for a loss function, given a single data point. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| GradientBoostedTrees | A class that implements Stochastic Gradient Boosting. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
| GradientBoostedTreesModel | Represents a gradient boosted trees model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| GradientDescent | Class used to solve an optimization problem using Gradient Descent. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| HashingTF | Maps a sequence of terms to their term frequencies using the hashing trick. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| HingeGradient | Compute gradient and loss for a Hinge loss function, as used in SVM binary classification. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| Identifiable | Trait for an object with an immutable unique ID that identifies itself and its derivatives. | Interface | org.apache.spark.ml.util | Apache Spark |
|
| IDF | Compute the Inverse Document Frequency (IDF) given a collection of documents. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| IDF | Inverse document frequency (IDF). | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| IDF.DocumentFrequencyAggregator | Document frequency aggregator. | Class | org.apache.spark.mllib.feature.IDF | Apache Spark |
|
| IDFModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| IDFModel | Represents an IDF model that can transform term frequency vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Impurity | Trait for calculating information gain. | Interface | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
| IndexedRow | Represents a row of IndexedRowMatrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| IndexedRowMatrix | | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| IndexToString | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| InformationGainStats | Information gain statistics for each split. Parameter gain: the information gain value. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| IntArrayParam | Specialized version of Param[Array[Int]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| Interaction | Implements the feature interaction transform. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| InternalNode | Internal Decision Tree node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| IntParam | Specialized version of Param[Int] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| IsotonicRegression | | Class | org.apache.spark.ml.regression | Apache Spark |
|
| IsotonicRegression | | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| IsotonicRegressionModel | Model fitted by IsotonicRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| IsotonicRegressionModel | Regression model for isotonic regression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| JavaParams | Java-friendly wrapper for Params. | Class | org.apache.spark.ml.param | Apache Spark |
|
| KernelDensity | Kernel density estimation. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
| KMeans | | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| KMeans | K-means clustering with support for multiple parallel runs and a k-means++ like initialization mode (the k-means\|\| algorithm by Bahmani et al.). | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| KMeansDataGenerator | Generate test data for KMeans. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| KMeansModel | Model fitted by KMeans. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| KMeansModel | A clustering model for K-means. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| KolmogorovSmirnovTestResult | Object containing the test results for the Kolmogorov-Smirnov test. | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
| L1Updater | Updater for L1 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| LabelConverter | Label to vector converter. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| LabeledPoint | Class that represents the features and labels of a data point. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| LassoModel | Regression model trained using Lasso. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| LassoWithSGD | Train a regression model with L1-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| LBFGS | Class used to solve an optimization problem using Limited-memory BFGS. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| LDA | Latent Dirichlet Allocation (LDA), a topic model designed for text documents. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| LDAModel | Model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| LDAModel | Latent Dirichlet Allocation (LDA) model. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| LDAOptimizer | An LDAOptimizer specifies which optimization/learning/inference algorithm to use, and it can hold optimizer-specific parameters for users to set. | Interface | org.apache.spark.mllib.clustering | Apache Spark |
|
| LeafNode | Decision tree leaf node. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| LeastSquaresAggregator | LeastSquaresAggregator computes the gradient and loss for a least-squares loss function, as used in linear regression, for samples in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LeastSquaresCostFun | LeastSquaresCostFun implements Breeze's DiffFunction[T] for Least Squares cost. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LeastSquaresGradient | Compute gradient and loss for a least-squares loss function, as used in linear regression. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| LinearDataGenerator | Generate sample data used for Linear Data. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| LinearRegression | The learning objective is to minimize the squared error, with regularization. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LinearRegressionModel | Model produced by LinearRegression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LinearRegressionModel | Regression model trained using LinearRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| LinearRegressionSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LinearRegressionTrainingSummary | | Class | org.apache.spark.ml.regression | Apache Spark |
|
| LinearRegressionWithSGD | Train a linear regression model with no regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| Loader | Trait for classes which can load models and transformers from files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
| LocalLDAModel | Local (non-distributed) model fitted by LDA. | Class | org.apache.spark.ml.clustering | Apache Spark |
|
| LocalLDAModel | This model stores only the inferred topics. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| LogisticAggregator | LogisticAggregator computes the gradient and loss for a binary logistic loss function, as used in binary classification, for instances in sparse or dense vectors in an online fashion. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticCostFun | LogisticCostFun implements Breeze's DiffFunction[T] for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticGradient | Compute gradient and loss for a multinomial logistic loss function, as used in multi-class classification (it is also used in binary logistic regression). | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| LogisticRegression | Logistic regression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticRegressionDataGenerator | Generate test data for LogisticRegression. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| LogisticRegressionModel | Model produced by LogisticRegression. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticRegressionModel | Classification model trained using Multinomial/Binary Logistic Regression. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| LogisticRegressionSummary | Abstraction for Logistic Regression Results for a given model. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticRegressionTrainingSummary | Abstraction for multinomial Logistic Regression Training results. | Interface | org.apache.spark.ml.classification | Apache Spark |
|
| LogisticRegressionWithLBFGS | Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| LogisticRegressionWithSGD | Train a classification model for Binary Logistic Regression using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| LogLoss | Class for log loss calculation (for classification). | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| LogNormalGenerator | Generates i.i.d. samples from the log normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| LongParam | Specialized version of Param[Long] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| Loss | Trait for adding "pluggable" loss functions for the gradient boosting algorithm. | Interface | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| Losses | | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| Matrices | Factory methods for Matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| Matrix | Trait for a local matrix. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
| MatrixEntry | Represents an entry in a distributed matrix. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| MatrixFactorizationModel | Model representing the result of matrix factorization. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
| MFDataGenerator | Generate RDD(s) containing data for Matrix Factorization. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| MinMaxScaler | Rescale each feature individually to a common range [min, max] linearly using column summary statistics, which is also known as min-max normalization or Rescaling. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| MinMaxScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| MLPairRDDFunctions | Machine learning specific Pair RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
| MLReadable | Trait for objects that provide MLReader. | Interface | org.apache.spark.ml.util | Apache Spark |
|
| MLReader | Abstract class for utility classes that can load ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
| MLUtils | Helper methods to load, save, and pre-process data used in MLlib. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| MLWritable | Trait for classes that provide MLWriter. | Interface | org.apache.spark.ml.util | Apache Spark |
|
| MLWriter | Abstract class for utility classes that can save ML instances. | Class | org.apache.spark.ml.util | Apache Spark |
|
| Model | A fitted model. | Class | org.apache.spark.ml | Apache Spark |
|
| MulticlassClassificationEvaluator | Evaluator for multiclass classification, which expects two input columns: score and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
| MulticlassMetrics | Evaluator for multiclass classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| MultilabelMetrics | Evaluator for multilabel classification. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| MultilayerPerceptronClassificationModel | Classification model based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| MultilayerPerceptronClassifier | Classifier trainer based on the Multilayer Perceptron. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| MultivariateGaussian | This class provides basic functionality for a Multivariate Gaussian (Normal) Distribution. | Class | org.apache.spark.mllib.stat.distribution | Apache Spark |
|
| MultivariateOnlineSummarizer | MultivariateOnlineSummarizer implements MultivariateStatisticalSummary to compute the mean, variance, minimum, maximum, counts, and nonzero counts for instances given as sparse or dense vectors. | Class | org.apache.spark.mllib.stat | Apache Spark |
|
| MultivariateStatisticalSummary | Trait for multivariate statistical summary of a data matrix. | Interface | org.apache.spark.mllib.stat | Apache Spark |
|
| NaiveBayes | Naive Bayes Classifiers. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| NaiveBayes | | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| NaiveBayesModel | Model produced by NaiveBayes. Parameter pi: log of class priors, whose dimension is C (number of classes). | Class | org.apache.spark.ml.classification | Apache Spark |
|
| NaiveBayesModel | Model for Naive Bayes Classifiers. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| NGram | A feature transformer that converts the input array of strings into an array of n-grams. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Node | Decision tree node interface. | Class | org.apache.spark.ml.tree | Apache Spark |
|
| Node | Node in a decision tree. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| NominalAttribute | A nominal attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| Normalizer | Normalize a vector to have unit norm using the given p-norm. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Normalizer | Normalizes samples individually to unit L^p^ norm. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| NumericAttribute | A numeric attribute with optional summary statistics. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| OneHotEncoder | A one-hot encoder that maps a column of category indices to a column of binary vectors, with at most a single one-value per row that indicates the input category index. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| OneVsRest | Reduction of Multiclass Classification to Binary Classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| OneVsRestModel | Model produced by OneVsRest. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| OnlineLDAOptimizer | An online optimizer for LDA. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| Optimizer | Trait for optimization problem solvers. | Interface | org.apache.spark.mllib.optimization | Apache Spark |
|
| Param | A param with self-contained documentation and optionally default value. | Class | org.apache.spark.ml.param | Apache Spark |
|
| ParamGridBuilder | Builder for a param grid used in grid search-based model selection. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| ParamMap | A param to value map. | Class | org.apache.spark.ml.param | Apache Spark |
|
| ParamPair | A param and its value. | Class | org.apache.spark.ml.param | Apache Spark |
|
| Params | | Interface | org.apache.spark.ml.param | Apache Spark |
|
| ParamValidators | Factory methods for common validation functions for Param. | Class | org.apache.spark.ml.param | Apache Spark |
|
| PCA | PCA trains a model to project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PCA | A feature transformer that projects vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| PCAModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PCAModel | Model fitted by PCA that can project vectors to a low-dimensional space using PCA. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Pipeline | A simple pipeline, which acts as an estimator. | Class | org.apache.spark.ml | Apache Spark |
|
| PipelineModel | Represents a fitted pipeline. | Class | org.apache.spark.ml | Apache Spark |
|
| PipelineStage | A stage in a pipeline, either an Estimator or a Transformer. | Class | org.apache.spark.ml | Apache Spark |
|
| PMMLExportable | Export model to the PMML format. Predictive Model Markup Language (PMML) is an XML-based file format. | Interface | org.apache.spark.mllib.pmml | Apache Spark |
|
| PoissonGenerator | Generates i.i.d. samples from the Poisson distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| PolynomialExpansion | Perform feature expansion in a polynomial space. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| PowerIterationClustering | | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| PowerIterationClustering.Assignment | Cluster assignment. Parameter cluster: the assigned cluster id. | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
| PowerIterationClustering.Assignment$ | | Class | org.apache.spark.mllib.clustering.PowerIterationClustering | Apache Spark |
|
| PowerIterationClusteringModel | Model produced by PowerIterationClustering. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| Predict | Predicted value for a node. Parameter predict: the predicted value. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| PredictionModel | Abstraction for a model for prediction tasks (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
| Predictor | Abstraction for prediction problems (regression and classification). | Class | org.apache.spark.ml | Apache Spark |
|
| PrefixSpan | A parallel PrefixSpan algorithm to mine frequent sequential patterns. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| PrefixSpan.FreqSequence | Represents a frequent sequence. | Class | org.apache.spark.mllib.fpm.PrefixSpan | Apache Spark |
|
| PrefixSpanModel | Model fitted by PrefixSpan. Parameter freqSequences: the frequent sequences. | Class | org.apache.spark.mllib.fpm | Apache Spark |
|
| ProbabilisticClassificationModel | Model produced by a ProbabilisticClassifier. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| ProbabilisticClassifier | Single-label binary or multiclass classifier which can output class conditional probabilities. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| QRDecomposition | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| QuantileDiscretizer | QuantileDiscretizer takes a column with continuous features and outputs a column with binned categorical features. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| QuantileStrategy | Enum for selecting the quantile calculation strategy. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| RandomDataGenerator | Trait for random data generators that generate i.i.d. data. | Interface | org.apache.spark.mllib.random | Apache Spark |
|
| RandomForest | A class that implements a Random Forest learning algorithm for classification and regression. | Class | org.apache.spark.mllib.tree | Apache Spark |
|
| RandomForestClassificationModel | Random Forest model for classification. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| RandomForestClassifier | Random Forest learning algorithm for classification. It supports both binary and multiclass labels, as well as both continuous and categorical features. | Class | org.apache.spark.ml.classification | Apache Spark |
|
| RandomForestModel | Represents a random forest model. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| RandomForestRegressionModel | Random Forest model for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RandomForestRegressor | Random Forest learning algorithm for regression. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RandomRDDs | Generator methods for creating RDDs comprised of i.i.d. samples from some distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| RankingMetrics | Evaluator for ranking algorithms. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| Rating | A more compact class to represent a rating than Tuple3[Int, Int, Double]. | Class | org.apache.spark.mllib.recommendation | Apache Spark |
|
| RDDFunctions | Machine learning specific RDD functions. | Class | org.apache.spark.mllib.rdd | Apache Spark |
|
| RegexTokenizer | A regex based tokenizer that extracts tokens either by using the provided regex pattern to split the text (default) or repeatedly matching the regex (if gaps is false). | Class | org.apache.spark.ml.feature | Apache Spark |
|
| RegressionEvaluator | Evaluator for regression, which expects two input columns: prediction and label. | Class | org.apache.spark.ml.evaluation | Apache Spark |
|
| RegressionMetrics | Evaluator for regression. | Class | org.apache.spark.mllib.evaluation | Apache Spark |
|
| RegressionModel | Model produced by a Regressor. | Class | org.apache.spark.ml.regression | Apache Spark |
|
| RegressionModel | | Interface | org.apache.spark.mllib.regression | Apache Spark |
|
| RFormula | Implements the transforms required for fitting a dataset against an R model formula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| RFormulaModel | A fitted RFormula. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| RidgeRegressionModel | Regression model trained using RidgeRegression. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| RidgeRegressionWithSGD | Train a regression model with L2-regularization using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| RowMatrix | Represents a row-oriented distributed Matrix with no meaningful row indices. | Class | org.apache.spark.mllib.linalg.distributed | Apache Spark |
|
| Saveable | Trait for models and transformers which may be saved as files. | Interface | org.apache.spark.mllib.util | Apache Spark |
|
| SimpleUpdater | A simple updater for gradient descent *without* any regularization. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| SingularValueDecomposition | Represents singular value decomposition (SVD) factors. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| SparseMatrix | Column-major sparse matrix. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| SparseVector | A sparse vector represented by an index array and a value array. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| Split | Interface for a "Split," which specifies a test made at a decision tree node to choose the left or right path. | Interface | org.apache.spark.ml.tree | Apache Spark |
|
| Split | Split applied to a feature. Parameter feature: feature index. | Class | org.apache.spark.mllib.tree.model | Apache Spark |
|
| SQLTransformer | Implements the transformations defined by a SQL statement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| SquaredError | Class for squared error loss calculation. | Class | org.apache.spark.mllib.tree.loss | Apache Spark |
|
| SquaredL2Updater | Updater for L2 regularized problems. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| StandardNormalGenerator | Generates i.i.d. samples from the standard normal distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| StandardScaler | Standardizes features by removing the mean and scaling to unit variance using column summary statistics on the samples in the training set. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StandardScaler | Standardizes features by removing the mean and scaling to unit std using column summary statistics on the samples in the training set. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| StandardScalerModel | | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StandardScalerModel | Represents a StandardScaler model that can transform vectors. | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Statistics | | Class | org.apache.spark.mllib.stat | Apache Spark |
|
| StopWordsRemover | A feature transformer that filters out stop words from input. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Strategy | Stores all the configuration options for tree construction. Parameter algo: learning goal. | Class | org.apache.spark.mllib.tree.configuration | Apache Spark |
|
| StreamingKMeans | StreamingKMeans provides methods for configuring a streaming k-means analysis, training the model on streaming data, and using the model to make predictions on streaming data. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| StreamingKMeansModel | StreamingKMeansModel extends MLlib's KMeansModel for streaming algorithms, so it can keep track of a continuously updated weight associated with each cluster. | Class | org.apache.spark.mllib.clustering | Apache Spark |
|
| StreamingLinearAlgorithm | StreamingLinearAlgorithm implements methods for continuously training a generalized linear model on streaming data and using it for prediction on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| StreamingLinearRegressionWithSGD | Train or predict a linear regression model on streaming data. | Class | org.apache.spark.mllib.regression | Apache Spark |
|
| StreamingLogisticRegressionWithSGD | Train or predict a logistic regression model on streaming data. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| StreamingTest | | Class | org.apache.spark.mllib.stat.test | Apache Spark |
|
| StringArrayParam | Specialized version of Param[Array[String]] for Java. | Class | org.apache.spark.ml.param | Apache Spark |
|
| StringIndexer | A label indexer that maps a string column of labels to an ML column of label indices. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| StringIndexerModel | Model fitted by StringIndexer. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| SVMDataGenerator | Generate sample data used for SVM. | Class | org.apache.spark.mllib.util | Apache Spark |
|
| SVMModel | Model for Support Vector Machines (SVMs). | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| SVMWithSGD | Train a Support Vector Machine (SVM) using Stochastic Gradient Descent. | Class | org.apache.spark.mllib.classification | Apache Spark |
|
| TestResult | Trait for hypothesis test results. | Interface | org.apache.spark.mllib.stat.test | Apache Spark |
|
| Tokenizer | A tokenizer that converts the input string to lowercase and then splits it by white spaces. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| TrainValidationSplit | Validation for hyper-parameter tuning. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| TrainValidationSplitModel | Model from train validation split. | Class | org.apache.spark.ml.tuning | Apache Spark |
|
| Transformer | Abstract class for transformers that transform one dataset into another. | Class | org.apache.spark.ml | Apache Spark |
|
| UnaryTransformer | Abstract class for transformers that take one input column, apply transformation, and output the result as a new column. | Class | org.apache.spark.ml | Apache Spark |
|
| UniformGenerator | Generates i.i.d. samples from U[0.0, 1.0]. | Class | org.apache.spark.mllib.random | Apache Spark |
|
| UnresolvedAttribute | An unresolved attribute. | Class | org.apache.spark.ml.attribute | Apache Spark |
|
| Updater | Class used to perform steps (weight update) using Gradient Descent methods. | Class | org.apache.spark.mllib.optimization | Apache Spark |
|
| Variance | Class for calculating variance during regression. | Class | org.apache.spark.mllib.tree.impurity | Apache Spark |
|
| Vector | Represents a numeric vector, whose index type is Int and value type is Double. | Interface | org.apache.spark.mllib.linalg | Apache Spark |
|
| VectorAssembler | A feature transformer that merges multiple columns into a vector column. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorAttributeRewriter | Utility transformer that rewrites Vector attribute names via prefix replacement. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorIndexer | Class for indexing categorical feature columns in a dataset of Vector. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorIndexerModel | Transform categorical features to use 0-based indices instead of their original values. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Vectors | | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| VectorSlicer | This class takes a feature vector and outputs a new feature vector with a subarray of the original features. The subset of features can be specified with either indices (setIndices()) or names (setNames()). | Class | org.apache.spark.ml.feature | Apache Spark |
|
| VectorTransformer | | Interface | org.apache.spark.mllib.feature | Apache Spark |
|
| VectorUDT | AlphaComponent: user-defined type for Vector which allows easy interaction with SQL. | Class | org.apache.spark.mllib.linalg | Apache Spark |
|
| VocabWord | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| WeibullGenerator | Generates i.i.d. samples from the Weibull distribution. | Class | org.apache.spark.mllib.random | Apache Spark |
|
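| Word2Vec | Word2Vec trains a model of Map(String, Vector), i.e. transforms a word into a code for further natural language processing or machine learning processes. | Class | org.apache.spark.ml.feature | Apache Spark |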
|
| Word2Vec | | Class | org.apache.spark.mllib.feature | Apache Spark |
|
| Word2VecModel | Model fitted by Word2Vec. | Class | org.apache.spark.ml.feature | Apache Spark |
|
| Word2VecModel | | Class | org.apache.spark.mllib.feature | Apache Spark |
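
Several descriptions in the table state what a transformer does in one sentence (Tokenizer, RegexTokenizer, StandardScaler). The sketch below restates those three descriptions in plain Python so the behavior is easier to picture. It does not use or reproduce the Spark API: the function names `tokenize`, `regex_tokenize`, and `standard_scale` are hypothetical, and the handling of consecutive whitespace and the use of the sample standard deviation (n - 1) are assumptions made for this illustration.

```python
import re
import statistics


def tokenize(text):
    """Lowercase the input, then split it on whitespace (cf. the Tokenizer row).

    Assumption: runs of whitespace are treated as one separator.
    """
    return text.lower().split()


def regex_tokenize(text, pattern=r"\s+", gaps=True):
    """Cf. the RegexTokenizer row.

    gaps=True:  `pattern` describes the separators, so split on it.
    gaps=False: `pattern` describes the tokens, so collect every match.
    """
    if gaps:
        # Drop empty strings produced by adjacent separators.
        return [tok for tok in re.split(pattern, text) if tok]
    return re.findall(pattern, text)


def standard_scale(column):
    """Remove the mean and scale to unit standard deviation (cf. StandardScaler).

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    mean = statistics.mean(column)
    std = statistics.stdev(column)
    return [(x - mean) / std for x in column]


if __name__ == "__main__":
    print(tokenize("Hello  Spark MLlib"))
    print(regex_tokenize("foo-bar baz", pattern=r"\w+", gaps=False))
    print(standard_scale([1.0, 2.0, 3.0]))
```

The real classes operate on DataFrame columns or RDDs and expose additional parameters, so the Spark API documentation remains the reference for exact behavior.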