Mars Learn

This is the class and function reference of Mars learn.

Clustering

Classes

cluster.KMeans([n_clusters, init, n_init, …])

K-Means clustering.

Functions

cluster.k_means(X, n_clusters[, …])

K-means clustering algorithm.

Datasets

Samples generator

datasets.make_blobs([n_samples, n_features, …])

Generate isotropic Gaussian blobs for clustering.

datasets.make_classification([n_samples, …])

Generate a random n-class classification problem.

datasets.make_low_rank_matrix([n_samples, …])

Generate a mostly low-rank matrix with bell-shaped singular values.

Matrix Decomposition

decomposition.PCA([n_components, copy, …])

Principal component analysis (PCA).

decomposition.TruncatedSVD([n_components, …])

Dimensionality reduction using truncated SVD (aka LSA).

Metrics

Classification metrics

metrics.accuracy_score(y_true, y_pred[, …])

Accuracy classification score.
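As a rough illustration of what an accuracy score measures, the sketch below computes the fraction of predictions that match the true labels in plain Python. The helper name `accuracy` is hypothetical and this is not the Mars implementation, which operates on Mars tensors and accepts further options not shown here.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    # Booleans sum as 0/1, so this counts the matching positions.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


score = accuracy([0, 1, 2, 3], [0, 2, 1, 3])  # two of four labels match
```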

metrics.auc(x, y[, session, run_kwargs])

Compute the area under the curve (AUC) using the trapezoidal rule.
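The trapezoidal rule mentioned above can be sketched in a few lines of plain Python. The function name `trapezoidal_auc` is hypothetical and the code only illustrates the arithmetic; it is not how Mars evaluates `metrics.auc`, and it assumes the points are already ordered along x.

```python
def trapezoidal_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length and at least two points")
    area = 0.0
    for i in range(1, len(x)):
        # Each segment contributes width * average height.
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# Illustrative ROC-style points (false positive rate, true positive rate).
fpr = [0.0, 0.0, 0.5, 1.0]
tpr = [0.0, 0.5, 1.0, 1.0]
area = trapezoidal_auc(fpr, tpr)
```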

metrics.roc_curve(y_true, y_score[, …])

Compute the receiver operating characteristic (ROC) curve.

Pairwise metrics

metrics.pairwise.cosine_similarity(X[, Y, …])

Compute cosine similarity between samples in X and Y.

metrics.pairwise.cosine_distances(X[, Y])

Compute cosine distance between samples in X and Y.
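The two cosine entries above are closely related. The following plain-Python sketch shows the usual definitions for a single pair of vectors, taking cosine distance as one minus cosine similarity (the scikit-learn convention, assumed here to carry over to Mars). The helper names are hypothetical, and raising on zero vectors is a choice made for this sketch only.

```python
import math


def cosine_similarity_pair(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Sketch-only behaviour: the angle is undefined for a zero vector.
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def cosine_distance_pair(u, v):
    """Cosine distance, assumed to be 1 - cosine similarity."""
    return 1.0 - cosine_similarity_pair(u, v)


orthogonal = cosine_similarity_pair([1.0, 0.0], [0.0, 1.0])
parallel = cosine_distance_pair([1.0, 2.0], [2.0, 4.0])
```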

metrics.pairwise.euclidean_distances(X[, Y, …])

Considering the rows of X (and of Y, which defaults to X) as vectors, compute the distance matrix between each pair of vectors.

metrics.pairwise.haversine_distances(X[, Y])

Compute the Haversine distance between samples in X and Y.
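For readers unfamiliar with the Haversine formula, here is a minimal plain-Python sketch for one pair of points. It assumes each point is given as (latitude, longitude) in radians and returns the angular distance on a unit sphere, which is the scikit-learn convention; whether Mars follows it exactly is an assumption. The function name is hypothetical.

```python
import math


def haversine_pair(p, q):
    """Great-circle angular distance between two (lat, lon) points in radians."""
    lat1, lon1 = p
    lat2, lon2 = q
    a = (
        math.sin((lat2 - lat1) / 2.0) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2
    )
    # Clamp against rounding slightly above 1 before taking the arcsine.
    return 2.0 * math.asin(math.sqrt(min(1.0, a)))


# A quarter turn along the equator.
quarter = haversine_pair((0.0, 0.0), (0.0, math.pi / 2.0))
```

Multiplying the result by a sphere's radius (about 6371 km for the Earth) converts the angle into a surface distance.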

metrics.pairwise.manhattan_distances(X[, Y, …])

Compute the L1 distances between the vectors in X and Y.
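The pairwise layout shared by the functions in this group can be illustrated with the L1 case: entry (i, j) of the result is the distance between row i of X and row j of Y. The sketch below uses plain lists and a hypothetical helper name; it is not the Mars implementation, which works on tensors.

```python
def manhattan_distance_matrix(X, Y=None):
    """Matrix D with D[i][j] = sum of |X[i][k] - Y[j][k]| over k."""
    if Y is None:
        # Assumed convention: compare X with itself when Y is omitted.
        Y = X
    return [
        [sum(abs(a - b) for a, b in zip(x_row, y_row)) for y_row in Y]
        for x_row in X
    ]


X = [[1, 2], [3, 4]]
Y = [[1, 2], [0, 0]]
D = manhattan_distance_matrix(X, Y)
```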

metrics.pairwise.rbf_kernel(X[, Y, gamma])

Compute the rbf (gaussian) kernel between X and Y.
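The RBF kernel between two vectors is commonly defined as exp(-gamma * ||x - y||^2). The sketch below evaluates it for one pair in plain Python. The helper name is hypothetical, and the fallback of gamma = 1 / n_features when gamma is omitted mirrors scikit-learn and is only assumed to match Mars.

```python
import math


def rbf_kernel_pair(x, y, gamma=None):
    """Gaussian (RBF) kernel value exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if gamma is None:
        # Assumed default, following scikit-learn: 1 / number of features.
        gamma = 1.0 / len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


same = rbf_kernel_pair([1.0, 2.0], [1.0, 2.0])
apart = rbf_kernel_pair([0.0, 0.0], [1.0, 1.0])  # gamma falls back to 0.5
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart.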

metrics.pairwise_distances(X[, Y, metric])

Splitter Functions

model_selection.train_test_split(*arrays, …)

Split arrays or matrices into random train and test subsets.
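The idea behind a random train/test split can be sketched with the standard library alone: shuffle the sample indices, then cut them at the requested proportion. The function name and parameters below (`split_train_test`, `test_fraction`, `seed`) are hypothetical and do not reflect the signature of `model_selection.train_test_split`.

```python
import random


def split_train_test(items, test_fraction=0.25, seed=None):
    """Shuffle indices, then cut them into a train part and a test part."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    indices = list(range(len(items)))
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(items) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [items[i] for i in train_idx], [items[i] for i in test_idx]


samples = list(range(10))
train, test = split_train_test(samples, test_fraction=0.3, seed=0)
```

Every sample lands in exactly one of the two subsets, so the pieces can be recombined into the original data.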

Nearest Neighbors

neighbors.NearestNeighbors([n_neighbors, …])

Preprocessing and Normalization

preprocessing.normalize(X[, norm, axis, …])

Scale input vectors individually to unit norm (vector length).
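Scaling each vector to unit norm means dividing it by its own length. The plain-Python sketch below does this row by row for the L2 norm. The helper name is hypothetical, leaving all-zero rows unchanged is a choice made for this sketch, and the other norms and the axis option of `preprocessing.normalize` are not covered.

```python
import math


def normalize_rows_l2(rows):
    """Return a copy of rows where each row is divided by its Euclidean length."""
    normalized = []
    for row in rows:
        norm = math.sqrt(sum(value * value for value in row))
        if norm > 0.0:
            normalized.append([value / norm for value in row])
        else:
            # Sketch-only behaviour: a zero row has no direction, keep it as is.
            normalized.append(list(row))
    return normalized


unit = normalize_rows_l2([[3.0, 4.0], [0.0, 0.0]])
```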

Semi-Supervised Learning

semi_supervised.LabelPropagation([kernel, …])

Label Propagation classifier.

Utilities

utils.assert_all_finite(X[, allow_nan, …])

utils.check_X_y(X, y[, accept_sparse, …])

Input validation for standard estimators.

utils.check_array(array[, accept_sparse, …])

Input validation on a tensor, list, sparse matrix or similar.

utils.check_consistent_length(*arrays[, …])

Check that all arrays have consistent first dimensions.

utils.multiclass.type_of_target(y)

Determine the type of data indicated by the target.

utils.multiclass.is_multilabel(y)

Check if y is in a multilabel format.

utils.shuffle(*arrays, **options)

utils.validation.check_is_fitted(estimator)

Perform is_fitted validation for estimator.

utils.validation.column_or_1d(y[, warn])

Ravel a column or 1-D array; raise an error otherwise.

TensorFlow Integration

contrib.tensorflow.run_tensorflow_script(…)

Run TensorFlow script in Mars cluster.

XGBoost Integration

contrib.xgboost.MarsDMatrix(data[, label, …])

contrib.xgboost.train(params, dtrain[, evals])

Train XGBoost model in Mars manner.

contrib.xgboost.predict(model, data[, …])

contrib.xgboost.XGBClassifier

contrib.xgboost.XGBRegressor