Mars Learn#
This is the class and function reference for Mars learn.
Clustering#
Classes#
|
K-Means clustering. |
Functions#
|
K-means clustering algorithm. |
Datasets#
Samples generator#
|
Generate isotropic Gaussian blobs for clustering. |
|
Generate a random n-class classification problem. |
|
Generate a mostly low rank matrix with bell-shaped singular values |
|
Generate a random regression problem. |
Matrix Decomposition#
|
Principal component analysis (PCA) |
|
Dimensionality reduction using truncated SVD (aka LSA). |
Ensemble Methods#
|
A Bagging classifier. |
|
A Bagging regressor. |
|
Blockwise training and ensemble voting classifier. |
|
Blockwise training and ensemble voting regressor. |
|
Isolation Forest Algorithm. |
Linear Models#
Classical linear regressors#
|
Ordinary least squares Linear Regression. |
Metrics#
Classification metrics#
|
Accuracy classification score. |
|
Compute Area Under the Curve (AUC) using the trapezoidal rule |
|
Compute the F1 score, also known as balanced F-score or F-measure |
|
Compute the F-beta score |
|
Log loss, aka logistic loss or cross-entropy loss. |
|
Compute a confusion matrix for each class or sample. |
|
Compute the precision |
|
Compute precision, recall, F-measure and support for each class |
|
Compute the recall |
|
Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores. |
|
Compute Receiver operating characteristic (ROC) |
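The precision, recall, F-beta, and trapezoidal AUC entries above are defined by short formulas. The following is a minimal plain-Python sketch of those definitions for the binary case. It is not the Mars implementation and does not use the Mars API; the function names `precision_recall`, `fbeta`, and `trapezoid_auc` are hypothetical helpers introduced only for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for one positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Return 0.0 when a denominator is empty (a convention assumed here).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def fbeta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall; beta=1 gives F1."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through (x[i], y[i]).

    Assumes x is sorted in ascending order.
    """
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
               for i in range(len(x) - 1))


y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
p, r = precision_recall(y_true, y_pred)   # 3 TP, 1 FP, 1 FN
f1 = fbeta(p, r)                          # beta = 1
area = trapezoid_auc([0.0, 0.5, 1.0], [0.0, 0.75, 1.0])
```

With `beta > 1` the score weights recall more heavily, and with `beta < 1` it weights precision more heavily.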
Regression metrics#
|
\(R^2\) (coefficient of determination) regression score function. |
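As a reminder of what the coefficient of determination measures, here is a minimal plain-Python sketch of the usual definition, \(R^2 = 1 - SS_{res} / SS_{tot}\). It is an illustration only, not the Mars implementation; the helper name `r2` is hypothetical, and the sketch assumes `y_true` is not constant (otherwise the denominator is zero).

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (hypothetical helper)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = r2(y_true, y_pred)
perfect = r2(y_true, y_true)
```

A perfect prediction scores 1.0, always predicting the mean of `y_true` scores 0.0, and worse models can score below zero.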
Pairwise metrics#
|
Compute cosine similarity between samples in X and Y. |
|
Compute cosine distance between samples in X and Y. |
|
Considering the rows of X (and Y=X) as vectors, compute the distance matrix between each pair of vectors. |
|
Compute the Haversine distance between samples in X and Y |
|
Compute the L1 distances between the vectors in X and Y. |
|
Compute the rbf (gaussian) kernel between X and Y. |
|
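The pairwise metrics listed above each take two collections of row vectors and return a matrix with one entry per (row of X, row of Y) pair. The sketch below illustrates that shape and three of the underlying formulas in plain Python. It is not the Mars implementation; the names `cosine_sim`, `l1_distance`, `rbf`, and `pairwise` are hypothetical, and `gamma` is passed explicitly rather than assuming any default.

```python
import math


def cosine_sim(x, y):
    """Cosine of the angle between x and y; cosine distance is 1 minus this."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)  # assumes neither vector is all zeros


def l1_distance(x, y):
    """L1 (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def rbf(x, y, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def pairwise(metric, X, Y):
    """Apply metric to every (row of X, row of Y) pair; result is len(X) by len(Y)."""
    return [[metric(x, y) for y in Y] for x in X]


X = [[1.0, 0.0], [1.0, 1.0]]
Y = [[0.0, 2.0], [3.0, 0.0]]
sim = pairwise(cosine_sim, X, Y)
dist = pairwise(l1_distance, X, Y)
kern = pairwise(lambda x, y: rbf(x, y, gamma=0.5), X, X)
```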
Model Selection#
Splitter Classes#
|
K-Folds cross-validator |
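To make the K-fold idea concrete, the sketch below generates train/test index lists for consecutive, unshuffled folds. It is a plain-Python illustration, not the Mars splitter; `kfold_indices` is a hypothetical helper, and the convention that the first `n_samples % n_splits` folds receive one extra sample is an assumption borrowed from scikit-learn's unshuffled `KFold`.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for consecutive, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds take one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(kfold_indices(5, 2))
```

Each sample appears in exactly one test fold, so every sample is used for validation once across the `n_splits` rounds.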
Splitter Functions#
|
Split arrays or matrices into random train and test subsets |
Nearest Neighbors#
|
Preprocessing and Normalization#
|
Binarize labels in a one-vs-all fashion. |
|
Encode target labels with value between 0 and n_classes-1. |
|
Transform features by scaling each feature to a given range. |
|
Transform features by scaling each feature to a given range. |
|
Binarize labels in a one-vs-all fashion. |
|
Scale input vectors individually to unit norm (vector length). |
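Two of the transformations above differ in direction: range scaling works per feature (column), while unit-norm scaling works per sample (row). The plain-Python sketch below illustrates both. It is not the Mars implementation; `minmax_scale_column` and `l2_normalize_row` are hypothetical helpers, and the handling of constant columns and zero vectors is an assumption made for this example.

```python
import math


def minmax_scale_column(column, feature_range=(0.0, 1.0)):
    """Linearly map one feature column onto feature_range."""
    lo, hi = feature_range
    cmin, cmax = min(column), max(column)
    span = cmax - cmin
    if span == 0:
        # Constant column: map every value to the lower bound (assumed convention).
        return [lo for _ in column]
    return [lo + (v - cmin) / span * (hi - lo) for v in column]


def l2_normalize_row(row):
    """Scale one sample vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in row))
    # Leave an all-zero vector unchanged (assumed convention).
    return [v / norm for v in row] if norm else list(row)


scaled = minmax_scale_column([2.0, 4.0, 10.0])
unit = l2_normalize_row([3.0, 4.0])
```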
Semi-Supervised Learning#
|
Label Propagation classifier |
Utilities#
|
|
|
Input validation for standard estimators. |
|
Input validation on a tensor, list, sparse matrix or similar. |
|
Check that all arrays have consistent first dimensions. |
|
Determine the type of data indicated by the target. |
|
Check if |
|
|
|
|
Perform is_fitted validation for estimator. |
|
Ravel column or 1d numpy array, else raises an error |
Misc#
|
Meta-estimator for parallel predict and transform. |
LightGBM Integration#
|
|
|
|
|
PyTorch Integration#
|
Run PyTorch script in Mars cluster. |
StatsModels Integration#
TensorFlow Integration#
Run TensorFlow script in Mars cluster. |
|
Convert Mars data type to tf.data.Dataset. |
XGBoost Integration#
|
|
|
Train XGBoost model in Mars manner. |
|
|
|
Implementation of the scikit-learn API for XGBoost classification. |
|
Implementation of the scikit-learn API for XGBoost regressor. |