Mars Learn¶
This is the class and function reference for Mars learn.
Clustering¶
Classes¶
|
K-Means clustering. |
Functions¶
|
K-means clustering algorithm. |
Datasets¶
Sample generators¶
|
Generate isotropic Gaussian blobs for clustering. |
|
Generate a random n-class classification problem. |
|
Generate a mostly low-rank matrix with bell-shaped singular values. |
|
Generate a random regression problem. |
Matrix Decomposition¶
|
Principal component analysis (PCA) |
|
Dimensionality reduction using truncated SVD (aka LSA). |
Ensemble Methods¶
|
A Bagging classifier. |
|
A Bagging regressor. |
|
Blockwise training and ensemble voting classifier. |
|
Blockwise training and ensemble voting regressor. |
|
Isolation Forest Algorithm. |
Linear Models¶
Classical linear regressors¶
|
Ordinary least squares Linear Regression. |
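To make the ordinary least squares entry above concrete, here is a minimal plain-Python sketch restricted to a single feature. The helper name `ols_fit_1d` is hypothetical and is not part of the Mars API; the Mars estimator works on tensors with any number of features.

```python
def ols_fit_1d(x, y):
    """Fit y ~ slope * x + intercept by ordinary least squares (one feature).

    Hypothetical illustrative helper, not a Mars function.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form solution: slope = cov(x, y) / var(x)
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = ols_fit_1d([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

The closed form only applies when `x` is not constant; a multi-feature fit would normally solve the least-squares system with a linear algebra routine instead.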
Metrics¶
Classification metrics¶
|
Accuracy classification score. |
|
Compute Area Under the Curve (AUC) using the trapezoidal rule |
|
Compute the F1 score, also known as balanced F-score or F-measure |
|
Compute the F-beta score |
|
Log loss, aka logistic loss or cross-entropy loss. |
|
Compute a confusion matrix for each class or sample. |
|
Compute the precision |
Compute precision, recall, F-measure and support for each class |
|
|
Compute the recall |
|
Compute Receiver operating characteristic (ROC) |
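As a rough illustration of two of the metrics listed above, the following plain-Python sketch computes the F-beta score for binary labels and the area under a curve with the trapezoidal rule. The names `fbeta` and `trapezoid_auc` are hypothetical helpers, not Mars functions; the Mars implementations operate on tensors and may treat edge cases differently.

```python
def fbeta(y_true, y_pred, beta=1.0):
    """F-beta for binary labels where 1 is the positive class (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both 0 (or undefined)
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through (x[i], y[i]); x must be ascending."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))
```

With `beta=1.0` the first function reduces to the F1 score, the harmonic mean of precision and recall.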
Regression metrics¶
|
\(R^2\) (coefficient of determination) regression score function. |
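The coefficient of determination is defined as \(R^2 = 1 - SS_{res} / SS_{tot}\). A minimal plain-Python sketch of that formula follows; `r2` is a hypothetical helper name, not a Mars function, and it assumes the true values are not all identical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (illustrative only)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives 1.0, and a model that always predicts the mean of `y_true` gives 0.0.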
Pairwise metrics¶
|
Compute cosine similarity between samples in X and Y. |
Compute cosine distance between samples in X and Y. |
|
|
Considering the rows of X (and Y=X) as vectors, compute the distance matrix between each pair of vectors. |
Compute the Haversine distance between samples in X and Y |
|
|
Compute the L1 distances between the vectors in X and Y. |
|
Compute the rbf (gaussian) kernel between X and Y. |
|
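The pairwise metrics above are defined per pair of vectors. The sketch below shows three of them for a single pair in plain Python. The function names are hypothetical and `gamma` is passed explicitly as an assumption; the Mars functions compute full matrices over the rows of X and Y.

```python
import math


def cosine_similarity_pair(u, v):
    """Cosine of the angle between u and v; cosine distance is 1 minus this value."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def manhattan_pair(u, v):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def rbf_pair(u, v, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)
```

Applying any of these to every pair of rows from X and Y yields the corresponding pairwise matrix.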
Model Selection¶
Splitter Classes¶
|
K-Folds cross-validator |
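A K-fold cross-validator partitions sample indices into consecutive folds and uses each fold once as the test set. The following sketch assumes no shuffling and gives the first `n_samples % n_splits` folds one extra sample; `kfold_indices` is a hypothetical helper, not the Mars class.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for each fold, without shuffling."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds absorb the remainder, one sample each
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


folds = list(kfold_indices(5, 2))
```

Each index appears in exactly one test fold, so every sample is used for validation once.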
Splitter Functions¶
|
Split arrays or matrices into random train and test subsets |
Nearest Neighbors¶
|
Preprocessing and Normalization¶
|
Binarize labels in a one-vs-all fashion. |
Encode target labels with value between 0 and n_classes-1. |
|
|
Transform features by scaling each feature to a given range. |
|
Transform features by scaling each feature to a given range. |
|
Binarize labels in a one-vs-all fashion. |
|
Scale input vectors individually to unit norm (vector length). |
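Two of the transformations above can be sketched in plain Python: scaling one feature column to a given range, and scaling one sample vector to unit L2 norm. Both helper names are hypothetical, and mapping a constant column to the lower bound is an assumption made here for simplicity, not documented Mars behavior.

```python
import math


def minmax_scale_column(values, feature_range=(0.0, 1.0)):
    """Linearly map a feature column onto [low, high]."""
    low, high = feature_range
    v_min, v_max = min(values), max(values)
    # Assumption: a constant column has zero span, so map it to `low`
    span = (v_max - v_min) or 1.0
    return [low + (v - v_min) / span * (high - low) for v in values]


def normalize_l2(vector):
    """Scale a single sample so that its Euclidean length is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)
```

Min-max scaling works per feature (column), while normalization works per sample (row).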
Semi-Supervised Learning¶
|
Label Propagation classifier |
Utilities¶
|
|
|
Input validation for standard estimators. |
|
Input validation on a tensor, list, sparse matrix or similar. |
|
Check that all arrays have consistent first dimensions. |
Determine the type of data indicated by the target. |
|
Check if |
|
|
|
|
Perform is_fitted validation for estimator. |
|
Ravel a column or 1-D NumPy array; otherwise raise an error.
Misc¶
|
Meta-estimator for parallel predict and transform. |
LightGBM Integration¶
|
|
|
|
|
PyTorch Integration¶
|
Run PyTorch script in Mars cluster. |
StatsModels Integration¶
TensorFlow Integration¶
Run TensorFlow script in Mars cluster. |
|
Convert a Mars data type to tf.data.Dataset. |
XGBoost Integration¶
|
|
|
Train XGBoost model in Mars manner. |
|
|
|
Implementation of the scikit-learn API for XGBoost classification. |
|
Implementation of the scikit-learn API for XGBoost regressor. |