mars.learn.ensemble.BaggingRegressor#

class mars.learn.ensemble.BaggingRegressor(base_estimator=None, n_estimators=10, *, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=None, random_state=None, verbose=0, reducers=1.0)[source]#

A Bagging regressor.

A Bagging regressor is an ensemble meta-estimator that fits base regressors, each on a random subset of the original dataset, and then aggregates their individual predictions (by averaging) to form a final prediction. Such a meta-estimator is typically used to reduce the variance of a black-box estimator (e.g., a decision tree) by introducing randomization into its construction procedure and then making an ensemble out of it.

This algorithm encompasses several works from the literature. When random subsets of the dataset are drawn as random subsets of the samples, this algorithm is known as Pasting [1]. If samples are drawn with replacement, the method is known as Bagging [2]. When random subsets of the dataset are drawn as random subsets of the features, the method is known as Random Subspaces [3]. Finally, when base estimators are built on subsets of both samples and features, the method is known as Random Patches [4].
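For regression, the aggregation step is a plain average of per-resample predictions. A minimal illustrative sketch in plain Python, using a trivial "predict the mean" base estimator (`bootstrap_mean_prediction` is a hypothetical helper, not part of the mars API):

```python
import random

def bootstrap_mean_prediction(y, n_estimators=10, seed=0):
    """Bagging aggregation with a trivial base estimator: fit each
    'estimator' on a bootstrap resample (drawn with replacement),
    let its prediction be the resample mean, then average."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_estimators):
        resample = [rng.choice(y) for _ in y]      # bootstrap: with replacement
        preds.append(sum(resample) / len(resample))  # base prediction
    return sum(preds) / len(preds)                   # average over the ensemble
```

Averaging many such high-variance predictions is what reduces the variance of the combined estimator.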

Read more in the User Guide.

Parameters
  • base_estimator (object, default=None) – The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a DecisionTreeRegressor.

  • n_estimators (int, default=10) – The number of base estimators in the ensemble.

  • max_samples (int or float, default=1.0) –

    The number of samples to draw from X to train each base estimator (with replacement by default, see bootstrap for more details).

    • If int, then draw max_samples samples.

    • If float, then draw max_samples * X.shape[0] samples.

  • max_features (int or float, default=1.0) –

    The number of features to draw from X to train each base estimator (without replacement by default, see bootstrap_features for more details).

    • If int, then draw max_features features.

    • If float, then draw max_features * X.shape[1] features.

  • bootstrap (bool, default=True) – Whether samples are drawn with replacement. If False, sampling without replacement is performed.

  • bootstrap_features (bool, default=False) – Whether features are drawn with replacement.

  • warm_start (bool, default=False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble; otherwise, just fit a whole new ensemble. See the Glossary.

  • random_state (int, RandomState instance or None, default=None) – Controls the random resampling of the original dataset (sample wise and feature wise). If the base estimator accepts a random_state attribute, a different seed is generated for each instance in the ensemble. Pass an int for reproducible output across multiple function calls. See Glossary.
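The int-vs-float rules for max_samples and max_features above amount to the following resolution (a hedged sketch; `resolve_draw` is an illustrative helper, not a mars function):

```python
def resolve_draw(param, total):
    """Resolve an int-or-float draw parameter to an item count:
    a float is a fraction of the total, an int is an exact count."""
    if isinstance(param, float):
        return int(param * total)  # e.g. max_samples * X.shape[0]
    return param                   # draw exactly this many

resolve_draw(0.5, 100)  # float: 0.5 * 100 -> 50 samples
resolve_draw(3, 4)      # int: exactly 3 features
```

With the defaults (max_samples=1.0, max_features=1.0), every base estimator sees a bootstrap sample of the same size as X and all of its features.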

base_estimator_#

The base estimator from which the ensemble is grown.

Type: estimator

estimators_#

The collection of fitted sub-estimators.

Type: list of estimators

estimators_features_#

The subset of drawn features for each base estimator.

Type: list of arrays

See also

BaggingClassifier

A Bagging classifier.

References

[1] L. Breiman, “Pasting small votes for classification in large databases and on-line”, Machine Learning, 36(1), 85-103, 1999.

[2] L. Breiman, “Bagging predictors”, Machine Learning, 24(2), 123-140, 1996.

[3] T. Ho, “The random subspace method for constructing decision forests”, Pattern Analysis and Machine Intelligence, 20(8), 832-844, 1998.

[4] G. Louppe and P. Geurts, “Ensembles on Random Patches”, Machine Learning and Knowledge Discovery in Databases, 346-361, 2012.

Examples

>>> from sklearn.svm import SVR
>>> from mars.learn.ensemble import BaggingRegressor
>>> from mars.learn.datasets import make_regression
>>> X, y = make_regression(n_samples=100, n_features=4,
...                        n_informative=2, n_targets=1,
...                        random_state=0, shuffle=False)
>>> regr = BaggingRegressor(base_estimator=SVR(),
...                         n_estimators=10, random_state=0).fit(X, y)
>>> regr.predict([[0, 0, 0, 0]])
array([-2.8720...])
__init__(base_estimator=None, n_estimators=10, *, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=None, random_state=None, verbose=0, reducers=1.0)#

Methods

__init__([base_estimator, n_estimators, ...])

fit(X[, y, sample_weight, session, run_kwargs])
    Build a Bagging ensemble of estimators from the training set (X, y).

predict(X[, session, run_kwargs])
    Predict regression target for X.

score(X, y[, sample_weight])
    Return the coefficient of determination of the prediction.
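The coefficient of determination returned by score is R² = 1 − SS_res / SS_tot. A self-contained sketch of that computation in plain Python (`r2_score` here is an illustrative helper, not the mars implementation):

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

r2_score([3, -0.5, 2, 7], [2.5, 0.0, 2, 8])  # ≈ 0.9486
```

A perfect prediction yields 1.0; a constant model that always predicts the mean of y yields 0.0, and worse models can score negative.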