.. _integrate_joblib: ********************* Integrate with joblib ********************* `Joblib `_ is a library integrated with `scikit-learn `_ to make machine learning jobs parallel. We create a backend for joblib with :doc:`Mars remote ` and users can make their scikit-learn tasks parallel with Mars. To enable the backend, you need to register it with the code below. .. code-block:: python from mars.learn.contrib.joblib import register_mars_backend register_mars_backend() After that, it is possible to create a Mars parallel backend with Mars service endpoint or existing Mars session. When nothing specified, default or local session will be used. .. code-block:: python import joblib # create with Mars endpoint with joblib.parallel_backend('mars', service='http://:'): # scikit-learn code # create with existing Mars session sess = new_session('http://:') with joblib.parallel_backend('mars', session=sess): # scikit-learn code A simple example is shown below, where we fit a SVM classifier with randomized search. All you need is to replace the service endpoint in ``joblib.parallel_backend`` with your own service endpoint. .. code-block:: python import joblib import sklearn from sklearn.datasets import load_digits from sklearn.model_selection import RandomizedSearchCV from sklearn.svm import SVC from mars.learn.contrib.joblib import register_mars_backend register_mars_backend() digits = load_digits() param_space = { 'C': np.logspace(-6, 6, 30), 'gamma': np.logspace(-8, 8, 30), 'tol': np.logspace(-4, -1, 30), 'class_weight': [None, 'balanced'], } model = SVC(kernel='rbf') search = RandomizedSearchCV(model, param_space, cv=5, n_iter=10, verbose=10) with joblib.parallel_backend('mars', service='http://:'): search.fit(digits.data, digits.target) Note that joblib can only be used with data small enough to be held inside a single machine. For huge datasets, please use learning algorithms implemented with Mars objects.