Mars Learn#

Mars learn mimics scikit-learn API and leverages the ability of Mars tensor and DataFrame to process large data and execute in parallel.

Let’s take mars.learn.neighbors.NearestNeighbors as an example.

>>> import mars.tensor as mt
>>> from mars.learn.neighbors import NearestNeighbors
>>> data = mt.random.rand(100, 3)
>>> nn = NearestNeighbors(n_neighbors=3)
>>> nn.fit(data)
NearestNeighbors(algorithm='auto', leaf_size=30, metric='minkowski',
                 metric_params=None, n_neighbors=3, p=2, radius=1.0)
>>> neighbors = nn.kneighbors(df)
>>> neighbors
(array([[0.0560703 , 0.1836808 , 0.19055679],
        [0.07100642, 0.08550266, 0.10617568],
        [0.13348483, 0.16597596, 0.20287617]]),
 array([[91, 10, 29],
        [68, 77, 29],
        [63, 82, 21]]))

Remember that functions like fit, predict will trigger execution instantly. In the above example, fit and kneighbors will trigger execution internally.

For implemented learn API, refer to learn API reference.

Mars learn can integrate with XGBoost, LightGBM, TensorFlow and PyTorch.