You can create mars tensor from Python array like object just like Numpy, or create from Numpy array directly. More details on array creation routine and random sampling.
mars.tensor.tensor
mars.tensor.array
Create a tensor.
Mars tensor can run on GPU, for tensor creation, just add a gpu parameter, and set it to True.
gpu
True
import mars.tensor as mt a = mt.random.rand(1000, 2000, gpu=True) # allocate the tensor on GPU
Mars tensor can be sparse, unfortunately, only 2-D sparse tensors are supported for now, multi-dimensional tensor will be supported later soon.
import mars.tensor as mt a = mt.eye(1000, sparse=True) # create a sparse 2-D tensor with ones on the diagonal and zeros elsewhere
In mars tensor, we tile a tensor into small chunks. Argument chunk_size is not always required, a chunk’s bytes occupation will be 128M for the default setting. However, user can specify each chunk’s size in a more flexible way which may be adaptive to the data scale. The fact is that chunk’s size may effect heavily on the performance of execution.
chunk_size
The options or arguments which will effect the chunk’s size are listed below:
Change options.tensor.chunk_size_limit which is 128*1024*1024(128M) by default.
options.tensor.chunk_size_limit
Specify chunk_size as integer, like 5000, means chunk’s size is 5000 at most for all dimensions
5000
Specify chunk_size as tuple, like (5000, 3000)
(5000, 3000)
Explicitly define sizes of all chunks along all dimensions, like ((5000, 5000, 2000), (2000, 1000))
((5000, 5000, 2000), (2000, 1000))
Assume we have such a tensor with the data shown below.
0 9 6 7 6 6 5 7 5 6 9 0 1 6 7 8 6 1 8 0 9 9 9 3 5 4 3 5 8 2 6 2 2 6 9 3 4 2 4 6 2 0 6 8 2 6 5 4
We will show how different chunk_size arguments will tile the tensor.
chunk_size=3:
chunk_size=3
chunk_size=2:
chunk_size=2
chunk_size=(3, 2):
chunk_size=(3, 2)
chunk_size=((3, 1, 2, 2), (3, 2, 1)):
chunk_size=((3, 1, 2, 2), (3, 2, 1))