mars.dataframe.Series.str.partition

Series.str.partition(sep=' ', expand=True)

Split the string at the first occurrence of sep.

This method splits each string at the first occurrence of sep and returns three elements: the part before the separator, the separator itself, and the part after the separator. If the separator is not found, it returns the string itself followed by two empty strings.
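
The per-element rule matches Python's built-in str.partition. As a minimal sketch using only the standard library (it does not call Mars), the found and not-found cases look like this:

```python
# Plain-Python illustration of the per-element rule; no Mars required.
found = "George Pitt-Rivers".partition("-")
missing = "Linda van der Berg".partition("-")

print(found)    # ('George Pitt', '-', 'Rivers')
print(missing)  # ('Linda van der Berg', '', '')
```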

Parameters

sep (str, default ' ') – String to split on.
expand (bool, default True) – If True, return DataFrame/MultiIndex expanding dimensionality. If False, return Series/Index.
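
The effect of expand can be pictured without Mars. The following is only a plain-Python analogy, with lists standing in for a Series and for DataFrame columns; the variable names are illustrative and not part of the Mars API:

```python
values = ["Linda van der Berg", "George Pitt-Rivers"]

# Analogue of expand=False: one 3-tuple per input string.
as_tuples = [v.partition("-") for v in values]

# Analogue of expand=True: the same data regrouped into three columns.
before, sep, after = (list(col) for col in zip(*as_tuples))

print(as_tuples)
print(before, sep, after)
```

With expand=True the three parts end up in separate columns, which is usually more convenient when only one part is needed afterwards.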

Return type

DataFrame/MultiIndex or Series/Index of objects

See also

rpartition: Split the string at the last occurrence of sep.
Series.str.split: Split strings around given separators.
str.partition: Standard library version.
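
As a rough guide to choosing between the related methods listed above, this standard-library sketch (assuming the Mars accessors follow the same per-element semantics) contrasts the three on a single string:

```python
name = "Linda van der Berg"

first = name.partition(" ")    # split once, at the first space
last = name.rpartition(" ")    # split once, at the last space
pieces = name.split(" ")       # split at every space; separators are dropped

print(first)   # ('Linda', ' ', 'van der Berg')
print(last)    # ('Linda van der', ' ', 'Berg')
print(pieces)  # ['Linda', 'van', 'der', 'Berg']
```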

Examples

>>> import mars.dataframe as md
>>> s = md.Series(['Linda van der Berg', 'George Pitt-Rivers'])
>>> s.execute()
0    Linda van der Berg
1    George Pitt-Rivers
dtype: object

>>> s.str.partition().execute()
        0  1             2
0   Linda     van der Berg
1  George      Pitt-Rivers

To partition by the last space instead of the first one:

>>> s.str.rpartition().execute()
               0  1            2
0  Linda van der            Berg
1         George     Pitt-Rivers

To partition by something other than a space:

>>> s.str.partition('-').execute()
                    0  1       2
0  Linda van der Berg
1         George Pitt  -  Rivers

To return a Series containing tuples instead of a DataFrame:

>>> s.str.partition('-', expand=False).execute()
0    (Linda van der Berg, , )
1    (George Pitt, -, Rivers)
dtype: object

Also available on indices:

>>> idx = md.Index(['X 123', 'Y 999'])
>>> idx.execute()
Index(['X 123', 'Y 999'], dtype='object')

Partitioning an Index with the default expand=True creates a MultiIndex:

>>> idx.str.partition().execute()
MultiIndex([('X', ' ', '123'),
            ('Y', ' ', '999')],
           )

Or an Index of tuples with expand=False:

>>> idx.str.partition(expand=False).execute()
Index([('X', ' ', '123'), ('Y', ' ', '999')], dtype='object')