mars.dataframe.melt#

mars.dataframe.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)[source]#

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’. .. versionadded:: 0.20.0

Parameters
  • id_vars (tuple, list, or ndarray, optional) – Column(s) to use as identifier variables.

  • value_vars (tuple, list, or ndarray, optional) – Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.

  • var_name (scalar) – Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.

  • value_name (scalar, default 'value') – Name to use for the ‘value’ column.

  • col_level (int or str, optional) – If columns are a MultiIndex then use this level to melt.

Returns

Unpivoted DataFrame.

Return type

DataFrame

See also

melt, pivot_table, DataFrame.pivot, Series.explode

Examples

>>> import mars.dataframe as md
>>> df = md.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
...                    'B': {0: 1, 1: 3, 2: 5},
...                    'C': {0: 2, 1: 4, 2: 6}})
>>> df.execute()
   A  B  C
0  a  1  2
1  b  3  4
2  c  5  6
>>> df.melt(id_vars=['A'], value_vars=['B']).execute()
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
>>> df.melt(id_vars=['A'], value_vars=['B', 'C']).execute()
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6

The names of ‘variable’ and ‘value’ columns can be customized:

>>> df.melt(id_vars=['A'], value_vars=['B'],
...         var_name='myVarname', value_name='myValname').execute()
   A myVarname  myValname
0  a         B          1
1  b         B          3
2  c         B          5

If you have multi-index columns:

>>> df = md.DataFrame({('A', 'D'): {0: 'a', 1: 'b', 2: 'c'},
...                    ('B', 'E'): {0: 1, 1: 3, 2: 5},
...                    ('C', 'F'): {0: 2, 1: 4, 2: 6}})
>>> df.execute()
   A  B  C
   D  E  F
0  a  1  2
1  b  3  4
2  c  5  6
>>> df.melt(col_level=0, id_vars=['A'], value_vars=['B']).execute()
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
>>> df.melt(id_vars=[('A', 'D')], value_vars=[('B', 'E')]).execute()
  (A, D) variable_0 variable_1  value
0      a          B          E      1
1      b          B          E      3
2      c          B          E      5