mars.dataframe.DataFrame.transform
- DataFrame.transform(func, axis=0, *args, dtypes=None, **kwargs)
Call func on self, producing a DataFrame with transformed values. The produced DataFrame has the same axis length as self.
- Parameters
func (function, str, list or dict) –
Function to use for transforming the data. If a function, it must work either when passed a DataFrame or when passed to DataFrame.apply.
Accepted combinations are:
function
string function name
list of functions and/or function names, e.g. [np.exp, 'sqrt']
dict of axis labels -> functions, function names or list of such.
axis ({0 or 'index', 1 or 'columns'}, default 0) – If 0 or ‘index’: apply function to each column. If 1 or ‘columns’: apply function to each row.
dtypes (Series, default None) – Specify dtypes of returned DataFrames. See Notes for more details.
*args – Positional arguments to pass to func.
**kwargs – Keyword arguments to pass to func.
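The parameter forms listed above can be sketched as follows. This is a minimal illustration that assumes Mars mirrors the pandas `transform` semantics; pandas stands in for `mars.dataframe` so the snippet runs without a Mars installation, and the data and the `scale` helper are hypothetical. With Mars, the same calls would be made on an `md.DataFrame` and followed by `.execute()`.

```python
import numpy as np
import pandas as pd  # stand-in for mars.dataframe (assumption, see lead-in)

df = pd.DataFrame({"A": [1.0, 4.0, 9.0], "B": [-1.0, -2.0, -3.0]})

# 1. A single callable, applied to each column (axis=0 is the default).
doubled = df.transform(lambda col: col * 2)

# 2. A string naming a method, here DataFrame.abs.
absolute = df.transform("abs")

# 3. A list of functions and/or names: one output column per (label, function).
both = df.transform([np.exp, "abs"])

# 4. A dict mapping column labels to functions or function names.
per_column = df.transform({"A": np.sqrt, "B": "abs"})

# axis=1 hands each row to the function instead of each column.
row_centered = df.transform(lambda row: row - row.mean(), axis=1)


# Extra positional and keyword arguments are forwarded to func.
def scale(col, factor, offset=0.0):
    """Hypothetical helper: multiply a column by factor, then add offset."""
    return col * factor + offset


scaled = df.transform(scale, 0, 10, offset=1.0)  # axis=0, factor=10
```

In every case the result keeps one row per input row; only the list form changes the column layout, producing a two-level column index.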
- Returns
A DataFrame that must have the same length as self.
- Return type
DataFrame
- Raises
ValueError – If the returned DataFrame has a different length than self.
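The length requirement can be illustrated with a reduction, which collapses each column to a single value and therefore does not qualify as a transform. The sketch below again uses pandas as a stand-in for `mars.dataframe`, on the assumption that Mars raises the same exception type, as the entry above states.

```python
import pandas as pd  # stand-in for mars.dataframe (assumption, see lead-in)

df = pd.DataFrame({"A": range(3), "B": range(1, 4)})

# An element-wise function keeps one output row per input row.
shifted = df.transform(lambda col: col + 1)

# A reduction returns one scalar per column, so the result no longer
# lines up with the input rows and the call is rejected.
try:
    df.transform(lambda col: col.sum())
    raised = False
except ValueError:
    raised = True
```

For reductions, DataFrame.agg is the intended entry point.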
See also
DataFrame.agg
Only perform aggregating type operations.
DataFrame.apply
Invoke function on a DataFrame.
Notes
When deciding the output dtypes and shape of the return value, Mars will try applying func onto a mock DataFrame, and the apply call may fail. When this happens, you need to specify a list or a pandas Series as dtypes of the output DataFrame.
Examples
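One way to prepare the dtypes hint described in the Notes is to run func once on a small pandas sample and reuse the dtypes of the result. This is only a sketch: the `to_ratio` function and the sample data are hypothetical, and it assumes the sample has the same columns and input dtypes as the real data. The Mars call itself is shown as a comment, since it needs a Mars installation.

```python
import pandas as pd


def to_ratio(col):
    """Hypothetical transform: integer input, floating-point output."""
    return col / col.max()


# Run func once on a tiny pandas sample shaped like the real data.
sample = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
out_dtypes = sample.transform(to_ratio).dtypes  # pandas Series of dtypes

# With Mars installed, the hint would go through the dtypes parameter
# documented above (not executed here):
#   import mars.dataframe as md
#   mdf = md.DataFrame({"A": range(3), "B": range(1, 4)})
#   mdf.transform(to_ratio, dtypes=out_dtypes).execute()
```

Building the hint from a sample keeps the column labels and dtypes in one object, which matches the "pandas Series" form mentioned in the Notes.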
>>> import mars.tensor as mt
>>> import mars.dataframe as md
>>> df = md.DataFrame({'A': range(3), 'B': range(1, 4)})
>>> df.execute()
   A  B
0  0  1
1  1  2
2  2  3
>>> df.transform(lambda x: x + 1).execute()
   A  B
0  1  2
1  2  3
2  3  4
Even though the resulting DataFrame must have the same length as the input DataFrame, it is possible to provide several input functions:
>>> s = md.Series(range(3))
>>> s.execute()
0    0
1    1
2    2
dtype: int64
>>> s.transform([mt.sqrt, mt.exp]).execute()
       sqrt       exp
0  0.000000  1.000000
1  1.000000  2.718282
2  1.414214  7.389056