mars.dataframe.groupby.GroupBy.apply#
- GroupBy.apply(func, *args, output_type=None, dtypes=None, dtype=None, name=None, index=None, **kwargs)#
Apply function func group-wise and combine the results together.
The function passed to apply must take a dataframe as its first argument and return a DataFrame, Series or scalar. apply will then take care of combining the results back together into a single dataframe or series. apply is therefore a highly flexible grouping method.
While apply is a very flexible method, its downside is that using it can be quite a bit slower than using more specific methods like agg or transform. Pandas offers a wide range of method that will be much faster than using apply for their specific purposes, so try to use them before reaching for apply.
- 参数
func (callable) – A callable that takes a dataframe as its first argument, and returns a dataframe, a series or a scalar. In addition the callable may take positional and keyword arguments.
output_type ({'dataframe', 'series'}, default None) – Specify type of returned object. See Notes for more details.
dtypes (Series, default None) – Specify dtypes of returned DataFrames. See Notes for more details.
dtype (numpy.dtype, default None) – Specify dtype of returned Series. See Notes for more details.
name (str, default None) – Specify name of returned Series. See Notes for more details.
index (Index, default None) – Specify index of returned object. See Notes for more details.
args (tuple and dict) – Optional positional and keyword arguments to pass to func.
kwargs (tuple and dict) – Optional positional and keyword arguments to pass to func.
- 返回
applied
- 返回类型
参见
pipe
Apply function to the full GroupBy object instead of to each group.
aggregate
Apply aggregate function to the GroupBy object.
transform
Apply function column-by-column to the GroupBy object.
Series.apply
Apply a function to a Series.
DataFrame.apply
Apply a function to each row or column of a DataFrame.
提示
When deciding output dtypes and shape of the return value, Mars will try applying
func
onto a mock grouped object, and the apply call may fail. When this happens, you need to specify the type of apply call (DataFrame or Series) in output_type.For DataFrame output, you need to specify a list or a pandas Series as
dtypes
of output DataFrame.index
of output can also be specified.For Series output, you need to specify
dtype
andname
of output Series.