mars.dataframe.DataFrame.quantile
- DataFrame.quantile(q=0.5, axis=0, numeric_only=True, interpolation='linear')
Return values at the given quantile over the requested axis.
- Parameters
q (float or array-like, default 0.5 (50% quantile)) – Value between 0 <= q <= 1, the quantile(s) to compute.
axis ({0, 1, 'index', 'columns'} (default 0)) – 0 or 'index' computes the quantile down the rows (one result per column); 1 or 'columns' computes it across the columns (one result per row).
numeric_only (bool, default True) – If False, the quantile of datetime and timedelta data will be computed as well.
interpolation ({'linear', 'lower', 'higher', 'midpoint', 'nearest'}) –
This optional parameter specifies the interpolation method to use when the desired quantile lies between two data points i and j:
linear: i + (j - i) * fraction, where fraction is the fractional part of the index surrounded by i and j.
lower: i.
higher: j.
nearest: i or j whichever is nearest.
midpoint: (i + j) / 2.
- Returns
- If q is an array or a tensor, a DataFrame will be returned where the index is q, the columns are the columns of self, and the values are the quantiles.
- If q is a float, a Series will be returned where the index is the columns of self and the values are the quantiles.
- Return type
Series or DataFrame
See also
core.window.Rolling.quantile
Rolling quantile.
numpy.percentile
Numpy function to compute the percentile.
Examples
>>> import numpy as np
>>> import mars.dataframe as md
>>> df = md.DataFrame(np.array([[1, 1], [2, 10], [3, 100], [4, 100]]),
...                   columns=['a', 'b'])
>>> df.quantile(.1).execute()
a    1.3
b    3.7
Name: 0.1, dtype: float64
>>> df.quantile([.1, .5]).execute()
       a     b
0.1  1.3   3.7
0.5  2.5  55.0
Specifying numeric_only=False will also compute the quantile of datetime and timedelta data.
>>> df = md.DataFrame({'A': [1, 2],
...                    'B': [md.Timestamp('2010'),
...                          md.Timestamp('2011')],
...                    'C': [md.Timedelta('1 days'),
...                          md.Timedelta('2 days')]})
>>> df.quantile(0.5, numeric_only=False).execute()
A                    1.5
B    2010-07-02 12:00:00
C        1 days 12:00:00
Name: 0.5, dtype: object
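The interpolation options listed under Parameters can be made concrete with a small pure-Python sketch. This is not Mars's implementation: the helper name quantile_sketch is hypothetical, it handles a single 1-D sequence only, and the tie-breaking rule for 'nearest' (choosing i when the quantile falls exactly halfway) is an assumption made for illustration.

```python
import math


def quantile_sketch(values, q, interpolation="linear"):
    """Illustrative quantile of a 1-D sequence (hypothetical helper, not Mars code)."""
    if not 0 <= q <= 1:
        raise ValueError("q must satisfy 0 <= q <= 1")
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")

    # Fractional position of the desired quantile in the sorted data.
    pos = q * (len(data) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    i, j = data[lo], data[hi]   # the two surrounding data points
    fraction = pos - lo         # fractional part of the index

    if interpolation == "linear":
        return i + (j - i) * fraction
    if interpolation == "lower":
        return i
    if interpolation == "higher":
        return j
    if interpolation == "midpoint":
        return (i + j) / 2
    if interpolation == "nearest":
        # Assumption: an exact tie (fraction == 0.5) resolves to the lower point.
        return i if fraction <= 0.5 else j
    raise ValueError(f"unknown interpolation: {interpolation!r}")


column_b = [1, 10, 100, 100]  # column 'b' from the first example
for method in ("linear", "lower", "higher", "midpoint", "nearest"):
    print(method, quantile_sketch(column_b, 0.1, method))
```

For column 'b' and q=0.1, the fractional position is 0.3, so the quantile lies between i=1 and j=10. The 'linear' result is approximately 3.7, matching the first example, while 'lower' and 'higher' return the two neighbouring data points unchanged.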