DataFrame.select_dtypes
Return a subset of the DataFrame’s columns based on the column dtypes.
include (scalar or list-like) – A selection of dtypes or strings to be included. At least one of include and exclude must be supplied.
exclude (scalar or list-like) – A selection of dtypes or strings to be excluded. At least one of include and exclude must be supplied.
The subset of the frame including the dtypes in include and excluding the dtypes in exclude.
DataFrame
ValueError –
* If both of include and exclude are empty
* If include and exclude have overlapping elements
* If any kind of string dtype is passed in.
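The first two error conditions above can be checked with a short sketch. It uses pandas as a stand-in, on the assumption that Mars mirrors pandas behaviour for this method. With Mars, the error may only surface once `.execute()` is called. The helper `raises_value_error` is a hypothetical name introduced for illustration.

```python
import pandas as pd

# pandas stands in for mars.dataframe here (an assumption, not confirmed by the source).
df = pd.DataFrame({'a': [1, 2], 'b': [True, False], 'c': [1.0, 2.0]})


def raises_value_error(**kwargs):
    """Return True if select_dtypes(**kwargs) raises ValueError (hypothetical helper)."""
    try:
        df.select_dtypes(**kwargs)
    except ValueError:
        return True
    return False


# Neither include nor exclude is given.
empty_fails = raises_value_error()

# The same dtype appears in both include and exclude.
overlap_fails = raises_value_error(include=['float64'], exclude=['float64'])

# A well-formed call does not raise.
valid_ok = not raises_value_error(include=['float64'])
```

Supplying only one of the two parameters, as in the last call, is enough to satisfy the "at least one" requirement.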
See also
DataFrame.dtypes
Return Series with the data type of each column.
Notes
To select all numeric types, use np.number or 'number'
To select strings you must use the object dtype, but note that this will return all object dtype columns
See the numpy dtype hierarchy
To select datetimes, use np.datetime64, 'datetime' or 'datetime64'
To select timedeltas, use np.timedelta64, 'timedelta' or 'timedelta64'
To select Pandas categorical dtypes, use 'category'
To select Pandas datetimetz dtypes, use 'datetimetz' (new in 0.20.0) or 'datetime64[ns, tz]'
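A few of the selectors listed in these notes can be tried out as follows. This is a minimal sketch that uses pandas as a stand-in, on the assumption that Mars resolves the same dtype aliases. The column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# pandas stands in for mars.dataframe here (an assumption, not confirmed by the source).
# Column names are illustrative only.
df = pd.DataFrame({
    'i': [1, 2],
    'f': [1.5, 2.5],
    'flag': [True, False],
    'when': pd.to_datetime(['2020-01-01', '2020-01-02']),
    'cat': pd.Categorical(['x', 'y']),
})

# 'number' and np.number are interchangeable ways to ask for numeric columns.
numeric_cols = list(df.select_dtypes(include='number').columns)
numeric_cols_np = list(df.select_dtypes(include=np.number).columns)

# 'datetime' selects timezone-naive datetime columns.
datetime_cols = list(df.select_dtypes(include='datetime').columns)

# 'category' selects pandas categorical columns.
category_cols = list(df.select_dtypes(include='category').columns)
```

In this sketch the boolean column `flag` is not part of the numeric selection. To select it, use `'bool'`.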
Examples
>>> import mars.dataframe as md
>>> df = md.DataFrame({'a': [1, 2] * 3,
...                    'b': [True, False] * 3,
...                    'c': [1.0, 2.0] * 3})
>>> df.execute()
   a      b    c
0  1   True  1.0
1  2  False  2.0
2  1   True  1.0
3  2  False  2.0
4  1   True  1.0
5  2  False  2.0
>>> df.select_dtypes(include='bool').execute()
       b
0   True
1  False
2   True
3  False
4   True
5  False
>>> df.select_dtypes(include=['float64']).execute()
     c
0  1.0
1  2.0
2  1.0
3  2.0
4  1.0
5  2.0
>>> df.select_dtypes(exclude=['int64']).execute()
       b    c
0   True  1.0
1  False  2.0
2   True  1.0
3  False  2.0
4   True  1.0
5  False  2.0
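The examples above pass either include or exclude. The two can also be combined, provided they do not overlap. The sketch below shows this with pandas as a stand-in, on the assumption that Mars behaves the same way. With Mars, the frame would be built with `md.DataFrame` and the result fetched with `.execute()`.

```python
import pandas as pd

# pandas stands in for mars.dataframe here (an assumption, not confirmed by the source).
df = pd.DataFrame({'a': [1, 2] * 3,
                   'b': [True, False] * 3,
                   'c': [1.0, 2.0] * 3})

# Keep numeric columns, then drop the float64 ones from that selection.
non_float_numeric = df.select_dtypes(include='number', exclude='float64')
selected = list(non_float_numeric.columns)
```

Only the integer column `a` remains: `b` is boolean and so not numeric, and `c` is removed by the exclude filter.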