mars.dataframe.DataFrame
- class mars.dataframe.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False, chunk_size=None, gpu=None, sparse=None, num_partitions=None)
- __init__(data=None, index=None, columns=None, dtype=None, copy=False, chunk_size=None, gpu=None, sparse=None, num_partitions=None)
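The first four constructor parameters (data, index, columns, dtype) share their names with the pandas.DataFrame constructor. The sketch below uses plain pandas to show what those arguments do, on the assumption that Mars follows the same semantics; it is not Mars code. The Mars-specific arguments (chunk_size, gpu, sparse, num_partitions) are left out because this page does not describe their behavior.

```python
import pandas as pd

# pandas stands in for Mars here (an assumption, see the note above).
frame = pd.DataFrame(
    data={"a": [1, 2, 3], "b": [4, 5, 6]},
    index=["x", "y", "z"],   # row labels
    columns=["b", "a"],      # select and order columns from the dict
    dtype="float64",         # cast every column to one dtype
)

print(frame)
print(frame.dtypes)
```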
Methods
__init__([data, index, columns, dtype, ...])
abs()
add(other[, axis, level, fill_value])Get Addition of dataframe and other, element-wise (binary operator add).
add_prefix(prefix)Prefix labels with string prefix.
add_suffix(suffix)Suffix labels with string suffix.
agg([func, axis])
aggregate([func, axis])
align(other[, join, axis, level, copy, ...])Align two objects on their axes with the specified join method.
all([axis, bool_only, skipna, level, ...])
any([axis, bool_only, skipna, level, ...])
append(other[, ignore_index, ...])
apply(func[, axis, raw, result_type, args, ...])Apply a function along an axis of the DataFrame.
assign(**kwargs)Assign new columns to a DataFrame.
astype(dtype[, copy, errors])Cast a pandas object to a specified dtype dtype.
backfill([axis, inplace, limit, downcast])Synonym for DataFrame.fillna() with method='bfill'.
bfill([axis, inplace, limit, downcast])Synonym for DataFrame.fillna() with method='bfill'.
cartesian_chunk(right, func[, skip_infer, args])
copy()
copy_from(obj)
copy_to(target)
corr([method, min_periods])Compute pairwise correlation of columns, excluding NA/null values.
corrwith(other[, axis, drop, method])Compute pairwise correlation.
count([axis, level, numeric_only, combine_size])
cummax([axis, skipna])
cummin([axis, skipna])
cumprod([axis, skipna])
cumsum([axis, skipna])
describe([percentiles, include, exclude])
diff([periods, axis])First discrete difference of element.
div(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator truediv).
dot(other)Compute the matrix multiplication between the DataFrame and other.
drop([labels, axis, index, columns, level, ...])Drop specified labels from rows or columns.
drop_duplicates([subset, keep, inplace, ...])Return DataFrame with duplicate rows removed.
dropna([axis, how, thresh, subset, inplace])Remove missing values.
duplicated([subset, keep, method])Return boolean Series denoting duplicate rows.
eq(other[, axis, level])Get Equal to of dataframe and other, element-wise (binary operator eq).
eval(expr[, inplace])Evaluate a string describing operations on DataFrame columns.
ewm([com, span, halflife, alpha, ...])Provide exponential weighted functions.
execute([session])
expanding([min_periods, center, axis])Provide expanding transformations.
explode(column[, ignore_index])Transform each element of a list-like to a row, replicating index values.
ffill([axis, inplace, limit, downcast])Synonym for DataFrame.fillna() with method='ffill'.
fillna([value, method, axis, inplace, ...])Fill NA/NaN values using the specified method.
floordiv(other[, axis, level, fill_value])Get Integer division of dataframe and other, element-wise (binary operator floordiv).
from_records(records, **kw)
from_tensor(in_tensor[, index, columns])
ge(other[, axis, level])Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).
groupby([by, level, as_index, sort, group_keys])
gt(other[, axis, level])Get Greater than of dataframe and other, element-wise (binary operator gt).
head([n])Return the first n rows.
insert(loc, column, value[, allow_duplicates])Insert column into DataFrame at specified location.
isin(values)Whether each element in the DataFrame is contained in values.
isna()Detect missing values.
isnull()Detect missing values.
iterrows([batch_size, session])Iterate over DataFrame rows as (index, Series) pairs.
itertuples([index, name, batch_size, session])Iterate over DataFrame rows as namedtuples.
join(other[, on, how, lsuffix, rsuffix, ...])Join columns of another DataFrame.
keys()Get the 'info axis' (see Indexing for more).
kurt([axis, skipna, level, numeric_only, ...])
kurtosis([axis, skipna, level, ...])
le(other[, axis, level])Get Less than or equal to of dataframe and other, element-wise (binary operator le).
lt(other[, axis, level])Get Less than of dataframe and other, element-wise (binary operator lt).
map_chunk(func[, args, kwargs, skip_infer])Apply function to each chunk.
mask(cond[, other, inplace, axis, level, ...])Replace values where the condition is True.
max([axis, skipna, level, numeric_only, ...])
mean([axis, skipna, level, numeric_only, ...])
melt([id_vars, value_vars, var_name, ...])Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.
memory_usage([index, deep])Return the memory usage of each column in bytes.
merge(right[, how, on, left_on, right_on, ...])Merge DataFrame or named Series objects with a database-style join.
min([axis, skipna, level, numeric_only, ...])
mod(other[, axis, level, fill_value])Get Modulo of dataframe and other, element-wise (binary operator mod).
mul(other[, axis, level, fill_value])Get Multiplication of dataframe and other, element-wise (binary operator mul).
multiply(other[, axis, level, fill_value])Get Multiplication of dataframe and other, element-wise (binary operator mul).
ne(other[, axis, level])Get Not equal to of dataframe and other, element-wise (binary operator ne).
notna()Detect existing (non-missing) values.
notnull()Detect existing (non-missing) values.
nunique([axis, dropna, combine_size])Count distinct observations over requested axis.
pad([axis, inplace, limit, downcast])Synonym for DataFrame.fillna() with method='ffill'.
pct_change([periods, fill_method, limit, freq])Percentage change between the current and a prior element.
pop(item)Return item and drop from frame.
pow(other[, axis, level, fill_value])Get Exponential power of dataframe and other, element-wise (binary operator pow).
prod([axis, skipna, level, min_count, ...])
product([axis, skipna, level, min_count, ...])
quantile([q, axis, numeric_only, interpolation])Return values at the given quantile over requested axis.
query(expr[, inplace])Query the columns of a DataFrame with a boolean expression.
radd(other[, axis, level, fill_value])Get Addition of dataframe and other, element-wise (binary operator radd).
rdiv(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
rebalance([factor, axis, num_partitions, ...])Make data more balanced across the entire cluster.
rechunk(chunk_size[, reassign_worker])
reindex(*args, **kwargs)Conform Series/DataFrame to new index with optional filling logic.
reindex_like(other[, method, copy, limit, ...])Return an object with matching indices as other object.
rename([mapper, index, columns, axis, copy, ...])Alter axes labels.
rename_axis([mapper, index, columns, axis, ...])Set the name of the axis for the index or columns.
replace([to_replace, value, inplace, limit, ...])Replace values given in to_replace with value.
reset_index([level, drop, inplace, ...])Reset the index, or a level of it.
rfloordiv(other[, axis, level, fill_value])Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).
rmod(other[, axis, level, fill_value])Get Modulo of dataframe and other, element-wise (binary operator rmod).
rmul(other[, axis, level, fill_value])Get Multiplication of dataframe and other, element-wise (binary operator rmul).
rolling(window[, min_periods, center, ...])Provide rolling window calculations.
round([decimals])Round a DataFrame to a variable number of decimal places.
rpow(other[, axis, level, fill_value])Get Exponential power of dataframe and other, element-wise (binary operator rpow).
rsub(other[, axis, level, fill_value])Get Subtraction of dataframe and other, element-wise (binary operator rsubtract).
rtruediv(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator rtruediv).
sample([n, frac, replace, weights, ...])Return a random sample of items from an axis of object.
select_dtypes([include, exclude])Return a subset of the DataFrame's columns based on the column dtypes.
sem([axis, skipna, level, ddof, ...])
set_axis(labels[, axis, inplace])Assign desired index to given axis.
set_index(keys[, drop, append, inplace, ...])
shift([periods, freq, axis, fill_value])Shift index by desired number of periods with an optional time freq.
skew([axis, skipna, level, numeric_only, ...])
sort_index([axis, level, ascending, ...])Sort object by labels (along an axis).
sort_values(by[, axis, ascending, inplace, ...])Sort by the values along either axis.
stack([level, dropna])Stack the prescribed level(s) from columns to index.
std([axis, skipna, level, ddof, ...])
sub(other[, axis, level, fill_value])Get Subtraction of dataframe and other, element-wise (binary operator subtract).
sum([axis, skipna, level, min_count, ...])
tail([n])Return the last n rows.
tiles()
to_cpu()
to_csv(path[, sep, na_rep, float_format, ...])Write object to a comma-separated values (csv) file.
to_gpu()
to_pandas([session])
to_parquet(path[, engine, compression, ...])Write a DataFrame to the binary parquet format; each chunk will be written to a Parquet file.
to_sql(name, con[, schema, if_exists, ...])Write records stored in a DataFrame to a SQL database.
to_tensor()
to_vineyard([vineyard_socket])
transform(func[, axis, dtypes, skip_infer])Call func on self producing a DataFrame with transformed values.
Transpose index and columns.
truediv(other[, axis, level, fill_value])Get Floating division of dataframe and other, element-wise (binary operator truediv).
tshift([periods, freq, axis])Shift the time index, using the index's frequency if available.
var([axis, skipna, level, ddof, ...])
where(cond[, other, inplace, axis, level, ...])Replace values where the condition is False.
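Several summaries in the table above are easy to confuse: where replaces values where the condition is False, mask replaces values where it is True, and ffill/bfill are the forward and backward variants of fillna. Because the wording matches the pandas docstrings, the sketch below illustrates the difference with plain pandas, on the assumption that Mars mirrors these semantics. With a Mars DataFrame the calls would presumably build a lazy graph that still needs execute() (listed above) to produce values; that step is not shown here.

```python
import pandas as pd

values = pd.DataFrame({"a": [1, 2, 3]})
cond = values > 1                      # False, True, True

# where: keep entries where cond is True, replace the rest.
kept = values.where(cond, 0)           # column a becomes 0, 2, 3
# mask: the inverse, replace entries where cond is True.
masked = values.mask(cond, 0)          # column a becomes 1, 0, 0

gaps = pd.DataFrame({"a": [1.0, None, 3.0]})
# ffill propagates the last valid value forward, bfill the next one backward.
forward = gaps.ffill()                 # column a becomes 1.0, 1.0, 3.0
backward = gaps.bfill()                # column a becomes 1.0, 3.0, 3.0

print(kept, masked, forward, backward, sep="\n\n")
```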
Attributes
Access a single value for a row/column label pair.
data
Return the dtypes in the DataFrame.
Return an int representing the number of axes / array dimensions.
size
type_name
values
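As a rough illustration of the shape-related attributes, the snippet below reads the number of axes, size and values from a plain pandas frame. It assumes the Mars attributes report the same quantities as their pandas counterparts; the name ndim is taken from pandas, not from this page. Whether Mars returns concrete values or lazy objects for these attributes is not stated here.

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

n_axes = frame.ndim        # number of axes: rows and columns
n_elements = frame.size    # rows * columns
array = frame.values       # underlying 2-D NumPy array

print(n_axes, n_elements, array.shape)
```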