DataFrame

Constructor

DataFrame([data, index, columns, dtype, ...])

Attributes and underlying data

Axes

DataFrame.index

DataFrame.columns

DataFrame.dtypes

Return the dtypes in the DataFrame.

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

DataFrame.ndim

Return an int representing the number of axes / array dimensions.

DataFrame.shape

DataFrame.memory_usage([index, deep])

Return the memory usage of each column in bytes.
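
The attributes above can be illustrated with a minimal sketch. It uses pandas itself, on the assumption that this DataFrame follows pandas semantics for these members; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5], "c": ["x", "y", "z"]})

n_axes = df.ndim                               # number of axes of a DataFrame
n_rows, n_cols = df.shape                      # (rows, columns)
numeric = df.select_dtypes(include="number")   # keeps only columns "a" and "b"
usage = df.memory_usage(index=False)           # bytes per column, indexed by column name

print(df.dtypes)
print(usage)
```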

Conversion

DataFrame.astype(dtype[, copy, errors])

Cast a pandas object to the specified dtype.

DataFrame.copy()

DataFrame.isna()

Detect missing values.

DataFrame.notna()

Detect existing (non-missing) values.
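
As a rough sketch of the conversion methods listed above, the following uses pandas directly (an assumption: the DataFrame documented here is expected to behave the same way). The column names and values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one float column with a gap, one column of numeric strings.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan], "b": ["1", "2", "3"]})

converted = df.astype({"b": "int64"})  # cast only column "b"; returns a new object
missing = df.isna()                    # True where a value is missing
present = df.notna()                   # element-wise inverse of isna()
```

Note that `astype` returns a new object here; `df` itself keeps its string column.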

Indexing, iteration

DataFrame.head([n])

Return the first n rows.

DataFrame.at

Access a single value for a row/column label pair.

DataFrame.iat

DataFrame.loc

DataFrame.iloc

DataFrame.insert(loc, column, value[, ...])

Insert column into DataFrame at specified location.

DataFrame.iterrows([batch_size, session])

Iterate over DataFrame rows as (index, Series) pairs.

DataFrame.itertuples([index, name, ...])

Iterate over DataFrame rows as namedtuples.

DataFrame.mask(cond[, other, inplace, axis, ...])

Replace values where the condition is True.

DataFrame.pop(item)

Return item and drop from frame.

DataFrame.query(expr[, inplace])

Query the columns of a DataFrame with a boolean expression.

DataFrame.tail([n])

Return the last n rows.

DataFrame.where(cond[, other, inplace, ...])

Replace values where the condition is False.
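
The difference between `where` and `mask`, and between label-based and position-based access, is easy to mix up. The sketch below uses pandas on the assumption that the accessors documented here share its semantics; the frame and its labels are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]},
    index=["w", "x", "y", "z"],
)

single = df.at["x", "b"]             # label-based scalar access
by_pos = df.iloc[0, 1]               # position-based access: first row, second column
kept = df.where(df > 2, other=0)     # replaces values where the condition is False
hidden = df.mask(df > 2, other=0)    # replaces values where the condition is True
subset = df.query("a > 1 and b < 40")

# Row iteration yields (index, Series) pairs.
for label, row in df.iterrows():
    print(label, row["a"], row["b"])
```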

Binary operator functions

DataFrame.add(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator add).

DataFrame.sub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

DataFrame.mul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

DataFrame.div(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.truediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.floordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator floordiv).

DataFrame.mod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator mod).

DataFrame.pow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator pow).

DataFrame.dot(other)

Compute the matrix multiplication between the DataFrame and other.

DataFrame.radd(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator radd).

DataFrame.rsub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator rsub).

DataFrame.rmul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator rmul).

DataFrame.rdiv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.rtruediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.rfloordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator rmod).

DataFrame.rpow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator rpow).

DataFrame.lt(other[, axis, level])

Get Less than of dataframe and other, element-wise (binary operator lt).

DataFrame.gt(other[, axis, level])

Get Greater than of dataframe and other, element-wise (binary operator gt).

DataFrame.le(other[, axis, level])

Get Less than or equal to of dataframe and other, element-wise (binary operator le).

DataFrame.ge(other[, axis, level])

Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).

DataFrame.ne(other[, axis, level])

Get Not equal to of dataframe and other, element-wise (binary operator ne).

DataFrame.eq(other[, axis, level])

Get Equal to of dataframe and other, element-wise (binary operator eq).
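
The method forms of the operators differ from the plain operators mainly through `fill_value` and the reversed (`r`-prefixed) variants. A minimal sketch, written against pandas on the assumption that these methods behave identically here, with hypothetical inputs:

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({"a": [1.0, np.nan], "b": [3.0, 4.0]})
right = pd.DataFrame({"a": [10.0, 20.0], "b": [30.0, np.nan]})

plain = left.add(right)                  # NaN wherever either operand is missing
filled = left.add(right, fill_value=0)   # a missing operand is treated as 0
forward = left.sub(10)                   # computes left - 10
reverse = left.rsub(10)                  # computes 10 - left
below = left.lt(right)                   # element-wise left < right; NaN compares False
```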

Function application, GroupBy & window

DataFrame.apply(func[, axis, raw, ...])

Apply a function along an axis of the DataFrame.

DataFrame.agg([func, axis])

DataFrame.aggregate([func, axis])

DataFrame.transform(func[, axis, dtypes, ...])

Call func on self producing a DataFrame with transformed values.

DataFrame.groupby([by, level, as_index, ...])

DataFrame.rolling(window[, min_periods, ...])

Provide rolling window calculations.

DataFrame.expanding([min_periods, center, axis])

Provide expanding transformations.

DataFrame.ewm([com, span, halflife, alpha, ...])

Provide exponentially weighted functions.
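
A short sketch of how these entry points are typically combined. It uses pandas under the assumption that `apply`, `agg`, `groupby` and `rolling` follow the same semantics here; the data is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})

# apply: call a function once per column (the default axis).
col_range = df[["val"]].apply(lambda col: col.max() - col.min())

# agg: several reductions at once, one result row per reduction.
summary = df[["val"]].agg(["min", "max"])

# groupby: split by key, then reduce each group.
totals = df.groupby("key")["val"].sum()

# rolling: sum over a sliding window of two rows; the first row has no full window.
rolled = df[["val"]].rolling(window=2).sum()
```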

Computations / descriptive stats

DataFrame.abs()

DataFrame.all([axis, bool_only, skipna, ...])

DataFrame.any([axis, bool_only, skipna, ...])

DataFrame.corr([method, min_periods])

Compute pairwise correlation of columns, excluding NA/null values.

DataFrame.corrwith(other[, axis, drop, method])

Compute pairwise correlation.

DataFrame.count([axis, level, numeric_only, ...])

DataFrame.cummax([axis, skipna])

DataFrame.cummin([axis, skipna])

DataFrame.cumprod([axis, skipna])

DataFrame.cumsum([axis, skipna])

DataFrame.describe([percentiles, include, ...])

DataFrame.eval(expr[, inplace])

Evaluate a string describing operations on DataFrame columns.

DataFrame.kurt([axis, skipna, level, ...])

DataFrame.kurtosis([axis, skipna, level, ...])

DataFrame.max([axis, skipna, level, ...])

DataFrame.mean([axis, skipna, level, ...])

DataFrame.min([axis, skipna, level, ...])

DataFrame.nunique([axis, dropna, combine_size])

Count distinct observations over requested axis.

DataFrame.pct_change([periods, fill_method, ...])

Percentage change between the current and a prior element.

DataFrame.prod([axis, skipna, level, ...])

DataFrame.product([axis, skipna, level, ...])

DataFrame.quantile([q, axis, numeric_only, ...])

Return values at the given quantile over requested axis.

DataFrame.round([decimals])

Round a DataFrame to a variable number of decimal places.

DataFrame.sem([axis, skipna, level, ddof, ...])

DataFrame.skew([axis, skipna, level, ...])

DataFrame.std([axis, skipna, level, ddof, ...])

DataFrame.sum([axis, skipna, level, ...])

DataFrame.var([axis, skipna, level, ddof, ...])
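
Several of the reductions and cumulative methods above are listed without a description. The following sketch shows how they behave in pandas, which this API is assumed to mirror; the numbers are hypothetical and chosen so the results are easy to check by hand.

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0, 4.0], "b": [10.0, 20.0, 40.0]})

col_means = df.mean()        # one value per column
row_sums = df.sum(axis=1)    # one value per row
median = df.quantile(0.5)    # 0.5 quantile of each column
running = df.cumsum()        # cumulative sum down each column
change = df.pct_change()     # relative change from the previous row; first row is NaN
corr = df.corr()             # pairwise column correlation
stats = df.describe()        # count, mean, std, min, quartiles, max per column
```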

Reindexing / selection / label manipulation

DataFrame.add_prefix(prefix)

Prefix labels with string prefix.

DataFrame.add_suffix(suffix)

Suffix labels with string suffix.

DataFrame.align(other[, join, axis, level, ...])

Align two objects on their axes with the specified join method.

DataFrame.drop([labels, axis, index, ...])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, keep, ...])

Return DataFrame with duplicate rows removed.

DataFrame.duplicated([subset, keep, method])

Return boolean Series denoting duplicate rows.

DataFrame.head([n])

Return the first n rows.

DataFrame.reindex(*args, **kwargs)

Conform Series/DataFrame to new index with optional filling logic.

DataFrame.reindex_like(other[, method, ...])

Return an object with matching indices as other object.

DataFrame.rename([mapper, index, columns, ...])

Alter axes labels.

DataFrame.rename_axis([mapper, index, ...])

Set the name of the axis for the index or columns.

DataFrame.reset_index([level, drop, ...])

Reset the index, or a level of it.

DataFrame.sample([n, frac, replace, ...])

Return a random sample of items from an axis of object.

DataFrame.set_axis(labels[, axis, inplace])

Assign desired index to given axis.

DataFrame.set_index(keys[, drop, append, ...])

DataFrame.tail([n])

Return the last n rows.
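
The label-manipulation methods are usually chained. A minimal sketch with pandas (assumed to match the behaviour documented here), using a hypothetical frame that contains one duplicated row:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 2, 3], "score": [0.1, 0.2, 0.2, 0.3]})

deduped = df.drop_duplicates()                         # drops the repeated (2, 0.2) row
renamed = deduped.rename(columns={"score": "value"})   # alter column labels
indexed = renamed.set_index("id")                      # move "id" into the index
restored = indexed.reset_index()                       # move it back to a column
trimmed = restored.drop(columns=["value"])             # drop by label
prefixed = trimmed.add_prefix("col_")                  # prefix the column labels
```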

Missing data handling

DataFrame.backfill([axis, inplace, limit, ...])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.bfill([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='bfill'.

DataFrame.dropna([axis, how, thresh, ...])

Remove missing values.

DataFrame.ffill([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.fillna([value, method, axis, ...])

Fill NA/NaN values using the specified method.

DataFrame.isna()

Detect missing values.

DataFrame.isnull()

Detect missing values.

DataFrame.notna()

Detect existing (non-missing) values.

DataFrame.notnull()

Detect existing (non-missing) values.

DataFrame.pad([axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

DataFrame.replace([to_replace, value, ...])

Replace values given in to_replace with value.
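
The fill and drop methods above can be compared side by side. This sketch uses pandas, assuming the same semantics apply here, and calls `ffill` and `bfill` directly rather than passing `method` to `fillna`, since recent pandas releases deprecate that argument. The data is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, np.nan]})

constant = df.fillna(0)          # replace every missing value with 0
forward = df.ffill()             # propagate the last valid value downwards
backward = df.bfill()            # propagate the next valid value upwards
complete = df.dropna()           # keep only rows with no missing values
swapped = df.replace(5.0, 50.0)  # replace a specific value
```

Every row of this frame has at least one gap, so `complete` ends up empty; passing `how="all"` would instead drop only rows that are entirely missing.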

Reshaping, sorting, transposing

DataFrame.explode(column[, ignore_index])

Transform each element of a list-like to a row, replicating index values.

DataFrame.melt([id_vars, value_vars, ...])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

DataFrame.sort_values(by[, axis, ascending, ...])

Sort by the values along either axis.

DataFrame.sort_index([axis, level, ...])

Sort object by labels (along an axis).

DataFrame.stack([level, dropna])

Stack the prescribed level(s) from columns to index.

DataFrame.T

DataFrame.transpose()

Transpose index and columns.
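
A brief sketch of the reshaping methods, written with pandas on the assumption that this API behaves the same way. The frames `wide` and `nested` are hypothetical.

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

# melt: wide to long, keeping "id" as the identifier column.
long = wide.melt(id_vars="id", value_vars=["x", "y"])

# sort_values: order by identifier, then by variable name.
ordered = long.sort_values(by=["id", "variable"])

# explode: one output row per list element, replicating the index value.
nested = pd.DataFrame({"id": [1, 2], "tags": [["a", "b"], ["c"]]})
flat = nested.explode("tags")

# T / transpose(): swap index and columns.
flipped = wide.T
```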

Combining / joining / merging

DataFrame.append(other[, ignore_index, ...])

DataFrame.assign(**kwargs)

Assign new columns to a DataFrame.

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns of another DataFrame.

DataFrame.merge(right[, how, on, left_on, ...])

Merge DataFrame or named Series objects with a database-style join.
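
`merge` matches on columns by default, while `join` matches on the index. The sketch below illustrates both with pandas, assuming equivalent behaviour here; the key and value names are hypothetical.

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

inner = left.merge(right, on="key", how="inner")   # keys present on both sides
outer = left.merge(right, on="key", how="outer")   # union of keys, gaps become NaN

# join aligns on the index, so both frames are indexed by "key" first.
joined = left.set_index("key").join(right.set_index("key"), how="left")

# assign returns a new frame with the extra column; "left" is unchanged.
extended = left.assign(double=lambda d: d["lval"] * 2)
```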

Plotting

DataFrame.plot is both a callable method and a namespace attribute for specific plotting methods of the form DataFrame.plot.<kind>.

DataFrame.plot

alias of PlotAccessor

DataFrame.plot.area(*args, **kwargs)

Draw a stacked area plot.

DataFrame.plot.bar(*args, **kwargs)

Vertical bar plot.

DataFrame.plot.barh(*args, **kwargs)

Make a horizontal bar plot.

DataFrame.plot.box(*args, **kwargs)

Make a box plot of the DataFrame columns.

DataFrame.plot.density(*args, **kwargs)

Generate Kernel Density Estimate plot using Gaussian kernels.

DataFrame.plot.hexbin(*args, **kwargs)

Generate a hexagonal binning plot.

DataFrame.plot.hist(*args, **kwargs)

Draw one histogram of the DataFrame's columns.

DataFrame.plot.kde(*args, **kwargs)

Generate Kernel Density Estimate plot using Gaussian kernels.

DataFrame.plot.line(*args, **kwargs)

Plot Series or DataFrame as lines.

DataFrame.plot.pie(*args, **kwargs)

Generate a pie plot.

DataFrame.plot.scatter(*args, **kwargs)

Create a scatter plot with varying marker point size and color.

Serialization / IO / conversion

DataFrame.to_csv(path[, sep, na_rep, ...])

Write object to a comma-separated values (csv) file.

DataFrame.to_parquet(path[, engine, ...])

Write a DataFrame to the binary Parquet format; each chunk is written to a Parquet file.

DataFrame.to_sql(name, con[, schema, ...])

Write records stored in a DataFrame to a SQL database.
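
A minimal round-trip sketch for `to_csv`, using pandas and a temporary directory. It assumes the writer documented here accepts a plain file path in the same way; the file name `out.csv` is hypothetical.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.csv")   # hypothetical output file
    df.to_csv(path, index=False)          # omit the index column from the file

    with open(path) as fh:
        text = fh.read()

    round_trip = pd.read_csv(path)        # read the file back for comparison

print(text)
```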

Misc

DataFrame.map_chunk(func[, args, kwargs, ...])

Apply function to each chunk.

DataFrame.rebalance([factor, axis, ...])

Make data more balanced across the entire cluster.