caf.toolkit.pandas_utils.df_handling

Helper functions for handling pandas DataFrames.

Functions

chunk_df(df, chunk_size)

Split a dataframe into chunks, usually for multiprocessing.

filter_df(df, df_filter[, throw_error])

Filter a pandas DataFrame, keeping only the rows that match a filter.

filter_df_mask(df, df_filter)

Generate a mask for filtering a pandas DataFrame by a filter.

get_full_index(dimension_cols)

Create a pandas Index from a mapping of {col_name: col_values}.

long_df_to_wide_ndarray(*args, **kwargs)

Convert a DataFrame from long to wide format, infilling missing values.

long_product_infill()

Infill columns with a complete product of one another.

long_to_wide_infill(matrix, *[, infill, ...])

Convert a DataFrame from long to wide format, infilling missing values.

overload(func)

Decorator for overloaded functions/methods.

reindex_and_groupby_sum(df, index_cols, ...)

Reindex and group a pandas DataFrame.

reindex_cols(df, columns[, throw_error, ...])

Reindex a pandas DataFrame.

reindex_rows_and_cols(df, index, columns[, ...])

Reindex a pandas DataFrame, making sure index/col types don't clash.

str_join_cols(df, columns[, separator])

Equivalent to separator.join(columns) for all rows of a pandas DataFrame.

wide_to_long_infill(df[, out_name, ...])

Convert a matrix from wide to long format, infilling missing values.
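
The long/wide conversions and the full-index helper listed above share one idea: build every combination of the dimension values, reindex onto it so that missing pairs get an infill value, then pivot. The sketch below illustrates that idea with plain pandas only. It is not the caf.toolkit implementation; the names sketch_full_index, sketch_long_to_wide and sketch_wide_to_long, their parameters, and the example column names are all hypothetical.

```python
import pandas as pd


def sketch_full_index(dimension_cols):
    """Build an index of every combination of {col_name: col_values}.

    Hypothetical helper for illustration, not the caf.toolkit function.
    """
    names = list(dimension_cols)
    if len(names) == 1:
        return pd.Index(dimension_cols[names[0]], name=names[0])
    return pd.MultiIndex.from_product(
        [dimension_cols[name] for name in names], names=names
    )


def sketch_long_to_wide(df, index_col, columns_col, values_col, infill=0):
    """Pivot long data to a wide matrix, filling absent pairs with `infill`."""
    full_index = sketch_full_index(
        {
            index_col: sorted(df[index_col].unique()),
            columns_col: sorted(df[columns_col].unique()),
        }
    )
    # Assumes each (index, column) pair appears at most once in `df`.
    series = df.set_index([index_col, columns_col])[values_col]
    series = series.reindex(full_index, fill_value=infill)
    return series.unstack(columns_col)


def sketch_wide_to_long(wide, out_name="value"):
    """Melt a wide matrix back to one row per (index, column) pair."""
    return wide.reset_index().melt(
        id_vars=wide.index.name,
        var_name=wide.columns.name,
        value_name=out_name,
    )


# Example data (made up): the ("B", "A") pair is missing from the long input.
long_df = pd.DataFrame(
    {
        "origin": ["A", "A", "B"],
        "destination": ["A", "B", "B"],
        "trips": [10.0, 5.0, 3.0],
    }
)
wide_df = sketch_long_to_wide(long_df, "origin", "destination", "trips")
round_trip = sketch_wide_to_long(wide_df, out_name="trips")
```

Here wide_df is a 2 x 2 matrix in which the missing ("B", "A") cell holds the infill value, and round_trip has one row for each of the four origin/destination pairs. Check the documented signatures of the real functions before relying on any of the parameter names used in this sketch.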

Classes

Any(*args, **kwargs)

Special type indicating an unconstrained type.

ChunkDf(df, chunk_size)

Generator to split a dataframe into chunks.

Hashable()
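
The chunking entries above (chunk_df and ChunkDf) describe splitting a dataframe into pieces, usually for multiprocessing. A minimal sketch of that idea as a plain generator is shown below. The name sketch_chunk_df is hypothetical and the behaviour shown (consecutive row slices, a shorter final chunk, an error for a non-positive chunk size) is an assumption, not the documented behaviour of caf.toolkit.

```python
import pandas as pd


def sketch_chunk_df(df, chunk_size):
    """Yield consecutive row slices of `df`, each at most `chunk_size` rows.

    Hypothetical helper for illustration, not the caf.toolkit function.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(df), chunk_size):
        # iloc slicing clips at the end, so the last chunk may be shorter.
        yield df.iloc[start : start + chunk_size]


# Example data (made up): 10 rows split into chunks of 4.
example_df = pd.DataFrame({"zone": range(10), "value": range(10, 20)})
chunks = list(sketch_chunk_df(example_df, chunk_size=4))
chunk_lengths = [len(chunk) for chunk in chunks]
```

Each chunk keeps its original index labels, so concatenating the chunks with pd.concat reproduces the input frame. In a multiprocessing setting each chunk would typically be passed to a separate worker.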