caf.toolkit.pandas_utils

Sub-package for miscellaneous helper functionality related to the pandas package.

Functions

cast_to_common_type(items_to_cast)

Cast N objects to the same datatype.

chunk_df(df, chunk_size)

Split a dataframe into chunks, usually for multiprocessing.
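The chunking idea can be sketched with plain pandas. This is an illustration only, not the package's implementation: the helper name `iter_chunks` is hypothetical, and it assumes `chunk_size` means the maximum number of rows per chunk.

```python
import pandas as pd


def iter_chunks(df: pd.DataFrame, chunk_size: int):
    """Yield consecutive row slices of at most `chunk_size` rows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(df), chunk_size):
        # iloc slicing keeps the original index labels on each chunk
        yield df.iloc[start : start + chunk_size]


df = pd.DataFrame({"a": range(5)})
chunks = list(iter_chunks(df, 2))
```

Each chunk could then be passed to a separate worker, and the results concatenated back together with `pd.concat`.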

compare_matrices(matrix_report_a, ...[, ...])

Compare two matrix reports.

compare_matrices_and_output(excel_writer, ...)

Compare two matrix reports and write the comparison to an Excel output.

dataframe_to_n_dimensional_array()

Convert a pandas.DataFrame into an N-Dimensional numpy array.
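One way to picture this conversion, assuming a long-format DataFrame with one column per dimension and a single value column. The function name `to_nd_array` and its parameters are hypothetical and do not reflect the package's signature.

```python
import pandas as pd


def to_nd_array(df, dimension_cols, value_col, fill_value=0.0):
    """Reshape a long DataFrame into a dense array (hypothetical helper).

    `dimension_cols` maps each dimension column name to its ordered values.
    """
    names = list(dimension_cols.keys())
    # Every combination of dimension values, last dimension varying fastest
    full = pd.MultiIndex.from_product(list(dimension_cols.values()), names=names)
    series = df.set_index(names)[value_col].reindex(full, fill_value=fill_value)
    shape = tuple(len(values) for values in dimension_cols.values())
    return series.to_numpy().reshape(shape)


long_df = pd.DataFrame(
    {
        "origin": [1, 2],
        "destination": [1, 2],
        "mode": ["car", "bus"],
        "trips": [1.0, 4.0],
    }
)
dims = {"origin": [1, 2], "destination": [1, 2], "mode": ["car", "bus"]}
arr = to_nd_array(long_df, dims, "trips")
```

Combinations missing from the input are filled with `fill_value`, so the array always has one axis per dimension with the full set of values.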

dataframe_to_n_dimensional_sparse_array(df, ...)

Convert a pandas.DataFrame to a sparse.COO matrix.

filter_df(df, df_filter[, throw_error])

Filter a pandas DataFrame by a filter.

filter_df_mask(df, df_filter)

Generate a mask for filtering a pandas DataFrame by a filter.
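A minimal sketch of mask-based filtering, assuming the filter is a mapping of `{column: value or list of values}`. The filter format and the name `build_mask` are assumptions made for illustration, not the documented behaviour.

```python
import pandas as pd


def build_mask(df, df_filter):
    """Return a boolean Series that is True where every filter matches (hypothetical helper)."""
    mask = pd.Series(True, index=df.index)
    for col, wanted in df_filter.items():
        # Accept either a single value or a collection of allowed values
        values = list(wanted) if isinstance(wanted, (list, tuple, set)) else [wanted]
        mask &= df[col].isin(values)
    return mask


df = pd.DataFrame({"mode": ["car", "bus", "car"], "zone": [1, 2, 3]})
mask = build_mask(df, {"mode": "car", "zone": [1, 2]})
filtered = df[mask]
```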

get_full_index(dimension_cols)

Create a pandas Index from a mapping of {col_name: col_values}.
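The same idea can be expressed with `pandas.MultiIndex.from_product`. The sketch below uses a hypothetical name, `full_index`, and assumes a single-entry mapping should produce a flat `pandas.Index`.

```python
import pandas as pd


def full_index(dimension_cols):
    """Build an index covering every combination of the given values (hypothetical helper)."""
    if len(dimension_cols) == 1:
        name, values = next(iter(dimension_cols.items()))
        return pd.Index(values, name=name)
    return pd.MultiIndex.from_product(
        list(dimension_cols.values()), names=list(dimension_cols.keys())
    )


idx = full_index({"origin": [1, 2], "destination": [1, 2, 3]})
```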

get_wide_all_external_mask(df, select)

Generate an external-only mask for a wide matrix.

get_wide_internal_only_mask(df, select)

Generate an internal-only mask for a wide matrix.
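As an illustration, assuming `select` lists the internal zones and a cell counts as internal when both its row and column labels are in `select`. The name `internal_only_mask` is hypothetical, and the package may define the mask differently.

```python
import numpy as np
import pandas as pd


def internal_only_mask(df, select):
    """True where both the row and the column label are in `select` (hypothetical helper)."""
    rows = df.index.isin(select)
    cols = df.columns.isin(select)
    # Broadcast the two 1-D boolean arrays into a 2-D mask
    return rows[:, None] & cols[None, :]


zones = [1, 2, 3]
wide = pd.DataFrame(np.arange(9).reshape(3, 3), index=zones, columns=zones)
mask = internal_only_mask(wide, select=[1, 2])
internal_total = wide.to_numpy()[mask].sum()
```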

get_wide_mask(df[, select, col_select, ...])

Generate an index/column mask for a wide pandas matrix.

is_sparse_feasible(df, dimension_cols[, ...])

Check whether a sparse array is more efficient than a dense one.

long_df_to_wide_ndarray(*args, **kwargs)

Convert a DataFrame from long to wide format, infilling missing values.

long_product_infill()

Infill columns with a complete product of one another.

long_to_wide_infill(matrix, *[, infill, ...])

Convert a DataFrame from long to wide format, infilling missing values.
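A rough sketch of long-to-wide conversion with infilling, written with plain pandas. The function name `long_to_wide` and its column arguments are hypothetical, and it assumes the wide matrix should be square over every zone seen in either column.

```python
import pandas as pd


def long_to_wide(df, index_col, columns_col, values_col, infill=0):
    """Pivot a long matrix to wide and fill missing cells (hypothetical helper)."""
    wide = df.pivot(index=index_col, columns=columns_col, values=values_col)
    # Use every zone seen in either column so the result is square
    zones = sorted(set(df[index_col]) | set(df[columns_col]))
    return wide.reindex(index=zones, columns=zones).fillna(infill)


long_df = pd.DataFrame(
    {"origin": [1, 1, 2], "destination": [1, 3, 1], "trips": [10.0, 5.0, 3.0]}
)
wide = long_to_wide(long_df, "origin", "destination", "trips")
```

Here zone 3 never appears as an origin and zone 2 never appears as a destination, so their row and column are created and filled with the infill value.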

matrix_describe(matrix[, almost_zero])

Create a high level summary of a matrix.

n_dimensional_array_to_dataframe(mat, ...[, ...])

Convert an N-dimensional numpy array to a pandas.DataFrame.

overload(func)

Decorator for overloaded functions/methods.

reindex_and_groupby_sum(df, index_cols, ...)

Reindexes and groups a pandas DataFrame.

reindex_cols(df, columns[, throw_error, ...])

Reindexes a pandas DataFrame.

reindex_rows_and_cols(df, index, columns[, ...])

Reindex a pandas DataFrame, making sure index/col types don't clash.

str_join_cols(df, columns[, separator])

Equivalent to separator.join(columns) for every row of a pandas DataFrame.
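The row-wise join can be sketched in plain pandas as follows. The name `join_cols` and the default separator are assumptions, and values are cast to strings before joining.

```python
import pandas as pd


def join_cols(df, columns, separator="_"):
    """Join the string form of `columns` for each row (hypothetical helper)."""
    return df[columns].astype(str).agg(separator.join, axis=1)


df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})
joined = join_cols(df, ["a", "b"])
```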

wide_matrix_internal_external_report(df, ...)

Generate a matrix report of internal and external value totals.

wide_to_long_infill(df[, out_name, ...])

Convert a matrix from wide to long format, infilling missing values.

Classes

Any(*args, **kwargs)

Special type indicating an unconstrained type.

ChunkDf(df, chunk_size)

Generator to split a dataframe into chunks.

Hashable()

MatrixReport(matrix, *[, ...])

Creates a high level summary of a matrix and its trip ends.

Modules

df_handling

Helper functions for handling pandas DataFrames.

matrices

Contains functions that perform checks and provide high level statistics.

numpy_conversions

Conversion methods between numpy and pandas formats.

random

Build dummy datasets for testing and demonstrations.

utility

Basic utility functions for pandas objects.

wide_df_handling

Helper functions for handling wide pandas DataFrames, usually as demand matrices.