get_wide_mask
- caf.toolkit.pandas_utils.get_wide_mask(df, select=None, col_select=None, index_select=None, join_fn=<built-in function and_>)
Generate an index/column mask for a wide Pandas matrix.
Helper function to make selecting combinations of zones in a wide matrix easier. The index and column selections can be set individually using col_select and index_select, or set to the same value using select.
- Parameters:
df (DataFrame) – The dataframe to generate the mask for.
select (Collection[Any] | None) – The IDs to select in both the columns and index. If this value is set, it overrides anything passed to col_select and index_select.
col_select (Collection[Any] | None) – The IDs to select in the columns. This value is ignored if select is set.
index_select (Collection[Any] | None) – The IDs to select in the index. This value is ignored if select is set.
join_fn (Callable) – Individual masks are generated for the index and columns. This function is used to combine the two masks. By default a bitwise AND is used, meaning the final mask is True only where the index and column masks overlap. See Python's built-in operator module for more ready-made options. Custom functions can also be given. They must accept two numpy arrays as input and return one as output.
- Returns:
A mask of True and False values. Will be the same shape as df.
- Return type:
np.ndarray
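The parameter descriptions above suggest a rough picture of how such a mask could be assembled. The following is a minimal sketch under stated assumptions, not the caf.toolkit implementation: `sketch_wide_mask` is a hypothetical name, the error raised when neither `select` nor both individual selections are given is an assumption, and so is the order in which the two masks are passed to `join_fn`.

```python
import operator

import numpy as np
import pandas as pd


def sketch_wide_mask(df, select=None, col_select=None, index_select=None,
                     join_fn=operator.and_):
    """Illustrative stand-in only; not the caf.toolkit implementation."""
    if select is not None:
        # `select` takes precedence over the two individual selections.
        col_select = index_select = select
    elif col_select is None or index_select is None:
        # Assumption of this sketch: both selections are needed without `select`.
        raise ValueError("Set `select`, or both `col_select` and `index_select`.")

    # One boolean per row label and one per column label.
    row_hit = df.index.isin(list(index_select))
    col_hit = df.columns.isin(list(col_select))

    # Stretch each 1-D mask to the full shape of the matrix.
    index_mask = np.broadcast_to(row_hit[:, np.newaxis], df.shape)
    col_mask = np.broadcast_to(col_hit[np.newaxis, :], df.shape)

    # Combine the two full-size masks into the final mask.
    return join_fn(index_mask, col_mask)


df = pd.DataFrame(np.arange(16).reshape(4, 4))
mask = sketch_wide_mask(df, select=[0, 1])

# Sum the cells where both the row and the column are selected.
total = int(df.to_numpy()[mask].sum())
```

Because the returned array has the same shape as df, it can index the underlying values directly. Here `total` is 10, the sum of cells 0, 1, 4 and 5.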
Examples
Typical usage for travel demand matrices
>>> df = pd.DataFrame(np.arange(16).reshape(4, 4))
>>> df
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
>>> get_wide_mask(df, select=[0, 1])
array([[ True,  True, False, False],
       [ True,  True, False, False],
       [False, False, False, False],
       [False, False, False, False]])
It’s possible to select differently for the index and columns
>>> get_wide_mask(df, col_select=[0, 1], index_select=[1, 2, 3])
array([[False, False, False, False],
       [ True,  True, False, False],
       [ True,  True, False, False],
       [ True,  True, False, False]])
The operator for joining the column and index selections can also be changed
>>> get_wide_mask(df, select=[0, 1], join_fn=operator.or_)
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True, False, False],
       [ True,  True, False, False]])
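A custom join function only has to take two numpy arrays and return one. The sketch below shows what such a function might look like. `exactly_one` is a hypothetical name, and the two masks are built by hand here as stand-ins for the index and column masks that get_wide_mask generates internally.

```python
import numpy as np


def exactly_one(mask_a: np.ndarray, mask_b: np.ndarray) -> np.ndarray:
    """Return True where exactly one of the two masks is True (an XOR)."""
    return np.logical_xor(mask_a, mask_b)


# Hand-built stand-ins for the index and column masks of a 4x4 matrix
# with IDs 0 and 1 selected.
selected = np.array([True, True, False, False])
index_mask = np.broadcast_to(selected[:, np.newaxis], (4, 4))
col_mask = np.broadcast_to(selected[np.newaxis, :], (4, 4))

# True only where the row or the column is selected, but not both.
combined = exactly_one(index_mask, col_mask)
print(combined)
```

A function with this shape could then be passed as join_fn. For a zone matrix, this particular choice would pick out movements that cross between the selected zones and the remaining zones.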