get_wide_mask

caf.toolkit.pandas_utils.get_wide_mask(df, select=None, col_select=None, index_select=None, join_fn=operator.and_)

Generate an index/column mask for a wide Pandas matrix.

Helper function to make selecting combinations of zones in a wide matrix easier. The index and column selections can be set individually using col_select and index_select, or set to the same value using select.

Parameters:
  • df (DataFrame) – The dataframe to generate the mask for.

  • select (Collection[Any] | None) – The IDs to select in both the columns and the index. If this value is set, it overrides anything passed into col_select and index_select.

  • col_select (Collection[Any] | None) – The IDs to select in the columns. This value is ignored if select is set.

  • index_select (Collection[Any] | None) – The IDs to select in the index. This value is ignored if select is set.

  • join_fn (Callable) – Function used to combine the two masks that are generated separately for the index and the columns. Defaults to a bitwise AND (operator.and_), so the final mask is True only where the index and column masks are both True. See Python's built-in operator module for other ready-made options. Custom functions can also be given; they must accept two numpy arrays as input and return one as output.

Returns:

A boolean mask of True and False values, with the same shape as df.

Return type:

np.ndarray
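Because the returned mask has the same shape as df, it can be used directly to pick out or blank cells of the matrix. The sketch below is illustrative only: to stay self-contained it writes the mask out by hand rather than calling get_wide_mask, and the variable names (selected_total, selected_only, zeroed) are assumptions made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4))

# Mask written out by hand for this example; it matches the documented
# result of get_wide_mask(df, select=[0, 1]) for this 4x4 frame.
mask = np.zeros(df.shape, dtype=bool)
mask[:2, :2] = True

# Sum of the selected cells only (0 + 1 + 4 + 5).
selected_total = df.to_numpy()[mask].sum()

# Keep the selected cells and replace everything else with NaN.
selected_only = df.where(mask)

# Replace the unselected cells with 0 instead of NaN.
zeroed = df.where(mask, 0)

print(selected_total)
print(zeroed)
```

Boolean indexing on the underlying array returns a flat array of the selected values, while DataFrame.where keeps the original index and column labels.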

Examples

Typical usage for travel demand matrices

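>>> import numpy as np
>>> import pandas as pd
>>> from caf.toolkit.pandas_utils import get_wide_mask
>>> df = pd.DataFrame(np.arange(16).reshape(4, 4))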
>>> df
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
>>> get_wide_mask(df, select=[0, 1])
array([[ True,  True, False, False],
       [ True,  True, False, False],
       [False, False, False, False],
       [False, False, False, False]])

It’s possible to select differently for the index and columns

>>> get_wide_mask(df, col_select=[0, 1], index_select=[1, 2, 3])
array([[False, False, False, False],
       [ True,  True, False, False],
       [ True,  True, False, False],
       [ True,  True, False, False]])

The operator for joining the column and index selections can also be changed

>>> import operator
>>> get_wide_mask(df, select=[0, 1], join_fn=operator.or_)
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True, False, False],
       [ True,  True, False, False]])
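A custom join_fn only needs to take two numpy arrays and return one. The following is a minimal sketch of that contract, not the library's implementation: sketch_wide_mask is a hypothetical stand-in written for this example (it assumes the column and index masks are broadcast to the shape of df before being joined), and exactly_one is an illustrative custom join function.

```python
import operator

import numpy as np
import pandas as pd


def sketch_wide_mask(df, col_select, index_select, join_fn=operator.and_):
    """Hypothetical stand-in for get_wide_mask, written only for this example."""
    # Column mask: each row repeats which column labels are selected.
    col_mask = np.broadcast_to(df.columns.isin(col_select), df.shape)
    # Index mask: each column repeats which index labels are selected.
    index_mask = np.broadcast_to(
        df.index.isin(index_select)[:, np.newaxis], df.shape
    )
    return join_fn(col_mask, index_mask)


def exactly_one(col_mask, index_mask):
    """Custom join: True where only one of the two masks is True."""
    return np.logical_xor(col_mask, index_mask)


df = pd.DataFrame(np.arange(16).reshape(4, 4))
mask = sketch_wide_mask(
    df, col_select=[0, 1], index_select=[0, 1], join_fn=exactly_one
)
print(mask)
```

Under the documented contract, a function such as exactly_one could presumably be passed to get_wide_mask as join_fn in the same way; here it marks cells that have a selected row or a selected column, but not both.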