pandas_long_matrix_zone_translation

caf.toolkit.translation.pandas_long_matrix_zone_translation(matrix, index_col_1_name, index_col_2_name, values_col, translation, translation_from_col, translation_to_col, translation_factors_col, col_translation=None, translation_dtype=None, index_col_1_out_name=None, index_col_2_out_name=None, check_totals=True)

Efficiently translates a long-format pandas matrix between index systems.

Parameters:
  • matrix (DataFrame | Series) – The matrix to translate, in long format. Must contain columns: [index_col_1_name, index_col_2_name, values_col].

  • index_col_1_name (str) – The name of the first index column in matrix whose values are to be translated to the new index system.

  • index_col_2_name (str) – The name of the second index column in matrix whose values are to be translated to the new index system.

  • values_col (str) – The name of the column in matrix detailing the values to translate.

  • translation (DataFrame) – A pandas DataFrame defining the weights to use when translating. Needs to contain columns: translation_from_col, translation_to_col, translation_factors_col. When col_translation is None, this defines the translation to use for both the rows and columns. When col_translation is set, this defines the translation to use for the rows.

  • col_translation (DataFrame | None) – A pandas DataFrame defining the weights to use when translating the columns. Must be in the same format as translation. When None, translation is used as the column translation.

  • translation_from_col (str) – The name of the column in translation and col_translation containing the current index and column values of matrix.

  • translation_to_col (str) – The name of the column in translation and col_translation containing the desired output index and column values. This will define the output index and column format.

  • translation_factors_col (str) – The name of the column in translation and col_translation containing the translation weights between translation_from_col and translation_to_col. Where zone pairs do not exist, they will be infilled with translate_infill.

  • translation_dtype (dtype | None) – The numpy datatype to use for the translation. If None, the dtype of matrix is used. Where high precision isn’t needed, a more memory- and time-efficient data type can be used.

  • check_totals (bool) – Whether to check that the input and output matrices sum to the same total.

  • index_col_1_out_name (str | None) – The name to give to index_col_1_name on return.

  • index_col_2_out_name (str | None) – The name to give to index_col_2_name on return.
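The parameters above can be illustrated with a minimal plain-pandas sketch of what a long-format translation computes. This is not the caf.toolkit implementation; the helper `translate_long`, the column names (`origin`, `destination`, `trips`, `from_zone`, `to_zone`, `factor`) and the data are all assumptions made up for illustration. Each value is weighted by the row factor and the column factor, then summed per output zone pair.

```python
import pandas as pd

# Long-format matrix: one row per (origin, destination) pair.
matrix = pd.DataFrame({
    "origin": [1, 1, 2, 2],
    "destination": [1, 2, 1, 2],
    "trips": [10.0, 20.0, 30.0, 40.0],
})

# Translation weights: zone 1 maps wholly to "A"; zone 2 splits evenly
# between "A" and "B".
translation = pd.DataFrame({
    "from_zone": [1, 2, 2],
    "to_zone": ["A", "A", "B"],
    "factor": [1.0, 0.5, 0.5],
})


def translate_long(matrix, translation):
    """Hypothetical helper: apply one translation to both index columns."""
    rows = translation.rename(columns={
        "from_zone": "origin", "to_zone": "origin_out", "factor": "row_factor",
    })
    cols = translation.rename(columns={
        "from_zone": "destination", "to_zone": "destination_out",
        "factor": "col_factor",
    })
    # Inner merges: zones with no translation entry are dropped in this sketch.
    merged = matrix.merge(rows, on="origin").merge(cols, on="destination")
    merged["trips"] = merged["trips"] * merged["row_factor"] * merged["col_factor"]
    return merged.groupby(
        ["origin_out", "destination_out"], as_index=False
    )["trips"].sum()


translated = translate_long(matrix, translation)

# The same idea as check_totals: input and output should sum to the same total.
totals_match = abs(translated["trips"].sum() - matrix["trips"].sum()) < 1e-9

# With caf.toolkit installed, the matching call would presumably look like
# this (based only on the signature above; not run here):
# from caf.toolkit import translation as caf_translation
# caf_translation.pandas_long_matrix_zone_translation(
#     matrix=matrix, index_col_1_name="origin", index_col_2_name="destination",
#     values_col="trips", translation=translation,
#     translation_from_col="from_zone", translation_to_col="to_zone",
#     translation_factors_col="factor",
# )
```

Totals are preserved here because the factors for each source zone sum to 1. If they did not, a totals check of this kind would be expected to fail.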

Returns:

The matrix, translated into the index system defined by translation_to_col.

Return type:

translated_matrix

Raises:

ValueError – If matrix is not a square array, or if any of the translation inputs are not in the correct format.
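A caller may want to check inputs before translating, so that format problems surface with a clear message. The sketch below is a hypothetical pre-flight helper, `check_long_inputs`, and is not part of caf.toolkit. The exact conditions the library validates are not stated above, so the reading of "square" (both index columns cover the same zones) and the requirement that every zone has a translation entry are assumptions.

```python
import pandas as pd


def check_long_inputs(matrix, index_col_1, index_col_2, values_col,
                      translation, from_col, to_col, factors_col):
    """Hypothetical pre-flight checks; not part of caf.toolkit."""
    for name, frame, needed in [
        ("matrix", matrix, [index_col_1, index_col_2, values_col]),
        ("translation", translation, [from_col, to_col, factors_col]),
    ]:
        missing = [col for col in needed if col not in frame.columns]
        if missing:
            raise ValueError(f"{name} is missing columns: {missing}")

    # Assumption: "square" means both index columns hold the same zones.
    if set(matrix[index_col_1]) != set(matrix[index_col_2]):
        raise ValueError("matrix index columns cover different zones")

    # Assumption: every zone in the matrix needs a translation entry.
    unknown = set(matrix[index_col_1]) - set(translation[from_col])
    if unknown:
        raise ValueError(f"no translation for zones: {sorted(unknown)}")


good = pd.DataFrame({
    "origin": [1, 1, 2, 2],
    "destination": [1, 2, 1, 2],
    "trips": [10.0, 20.0, 30.0, 40.0],
})
lookup = pd.DataFrame({
    "from_zone": [1, 2],
    "to_zone": ["A", "A"],
    "factor": [1.0, 1.0],
})

# Passes silently for well-formed inputs.
check_long_inputs(good, "origin", "destination", "trips",
                  lookup, "from_zone", "to_zone", "factor")
```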