get_rf_pfi_shap_summary

mlproject.postprocess.feature_importances.get_rf_pfi_shap_summary(models_parent_dir, data_parent_dir, target_name, n_repeats=5, random_state=42)

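Load the trained RandomForest models (one per fold, 5 folds), compute permutation feature importance (PFI) and SHAP values for each, and return both as combined DataFrames.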

Parameters:
  • models_parent_dir (str) – Directory containing the rf_*_pipeline.pkl files, one per fold.

  • data_parent_dir (str)

  • target_name (str) – Target variable name (used in the file path pattern).

  • n_repeats (int, optional) – Number of PFI repeats (default=5).

  • random_state (int, optional) – Random seed for reproducibility (default=42).

  • n_feats (int, optional) – Number of top features to display. This parameter is documented here but does not appear in the function signature above.

Returns:

  • pfi_df (pd.DataFrame) – Combined permutation feature importance results.

  • shap_df (pd.DataFrame) – Combined mean absolute SHAP values across folds.

Return type:

tuple[DataFrame, DataFrame]
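
Example:

The following is a minimal sketch of the per-fold PFI step described above, not the function's actual implementation. It assumes a regression target, fits a fresh RandomForestRegressor on each fold of synthetic data instead of loading the rf_*_pipeline.pkl files, and uses scikit-learn's permutation_importance. The column names (fold, feature, importance_mean, importance_std) are hypothetical and may not match the real pfi_df. The SHAP step is omitted; a mean absolute SHAP table could presumably be assembled per fold in the same way.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import KFold

# Synthetic data stands in for the project's real dataset (assumption).
X, y = make_regression(n_samples=200, n_features=4, n_informative=2, random_state=42)
feature_names = [f"feat_{i}" for i in range(X.shape[1])]

fold_frames = []
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(kfold.split(X)):
    # The documented function loads a pickled pipeline per fold;
    # this sketch fits a small model instead so it runs on its own.
    model = RandomForestRegressor(n_estimators=50, random_state=42)
    model.fit(X[train_idx], y[train_idx])

    # Permute each feature n_repeats times on the held-out fold.
    result = permutation_importance(
        model, X[test_idx], y[test_idx], n_repeats=5, random_state=42
    )
    fold_frames.append(
        pd.DataFrame(
            {
                "fold": fold,  # hypothetical column names throughout
                "feature": feature_names,
                "importance_mean": result.importances_mean,
                "importance_std": result.importances_std,
            }
        )
    )

# One row per (fold, feature), analogous to a combined PFI table.
pfi_df = pd.concat(fold_frames, ignore_index=True)

# Average across folds and rank features by importance.
summary = pfi_df.groupby("feature")["importance_mean"].mean().sort_values(ascending=False)
print(summary)
```

Keeping one row per fold and feature, rather than averaging immediately, preserves the fold-to-fold spread so it can be inspected or plotted later.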