evaluate_feature_set_relationships
- mlproject.corr_analysis.dependency_graph.evaluate_feature_set_relationships(X_lob, X_matminer, y, model=None, n_splits=5, random_state=42, n_jobs=-1, scoring=None)
Computes cross-validated regression metrics (mean ± std) that measure how well each of two feature sets predicts a target and how well each feature set predicts the other.
- Evaluates four relationships:
X_lob → y
X_matminer → y
X_lob → X_matminer
X_matminer → X_lob
- Parameters:
X_lob (pd.DataFrame or np.ndarray) – Feature matrix for the first feature group (e.g. Lobster).
X_matminer (pd.DataFrame or np.ndarray) – Feature matrix for the second feature group (e.g. Matminer).
y (pd.Series or np.ndarray) – Target variable.
model (estimator, optional) – Base regressor (default: RandomForestRegressor).
n_splits (int, optional) – Number of CV splits (default: 5).
random_state (int, optional) – Random seed (default: 42).
n_jobs (int, optional) – Number of parallel jobs (default: -1).
scoring (dict, optional) – Dict of scoring functions (name -> scorer). Default: R², MAE, RMSE, MAPE.
- Returns:
Summary DataFrame with mean and std for each metric and relationship.
- Return type:
pd.DataFrame
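Example: the behavior described above can be approximated with scikit-learn's `cross_validate`. This is a minimal sketch, not the `mlproject` implementation. The helper name `sketch_feature_set_relationships`, the use of shuffled `KFold`, the column naming (`<metric>_mean`, `<metric>_std`), the row labels, and the synthetic data are all assumptions made for illustration. The actual function's return layout may differ.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate


def sketch_feature_set_relationships(X_lob, X_matminer, y, model=None, n_splits=5,
                                     random_state=42, n_jobs=-1, scoring=None):
    """Hypothetical stand-in for the documented function (assumed behavior)."""
    if model is None:
        # Small forest to keep the sketch fast; the real default settings are unknown.
        model = RandomForestRegressor(n_estimators=20, random_state=random_state)
    if scoring is None:
        # scikit-learn's built-in error scorers are negated ("neg_*");
        # the sign is flipped back below so errors are reported as positive.
        scoring = {
            "r2": "r2",
            "mae": "neg_mean_absolute_error",
            "rmse": "neg_root_mean_squared_error",
            "mape": "neg_mean_absolute_percentage_error",
        }

    X_lob, X_matminer, y = np.asarray(X_lob), np.asarray(X_matminer), np.asarray(y)

    # The four relationships listed in the description: (inputs, targets).
    relationships = {
        "X_lob -> y": (X_lob, y),
        "X_matminer -> y": (X_matminer, y),
        "X_lob -> X_matminer": (X_lob, X_matminer),  # multi-output regression
        "X_matminer -> X_lob": (X_matminer, X_lob),  # multi-output regression
    }

    cv = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    names, rows = [], []
    for name, (inputs, targets) in relationships.items():
        scores = cross_validate(model, inputs, targets, cv=cv,
                                scoring=scoring, n_jobs=n_jobs)
        row = {}
        for metric, scorer in scoring.items():
            values = scores[f"test_{metric}"]
            if isinstance(scorer, str) and scorer.startswith("neg_"):
                values = -values
            row[f"{metric}_mean"] = values.mean()
            row[f"{metric}_std"] = values.std()
        names.append(name)
        rows.append(row)
    return pd.DataFrame(rows, index=names)


# Synthetic data (illustrative only): X_matminer and y are noisy functions of X_lob.
rng = np.random.default_rng(0)
X_lob = rng.normal(size=(60, 3))
X_matminer = X_lob @ rng.normal(size=(3, 2)) + 0.1 * rng.normal(size=(60, 2))
y = X_lob[:, 0] + 0.1 * rng.normal(size=60)

summary = sketch_feature_set_relationships(X_lob, X_matminer, y, n_splits=3, n_jobs=1)
print(summary.round(3))
```

In this sketch the feature-to-feature relationships rely on `RandomForestRegressor` supporting multi-output targets, with metrics averaged uniformly over the output columns. A custom `model` passed in would need the same capability.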