# mlforecast
Hi, I'm trying to compute SHAP values for feature importance in MLForecast with LightGBM, following the SHAP example from the docs. I get "ValueError: train and valid dataset categorical_feature do not match", even though I'm only using one dataset (df_encoded).
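import shap
from lightgbm import LGBMRegressor
from mlforecast import MLForecast
from mlforecast.auto import AutoLightGBM, AutoMLForecast

auto_mlf = AutoMLForecast(
    freq="ME",
    season_length=12,
    models={
        'lgb': AutoLightGBM()
    },
    fit_config=lambda trial: {'static_features': ['unique_id']}
)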
auto_mlf.fit(
    df=df_encoded,
    n_windows=n_windows,
    h=h,
    step_size=step_size,
    fitted=True,
    num_samples=40,
    loss=loss_fn
)
config = auto_mlf.results_['lgb'].best_trial.user_attrs['config']

fcst = MLForecast(
    models=[LGBMRegressor(**config['model_params'])],
    freq="ME",
    **config['mlf_init_params']
)

cv_result2 = fcst.cross_validation(
    df_encoded,
    n_windows=n_windows,  # number of windows
    h=h, 
    step_size=step_size,
    static_features=['unique_id']
)

prep = fcst.preprocess(df_encoded, static_features=['unique_id'])
X = prep.drop(columns=['unique_id', 'ds', 'y'])
fcst.fit(df_encoded, static_features=['unique_id'])

explainer = shap.Explainer(fcst.models_["LGBMRegressor"].predict, X)
shap_values = explainer(X)