# squads
m
Hey! Just want to show some progress on including Shap values in neuralforecast (I finally found some time for that!). I think it will be useful to provide explainability for TimeGPT that doesn't rely on LightGBM. Right now, this works:
from neuralforecast import NeuralForecast
from neuralforecast.models import MLP

# small MLP: 12-step horizon, 24 past lags as input
mlp = MLP(
    h=12,
    input_size=24,
    max_steps=200,
    alias="MLP"
)
nf = NeuralForecast(models=[mlp], freq="ME")
nf.fit(df=Y_train_df)

# proposed new method: SHAP values from 50 background windows for 5 target windows
explanation = nf.explain(background_size=50, target_samples=5)
It returns a dict with all the values necessary for SHAP to make plots and interpret the results:
{'MLP': {'shap_values': array([[[-6.10306270e+01, -9.39689115e+00,  1.97515033e+01,
            1.17820769e+01,  1.51813563e+01,  2.06414518e+01,
            1.89970556e+01,  1.91416997e+01,  2.06474227e+01,
           -5.95401884e+00, -4.46718962e+00, -2.44429846e+01],
          ...
  'feature_names': ['y_lag_1',
   'y_lag_2',
   'y_lag_3',
   'y_lag_4',
   'y_lag_5',
   'y_lag_6',
   'y_lag_7',
   'y_lag_8',
   'y_lag_9',
   'y_lag_10',
   'y_lag_11',
   'y_lag_12',
   'y_lag_13',
   'y_lag_14',
   'y_lag_15',
   'y_lag_16',
   'y_lag_17',
   'y_lag_18',
   'y_lag_19',
   'y_lag_20',
   'y_lag_21',
   'y_lag_22',
   'y_lag_23',
   'y_lag_24'],
  'background_data': array([[640., 618., 662., 648., 663., 735., 791., 805., 704., 659., 610.,
          637., 660., 642., 706., 696., 720., 772., 848., 859., 763., 707.,
          662., 705.]], dtype=float32),
  'target_data': array([[340., 318., 362., 348., 363., 435., 491., 505., 404., 359., 310.,
          337., 360., 342., 406., 396., 420., 472., 548., 559., 463., 407.,
          362., 405.]], dtype=float32),
  'base_values': array([732.64758301, 707.97583008, 755.71826172, 754.11199951,
         787.41577148, 853.30413818, 948.47595215, 924.73815918,
         838.97302246, 759.78320312, 718.87841797, 737.99725342]),
  'model_name': 'MLP',
  'model_alias': 'MLP'}}
From that, we can make plots like the one attached. It's still in progress; the next step is to also include exogenous features, not just past lags. I just wanted to get early feedback, in case you don't think it's worth continuing to work on it
🙌 5
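For reference, a minimal sketch of how the returned dict could be fed into shap's standard plotting API, assuming the shap_values array is laid out as (target_samples, n_features, h) as the shapes above suggest (horizon index 0 is picked just for illustration):

import shap

res = explanation["MLP"]

# slice out the first forecasted step: (target_samples, n_features)
shap.summary_plot(
    res["shap_values"][:, :, 0],
    features=res["target_data"],          # the windows that were explained
    feature_names=res["feature_names"],
)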
h
but we are still using lgbm to handle exog vars right?
o
This has nothing to do with TimeGPT @Han Wang, it's a feature for NeuralForecast
h
I see, and is that why we can't use it on exog variables, and those features are predetermined?
o
I have no clue what that means. Shapley values provide a way of explaining predictions. They work by attributing a score to each model input feature, and a feature is just any model input.
m
I decided to go with shap directly instead of Captum. It felt like Captum was wrapping around shap anyway, so I didn't want to abstract too much. Also, I don't yet see how layer or neuron attribution would benefit users/clients (but maybe I'm wrong). So for now, plain SHAP.
o
Ah ok! Which explainer did you use?
m
Kernel, but I think we can easily add others as well
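For context, the underlying shap call presumably looks roughly like the sketch below; predict_fn, background_windows, and target_windows are hypothetical stand-ins for a wrapper around the fitted network's forward pass, the sampled background windows, and the windows being explained:

import shap

# predict_fn: hypothetical wrapper mapping an (n, input_size) array of lag windows
# to the fitted network's (n, h) forecasts
explainer = shap.KernelExplainer(predict_fn, background_windows)
shap_values = explainer.shap_values(target_windows, nsamples=100)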
h
I remember Kernel SHAP is slow, is there any concern about speed?
n
Just curious, would there be a way to find the combined explainability of features? E.g. I want to group all lag features into a category called autoreg, group price- and discount-related features into a category called price_related, etc., and find the importance/contribution of each category?
h
shap values are additive, right?
✅ 1
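They are: SHAP's local-accuracy property says the base value plus the per-feature attributions reconstructs the prediction. A quick sanity-check sketch against the dict returned above, summing over the 24 lag features:

res = explanation["MLP"]

# base_values has shape (h,), shap_values (target_samples, n_features, h);
# summing over the feature axis should approximately recover the model's forecasts
reconstructed = res["base_values"] + res["shap_values"].sum(axis=1)  # (target_samples, h)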
m
@Han Wang speed also depends on how many samples we use to calculate the Shap values. But yes, speed could be an issue and other explainers can be faster. @Nikhil Gupta, I guess you could, since they are additive, but for now it would be a manipulation to be done once the results are returned
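On speed, one standard mitigation for Kernel SHAP is to summarize the background set before explaining; a sketch using shap's built-in k-means helper (10 clusters is an arbitrary choice, and predict_fn is the same hypothetical forecast wrapper as above):

import shap

res = explanation["MLP"]

# KernelExplainer's cost grows with the background size, so collapsing the
# background windows into a few weighted centroids trades accuracy for speed
background_summary = shap.kmeans(res["background_data"], 10)
explainer = shap.KernelExplainer(predict_fn, background_summary)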
o
@Nikhil Gupta We should probably group (i.e. just add them together) all features over the temporal dimension anyway, as Shap doesn't understand auto-correlation and generally suffers when features are highly correlated. So Shap results across timesteps are more or less meaningless due to the high correlation.
But maybe there are already methods tackling this; I'm not too up-to-date on the latest explainability methods, especially in the time domain
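Either way, the post-hoc manipulation is just a sum over the returned array. A sketch using the hypothetical autoreg / price_related categories from the question (for the lags-only model above, the autoreg group coincides with summing over the whole temporal dimension):

res = explanation["MLP"]
names = res["feature_names"]

groups = {
    "autoreg": [f for f in names if f.startswith("y_lag_")],
    # "price_related": ["price", "discount", ...],  # hypothetical exogenous features
}

# one combined attribution per category, per explained window and per forecasted step
group_shap = {
    g: res["shap_values"][:, [names.index(f) for f in feats], :].sum(axis=1)
    for g, feats in groups.items()
}  # each value has shape (target_samples, h)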
n
Understood. Yes, it would be applied at each future time step but by clubbing (adding) the Shap values. Thanks!
h
So I think SHAP is the bridge between LTM and LLM. We could just provide the ungrouped SHAP values with meaningful feature descriptions, tell the LLM that the values are additive, and let the LLM decide how to interpret them.
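A rough sketch of what that hand-off could look like, assembling a plain-text summary for one forecasted step (feature_descriptions and the prompt wording are entirely hypothetical):

res = explanation["MLP"]

# hypothetical descriptions, e.g. mapping "y_lag_3" to "value of y 3 steps before the forecast"
feature_descriptions = {
    f: f"value of y {f.split('_')[-1]} steps before the forecast"
    for f in res["feature_names"]
}

step = 0  # first forecasted step of the first explained window
lines = [
    f"- {feature_descriptions[f]}: {res['shap_values'][0, i, step]:+.1f}"
    for i, f in enumerate(res["feature_names"])
]
prompt = (
    f"Base forecast: {res['base_values'][step]:.1f}. "
    "The following contributions are additive adjustments to it:\n" + "\n".join(lines)
)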