Has anyone tried exogenous features with AutoCES? ...
# statsforecast
f
Has anyone tried exogenous features with AutoCES? When I tried exogenous features with ARIMA I clearly saw the forecast results change, but ARIMA is extremely slow so I don't want to use it. But now adding exogenous features to AutoCES does not make any difference in the result. My exogenous feature is holidays. If a week has a holiday in it (e.g., Thanksgiving) it is marked as 1, otherwise 0. My code is like this:
models = [
    AutoCES(model='S', season_length=52)
]
sf = StatsForecast(
    df=df_train,
    models=models,
    freq='W',
    fallback_model=SeasonalNaive(season_length=52)
)
frcst_df = sf.forecast(h=52, level=[95], X_df=df_test)
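For reference, the weekly holiday flag described above could be built roughly like this. It is a minimal pandas sketch, not the original poster's code: the dates, the `holiday` column name, and the assumption that weeks run Monday to Sunday and are labeled by their Sunday are all illustrative.

```python
import pandas as pd

# Hypothetical weekly timestamps, each labeled by the Sunday that ends the week.
weeks = pd.date_range("2023-11-05", periods=6, freq="W-SUN")
# Hypothetical holiday dates (Thanksgiving and Christmas 2023).
holidays = pd.to_datetime(["2023-11-23", "2023-12-25"])

X = pd.DataFrame({"ds": weeks})
# A week is flagged 1 if any holiday falls inside its Monday-Sunday period.
X["holiday"] = X["ds"].dt.to_period("W").isin(holidays.to_period("W")).astype(int)
print(X)
```

If the weekly labels use a different anchor (for example `W-MON`), the period frequency would need to match, otherwise the flag can land one week off.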
j
Hey. CES doesn't support exogenous features
👍 1
t
@Farzad E you can test out the new AutoMFLES model, it is much faster than ARIMA and does support exogenous features. The exogenous piece does need more testing though, so just be aware of that!
👍 1
f
@Tyler Blume Thanks, I will.
@José Morales Thanks. I am going to try MFLES as Tyler suggested. But are there other methods that are as accurate as CES, faster than ARIMA, and also support exogenous features?
j
The only models that support exogenous features are ARIMA and MFLES, plus MSTL if you use either of those as the trend forecaster
👍 1
f
@Tyler Blume I am not seeing any differences in the forecasts as I change my exogenous feature (a binary vector representing weeks with holidays). Just want to make sure I am using automfles correctly. This is how I am setting it up:
models = [
    AutoMFLES(
        season_length=[52],
        test_size=52,
        metric='smape',
    )
]
sf = StatsForecast(
    df=df_train,
    models=models,
    freq='W'
)
frcst_df = sf.forecast(h=52, X_df=df_test)
Is there something else in mfles settings that is really critical for prediction with exogenous features?
t
@José Morales. I can also troubleshoot later today!
g
@Tyler Blume @Farzad E @José Morales Exactly the same for me. I am getting an error when I add exogenous features with AutoMFLES
t
@Farzad E yeah, it looks like doing it with the StatsForecast class does not apply exogenous features correctly; however, using the actual AutoMFLES class DOES work. Give something like this a shot for testing.
import pandas as pd
import numpy as np
from statsforecast.models import AutoMFLES
import matplotlib.pyplot as plt

df = pd.read_csv('https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv')
y = df['Passengers'].values  # make array
# one binary exogenous feature with a single event at index 20
X = np.zeros((len(y), 1))
X[20] = 1
y[20] = y[20] + 500  # inject a spike where the event occurs
# future exogenous values: event at step 8 of the 12-step horizon
X_future = np.zeros((12, 1))
X_future[8] = 1
mfles_model = AutoMFLES(
    season_length=[12],
    test_size=12,
    metric='smape',
)
mfles_model.fit(y=y, X=X)
predicted = mfles_model.predict(12, X=X_future)['mean']
fitted = mfles_model.predict_in_sample()['fitted']
@José Morales This does seem like a bug!
@Guillaume GALIE can you give us any code or data to reproduce? This error is actually connected with the changepoint LASSO we utilize for trends!
g
I found my mistake: one of my exog features was not fully assigned, I had one empty cell (NaN). Once I filled the NaN with 0, AutoMFLES no longer raises an error and gives me an impressive result. Congrats @Tyler Blume! My code:
model_sf = AutoMFLES(season_length=[52], test_size=26, n_windows=2, metric='mae', config=config_mfles, alias='AutoMFLES')
sffcst = StatsForecast(models=[model_sf], freq='W-MON', n_jobs=-1, verbose=True)
df_forecast_sf = sffcst.forecast(df=df_histo, h=130, fitted=True, X_df=X_df_calendar).reset_index()
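A quick way to catch this kind of gap before fitting is to count NaNs per exogenous column. This is a minimal sketch on a made-up frame; the `holiday` column and the choice to fill with 0 are assumptions, and filling with 0 only makes sense when a missing value really means "no event".

```python
import numpy as np
import pandas as pd

# Hypothetical exogenous frame; "holiday" has one unassigned cell.
X_df = pd.DataFrame({
    "unique_id": ["a"] * 4,
    "ds": pd.date_range("2024-01-01", periods=4, freq="W-MON"),
    "holiday": [0.0, np.nan, 1.0, 0.0],
})

# Every column other than the id and timestamp is treated as exogenous.
exog_cols = [c for c in X_df.columns if c not in ("unique_id", "ds")]

# Report the columns that contain NaNs and how many.
nan_counts = X_df[exog_cols].isna().sum()
print(nan_counts[nan_counts > 0])

# Assumption: a missing flag means "no holiday", so 0 is a safe fill value.
X_df[exog_cols] = X_df[exog_cols].fillna(0)
```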
t
@Guillaume GALIE great, glad it is no longer erroring! It does look like that is the same pattern that is currently bugged for exogenous. I don't think it is actually using the exogenous piece correctly. Do the forecasts look different with and without the exogenous features?
g
For my example, the exog feature doesn't bring much more information, so the forecast is almost the same but the numbers are slightly different. I can try to force some peaks on specific dates to check whether the exog features are correctly taken into account
j
Seems to be working fine:
import os
os.environ['NIXTLA_ID_AS_COL'] = '1'

from mlforecast.utils import generate_series, generate_prices_for_series
from statsforecast import StatsForecast
from statsforecast.models import AutoMFLES

series = generate_series(5, equal_ends=True)
prices = generate_prices_for_series(series)
series_wp = series.merge(prices, on=['unique_id', 'ds'])
X_df = prices[prices['ds'].gt(series['ds'].max())]

sf = StatsForecast(
    models=[AutoMFLES(test_size=7, season_length=7)],
    freq='D'
)
print('Forecast no exog: ', sf.forecast(df=series, h=7)['AutoMFLES'].mean())
print('Forecast exog: ', sf.forecast(df=series_wp, h=7, X_df=X_df)['AutoMFLES'].mean())
sf.fit(df=series)
print('Fit predict no exog: ', sf.predict(h=7)['AutoMFLES'].mean())
sf.fit(df=series_wp)
print('Fit predict exog: ', sf.predict(h=7, X_df=X_df)['AutoMFLES'].mean())
g
I confirm it also works for me. The peaks linked to my exog feature are correctly reproduced in the future
t
Ok awesome, I should never troubleshoot before coffee!!
🤣 1
f
@Guillaume GALIE I didn't get any errors and I didn't have any NaNs. But the model still predicted the peak of demand on the wrong week (the week before Thanksgiving instead of the week of). Even after I moved the holiday flag to different weeks, the prediction did not change at all; it showed no sensitivity to which week was marked as a holiday. @Tyler Blume Thanks a lot! I'll give that a try.
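One way to quantify "no sensitivity" is to forecast twice with the event placed on different future steps and compare the two outputs. The sketch below is model-agnostic: `exog_sensitivity` and `toy_forecast` are hypothetical names, and `toy_forecast` is only a stand-in so the snippet runs on its own. With AutoMFLES, `forecast_fn` could wrap the `predict(h, X=X_future)['mean']` call from the example earlier in the thread.

```python
import numpy as np

def exog_sensitivity(forecast_fn, X_future_a, X_future_b):
    """Max absolute difference between forecasts under two exogenous scenarios."""
    pred_a = np.asarray(forecast_fn(X_future_a), dtype=float)
    pred_b = np.asarray(forecast_fn(X_future_b), dtype=float)
    return float(np.max(np.abs(pred_a - pred_b)))

# Stand-in model for illustration only: a flat level plus a holiday bump.
def toy_forecast(X_future):
    return 100.0 + 50.0 * X_future[:, 0]

h = 12
X_a = np.zeros((h, 1))
X_a[3] = 1  # holiday on future step 3
X_b = np.zeros((h, 1))
X_b[8] = 1  # same holiday moved to future step 8

# A result of 0.0 would mean the feature has no effect on the forecast.
print(exog_sensitivity(toy_forecast, X_a, X_b))
```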
t
@Farzad E you could also try passing season_length=None and see how that looks; it will just be a linear trend, smoother, and your exogenous features. That should make it a little clearer whether it is using the exogenous features in the prediction, since it would only be a line with changes according to your features.
❤️ 1
g
@Farzad E I had the same issue as you (peak one week early), but it was actually a mistake in my X_df dataframe, which had wrong dates (I had one duplicated line). To find my issue I ran the same example with a simple xgboost, and it told me that X_df was not correct. Once I corrected my X_df, the peaks were attached to the correct Thanksgiving weeks. Could you please check that your X_df is correct and is not missing some dates? By the way, you can run this useful function to compare against the expected dates: https://nixtlaverse.nixtla.io/mlforecast/forecast.html#mlforecast-make-future-dataframe
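The duplicated or missing dates described above can also be checked with plain pandas. This is a rough sketch for a single series, not the linked mlforecast helper: `check_future_dates` is a hypothetical name, and it assumes the last training timestamp already lies on the frequency grid.

```python
import pandas as pd

def check_future_dates(X_df, last_train_ds, h, freq):
    """Return (missing, duplicated) timestamps for one series' future frame."""
    # Expected horizon: the h timestamps that follow the last training date.
    # Assumes last_train_ds is itself aligned with freq.
    expected = pd.date_range(start=last_train_ds, periods=h + 1, freq=freq)[1:]
    actual = pd.DatetimeIndex(X_df["ds"])
    missing = expected.difference(actual)
    duplicated = actual[actual.duplicated()]
    return missing, duplicated

# Hypothetical future frame with one duplicated week and one missing week.
X_df = pd.DataFrame({
    "ds": pd.to_datetime(["2024-01-08", "2024-01-15", "2024-01-15", "2024-01-29"]),
    "holiday": [0, 1, 1, 0],
})

missing, duplicated = check_future_dates(
    X_df, last_train_ds=pd.Timestamp("2024-01-01"), h=4, freq="W-MON"
)
print("missing:", list(missing))
print("duplicated:", list(duplicated))
```

With several series, the same check could be run per `unique_id` group.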