# statsforecast
I'm experiencing an odd thing with statsforecast that I haven't been able to figure out yet. I can use a horizon of 1 to 11 and things look great. When I try to go above H=11 the forecasts become uniform (all the same value) for all of the unique_ids in the dataframe and all of the algos I'm trying. It happens with and without exogenous variables. They look like the following. Any ideas of what might cause this?
Hey. Which model are you using?
Hey @José Morales, I'm currently using: AutoARIMA, AutoETS, AutoCES, AutoTheta, DynamicOptimizedTheta, Holt, HoltWinters
Do all the models yield the constant prediction?
I'm 99% sure they do. My script right now is selecting the best model based on MAE. Among the "best models" they seem to all have constant predictions.
I had just restarted a run with h=11 before you responded so I'm waiting on it to finish so I can go back and look.
@José Morales confirmed by doing h=6, h=11, and h=12. First two I get non-constant predictions. But, for h=12 I get constant predictions for all of the unique_ids. This is a random sample of 5% of my overall unique ids and results for the best model based on MAE, but I wouldn't expect those to change the results. Any ideas?
How are you computing the MAE? Is it possible that for h=12 the best model is the constant one?
It'd be great if you could save the individual forecasts before selecting the best one, so that we at least know which model the constant predictions are coming from.
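A minimal sketch of what "save the individual forecasts before selecting" could look like: keep the per-model columns from the forecast/cross-validation output and compute MAE per model before picking a winner. The column names and values here are made up for illustration; they are not from the thread's actual data.

```python
import pandas as pd

# Hypothetical cross-validation output: one column per model,
# 'y' holds the actuals (statsforecast's cross_validation output
# has this wide, one-column-per-model shape).
cv = pd.DataFrame({
    'unique_id': ['a', 'a', 'a', 'a'],
    'y':         [10.0, 12.0, 11.0, 13.0],
    'AutoETS':   [10.5, 11.5, 11.0, 12.0],
    'Holt':      [ 9.0, 14.0, 10.0, 15.0],
})

# MAE per model, computed before any "best model" selection,
# so each model's forecasts can still be inspected individually.
model_cols = [c for c in cv.columns if c not in ('unique_id', 'y')]
mae = {m: (cv[m] - cv['y']).abs().mean() for m in model_cols}
best = min(mae, key=mae.get)
```

Inspecting `cv[['unique_id', best]]` then shows whether the winning model's forecasts are constant, or whether the constant line is introduced later in the selection step.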
Hi @Brian Head, this could be expected behavior. Think of it like this: the further ahead you want to predict, the better a 'straight' constant line might satisfy the objective of minimizing a specific error metric, in this case MAE. In other words, sometimes the best some models can do with the data provided is a straight line. In the specific case of ETS, without seasonality restrictions it sometimes oddly opts for a straight-line forecast. We've put together a Google Colab tutorial (https://colab.research.google.com/drive/1mKYY4elPDjm4XRjyKxk9ZnWmYZQyBRLE#scrollTo=HRTvP8UdS8Wo) showing how we can guide it better by restricting seasonality. It should make a noticeable difference.
intervals = ConformalIntervals(h=toggle, n_windows=5)
SF_models = [
# AutoARIMA(prediction_intervals=intervals, alias="ARIMA"), #arima works for in sample
AutoETS(prediction_intervals=intervals, season_length=12, alias="NonSeasonalAutoETS"), #ets works for in sample
AutoETS(prediction_intervals=intervals, season_length=12, model='ZZA', alias="SeasonalAutoETS"), #ets works for in sample
# AutoCES(prediction_intervals=intervals), #CES works for in sample
# AutoTheta(prediction_intervals=intervals, alias="Theta"), #Theta1 works for in sample;  decomposition_type='additive',
# DynamicOptimizedTheta(prediction_intervals=intervals, season_length=13),
# Holt(prediction_intervals=intervals, season_length=13),
# HoltWinters(prediction_intervals=intervals, season_length=13)
]
So the training set is the same and the only thing that changes is the forecast horizon?
Is it possible for you to provide a sample series that reproduces the issue?
I wish I could, but not allowed to share any data for security reasons.
Yeah I mean if you're able to reproduce the issue with some simulated data
I'm trying this but it works fine:
from statsforecast import StatsForecast
from statsforecast.models import AutoETS
from statsforecast.utils import ConformalIntervals, generate_series

series = generate_series(1, freq='M')
intervals = ConformalIntervals(h=12, n_windows=5)
sf = StatsForecast(
    models=[AutoETS(prediction_intervals=intervals, season_length=12, alias="NonSeasonalAutoETS")],
    freq='M',
)
sf.forecast(df=series, h=12)
Let me see what I can do to create a synthetic/simulated version.
Also, which method are you using? forecast, fit+predict, etc?