# statsforecast
n
Hello team, I am testing several models: AutoARIMA and now also the scikit-learn models via SklearnModel in StatsForecast. I am running into the same problem I had before: when I make the forecast and visualize it, the worst model is AutoARIMA, but when I run cross validation, AutoARIMA turns out to be the best model.
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, Ridge
from statsforecast.models import AutoARIMA, SeasonalNaive, SklearnModel
from statsforecast.utils import ConformalIntervals

# `sf` is a StatsForecast object built from these models; `season_length`,
# `train` and `test` are defined earlier in the script
models = [
    AutoARIMA(season_length=season_length),
    SeasonalNaive(season_length=season_length),
    SklearnModel(Lasso()),
    SklearnModel(Ridge()),
    SklearnModel(RandomForestRegressor()),
]

# Forecast
preds = sf.forecast(
    df=train,
    h=120,
    X_df=test,  # exogenous variables for the forecast horizon
    prediction_intervals=ConformalIntervals(n_windows=5, h=120),
    level=[95],
)

# Cross validation
cv_df = sf.cross_validation(df=train, h=120, n_windows=5)
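For reference, here is a minimal sketch (not an official StatsForecast recipe) of how the cross-validation output above could be ranked per model. It assumes the result is stored in `cv_df` and has the usual `unique_id`, `ds`, `cutoff` and `y` columns plus one column per model.

```python
# cv_df is the DataFrame returned by sf.cross_validation above;
# every column that is not an id, timestamp, cutoff or target is a model column
model_cols = [c for c in cv_df.columns if c not in ("unique_id", "ds", "cutoff", "y")]

# mean absolute error per model, averaged over all series and all windows
mae_per_model = {m: (cv_df[m] - cv_df["y"]).abs().mean() for m in model_cols}

# lowest error first
for model, mae in sorted(mae_per_model.items(), key=lambda kv: kv[1]):
    print(f"{model}: {mae:.3f}")
```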
j
Hey. Does the same also happen if you remove the exogenous features?
n
The same problem still occurs: when using cross validation, model selection picks the worst model as the best. This happens for both MLForecast and StatsForecast, and for univariate as well as multivariate models.
Why does this happen? If I compare XGBoost against LinearRegression and train the linear model without adding any parameters, the model that was garbage on its own becomes almost as good as the XGBoost model.
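For context, a minimal sketch of the kind of MLForecast comparison described here; the frequency and lag features below are placeholders, not values from the original setup.

```python
from mlforecast import MLForecast
from sklearn.linear_model import LinearRegression
from xgboost import XGBRegressor

# placeholder configuration: freq and lags are assumptions, not the original settings
mlf = MLForecast(
    models=[XGBRegressor(), LinearRegression()],
    freq="D",         # assumed daily data
    lags=[1, 7, 14],  # assumed lag features
)

# same style of backtest as the StatsForecast call above
cv_mlf = mlf.cross_validation(df=train, h=120, n_windows=5)
```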
j
How are you choosing the best model in cross validation?
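One common way to do this, sketched below under the assumption that `cv_df` is the cross-validation output from earlier: score every model per series and per metric with utilsforecast, then count how often each model comes out best.

```python
from utilsforecast.evaluation import evaluate
from utilsforecast.losses import mae, rmse

# drop the cutoff column so the evaluation aggregates over all windows
# (reset_index first if unique_id is the index in your version)
evaluation = evaluate(cv_df.drop(columns="cutoff"), metrics=[mae, rmse])

# pick the model with the lowest error for each series and metric
model_cols = [c for c in evaluation.columns if c not in ("unique_id", "metric")]
evaluation["best_model"] = evaluation[model_cols].idxmin(axis=1)
print(evaluation.groupby(["metric", "best_model"]).size())
```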