# mlforecast
n
Hi team, I have encountered this type of situation several times in projects I've carried out. I make the predictions and then represent them visually, as shown in the graph. To the result of the predictions I apply a couple of metrics, for example MSE. So far so good. But when I then apply cross validation and evaluate the cross validation result with the same metrics, the best model is not the one I get when I apply the metrics to the result of the predictions. Why does this discrepancy occur? What can I do to make a decision in these cases? What am I doing wrong? I would expect the same result in the model selection, even if the numbers were different!!! I am using the definition from the image to evaluate the cross validation
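A minimal sketch of that kind of CV evaluation, assuming the standard `cross_validation` output layout (`unique_id`, `ds`, `cutoff`, `y`, plus one column per model); `evaluate_cv` is a hypothetical helper, not an mlforecast function:

```python
import pandas as pd

def evaluate_cv(cv_df: pd.DataFrame, models: list[str]) -> pd.Series:
    # hypothetical helper: MSE per model, averaged over all CV windows and series
    return pd.Series({m: ((cv_df[m] - cv_df['y']) ** 2).mean() for m in models})

# evaluate_cv(cv_result_ml, ['LinearRegression']).idxmin() then picks the best model
```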
j
You're getting an error of zero with linear regression? I'm pretty sure there's something wrong in your evaluation
n
Hello Jose!! Yes, the error is zero, but if you look at the forecasting graph it makes sense that it is zero
j
Which features are you using? Are you not accidentally passing the target as a feature?
n
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences, LocalRobustScaler

mlf = MLForecast(
    models=models,
    freq='MS',  # monthly (month-start) frequency
    lags=range(1, 14),  # lags 1 through 13
    # lag_transforms={1: [expanding_mean], 7: [(rolling_mean, 7)]},
    target_transforms=[Differences([2]), LocalRobustScaler(scale='iqr')],  # also tried: Differences([1]), LocalStandardScaler()
    date_features=["year", "month", "day"],
    num_threads=2,
)
j
I meant exogenous features, the ones you're passing through `X_df`
n
When you use `X_df = test` for the predict, it should not contain the target `y` variable, only the exogenous variables, the `ds`, and the `unique_id`. You know what's strange about this? It's happened to me on several occasions, and I've decided not to use cross validation on those occasions.
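A minimal sketch of that, assuming the target column is named `y`:

```python
# X_df for predict() should carry only unique_id, ds and the exogenous columns
future_exog = test.drop(columns=['y'])  # drop the target before predicting
preds = mlf.predict(h=24, X_df=future_exog)
```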
j
The `X_df` is only used for exogenous features; if you don't have any, you don't have to provide it.
Which settings are you using for cross_validation (window_size, h, step_size)?
n
I am using exogenous variables. I am using the same parameters that I use in the fit:

# fit the models
mlf.fit(train, fitted=True, static_features=[],
        prediction_intervals=PredictionIntervals(n_windows=3, h=24, method="conformal_distribution"))

# cross validation
cv_result_ml = mlf.cross_validation(
    train,
    n_windows=3,  # number of models to train / splits to perform
    h=24,
)
j
Do you have exogenous features? If you do, you have to set `static_features=[]` in the cross_validation call as well
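A sketch of the corrected call, mirroring the `fit()` arguments:

```python
cv_result_ml = mlf.cross_validation(
    train,
    n_windows=3,
    h=24,
    static_features=[],  # as in fit(); otherwise exogenous columns are treated as static
)
```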
n
Ah okay. So every time exogenous features are used, the parameter `static_features=[]` must be passed. Now the result of the forecast makes sense. Thanks Jose!!
@José Morales, a question: is this also valid for statsforecast and neuralforecast when I have exogenous variables and use cross_validation?
j
statsforecast doesn't support static features, so they're all dynamic and you don't have to set anything. For neuralforecast you set them in the model constructors, so if you're setting them as `futr_exog_list` it's ok.
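A minimal neuralforecast sketch of that, with a hypothetical future exogenous column `'price'`:

```python
from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATSx

nf = NeuralForecast(
    models=[NBEATSx(h=24, input_size=48, futr_exog_list=['price'])],  # 'price' is a made-up column
    freq='MS',
)
cv_df = nf.cross_validation(df, n_windows=3)  # exogenous handling comes from the model constructor
```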
n
thank you so much!!! 💪