# mlforecast
n
Hi team, I have encountered this type of situation several times in projects I've carried out. I make the predictions and then represent them visually, as shown in the graph. To the result of the predictions I apply a couple of metrics, for example MSE. So far so good. But when I then apply cross validation and evaluate the cross validation result with the same metrics, the best model is not the one I get when I apply the metrics to the result of the predictions. Why does this discrepancy occur? What can I do to make a decision in these cases? What am I doing wrong? I would expect the same result in the model selection, even if the numbers were different!!! I am using the definition from the image to evaluate the cross validation
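A minimal sketch of that kind of CV evaluation, assuming the standard `cross_validation` output layout (`unique_id`, `ds`, `cutoff`, `y`, plus one column per model); `evaluate_cv` is a hypothetical helper, not an mlforecast function:

```python
import pandas as pd

def evaluate_cv(cv_df: pd.DataFrame, models: list[str]) -> pd.Series:
    # hypothetical helper: MSE per model, averaged over all CV windows and series
    return pd.Series({m: ((cv_df[m] - cv_df['y']) ** 2).mean() for m in models})

# evaluate_cv(cv_result_ml, ['LinearRegression']).idxmin() then picks the best model
```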
j
You're getting an error of zero with linear regression? I'm pretty sure there's something wrong in your evaluation
n
Hello Jose!! Yes, the error is zero, but if you look at the forecasting graph it makes sense that it is zero
j
Which features are you using? Are you not accidentally passing the target as a feature?
n
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences, LocalRobustScaler

mlf = MLForecast(
    models=models,
    freq='MS',  # monthly (month-start) frequency
    lags=range(1, 14),  # lags 1 through 13
    # lag_transforms={1: [expanding_mean], 7: [(rolling_mean, 7)]},
    target_transforms=[Differences([2]), LocalRobustScaler(scale='iqr')],  # also tried: Differences([1]), LocalStandardScaler()
    date_features=["year", "month", "day"],
    num_threads=2,
)
j
I meant exogenous features, the ones you're passing through `X_df`
n
When you use `X_df = test` for the predict, it should not contain the target `y` variable, only the exogenous variables, the `ds`, and the `unique_id`. You know what's strange about this? It's happened to me on several occasions, and I've decided not to use cross validation on those occasions.
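A minimal sketch of that, assuming the target column is named `y`:

```python
# X_df for predict() should carry only unique_id, ds and the exogenous columns
future_exog = test.drop(columns=['y'])  # drop the target before predicting
preds = mlf.predict(h=24, X_df=future_exog)
```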
j
The `X_df` is only used for exogenous features; if you don't have any, you don't have to provide it.
Which settings are you using for cross_validation (window_size, h, step_size)?
n
I am using exogenous variables. I am using the same parameters that I use in the fit:

# fit the models
mlf.fit(train, fitted=True, static_features=[],
        prediction_intervals=PredictionIntervals(n_windows=3, h=24, method="conformal_distribution"))

# cross validation
cv_result_ml = mlf.cross_validation(
    train,
    n_windows=3,  # number of models to train / splits to perform
    h=24,
)
j
Do you have exogenous features? If you do, you have to set `static_features=[]` in the cross_validation call as well
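A sketch of the corrected call, mirroring the `fit()` arguments:

```python
cv_result_ml = mlf.cross_validation(
    train,
    n_windows=3,
    h=24,
    static_features=[],  # as in fit(); otherwise exogenous columns are treated as static
)
```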
n
Ah okay. So every time exogenous features are used, the parameter `static_features=[]` must be passed. Now the result of the forecast makes sense. Thanks Jose!!
@José Morales, a question: is this also valid for statsforecast and neuralforecast when I have exogenous variables and use cross_validation?
j
statsforecast doesn't support static features, so they're all dynamic and you don't have to set anything. For neuralforecast you set them in the model constructors, so if you're setting them as `futr_exog_list` it's ok.
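A minimal neuralforecast sketch of that, with a hypothetical future exogenous column `'price'`:

```python
from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATSx

nf = NeuralForecast(
    models=[NBEATSx(h=24, input_size=48, futr_exog_list=['price'])],  # 'price' is a made-up column
    freq='MS',
)
cv_df = nf.cross_validation(df, n_windows=3)  # exogenous handling comes from the model constructor
```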
n
thank you so much!!! 💪