# neural-forecast
**Leonie:**
Quick question regarding validation: I don't want to use cross-validation, but just train on 8 months and validate on one month. Is this possible? Can I pass my validation dataframe to fit? Something like this:
```python
nf.fit(df_train, df_val)
```
**Cristian (Nixtla):**
Hi @Leonie. Yes, it is possible to train on 8 months and validate on one month; you can control that with the `val_size` parameter of the `fit` method.
Currently it does not support passing a separate validation dataframe, as it assumes the validation set is a temporal split of the training dataframe (this is the most common approach). What is your use case? Do you want to validate on a different set of time series than the training dataset?
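A minimal sketch of that temporal split, assuming hourly data in the `unique_id`/`ds`/`y` column convention; the commented `nf.fit` call assumes an already-constructed NeuralForecast instance:

```python
import pandas as pd

# Illustrative hourly series: Jan-Sep 2022 (8 months train + 1 month validation).
dates = pd.date_range("2022-01-01", "2022-09-30 23:00", freq="h")
df = pd.DataFrame({"unique_id": "series_1", "ds": dates, "y": range(len(dates))})

# val_size = number of rows in the last month (September here), so the
# validation set is a temporal split off the end of the training dataframe.
val_size = (df["ds"] >= "2022-09-01").sum()
print(val_size)  # 30 days * 24 hours = 720

# nf.fit(df, val_size=val_size)  # hypothetical NeuralForecast instance `nf`
```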
**Leonie:**
Thanks for your answer! I want to test how my model would perform in production, i.e. I want to train my model on data up to a certain point in time and then apply it to unseen data, predicting the next day: each day at 2pm the day before, I predict the next 33 hours.
- Jan - Oct: training on this data
- Nov: tune hyperparameters on this data
- Dec: test the tuned model on this data

Is this possible? Thanks so much for your help!
**Cristian (Nixtla):**
Yes! This is precisely the use case of the predefined methods of the `core` class. In particular, use the `cross_validation` method with the desired `val_size` and `test_size`. Models will be trained on the train set, using the validation set for early stopping and model selection. You can use one of our `auto` models, which will automatically perform hyperparameter selection on this validation set. The `cross_validation` method will return the forecasts for the test set so you can evaluate/plot performance afterwards.
**Leonie:**
Do you have any tutorials that describe that?
**Cristian (Nixtla):**
You will need to be careful with how you define the test set. You have a horizon of 33 but are producing forecasts every 24 hours, so `step_size` should be 24. Something like this should work:

```python
cv_df = nf.cross_validation(
    df=train_data,
    val_size=24 * 31,  # october
    n_windows=31,      # number of forecasts of size 33 (31 days for december)
    step_size=24,      # you are producing forecasts every day, so the step size between forecasts is 24
)
```
You need to be careful to end your data at the last hour of December, so that forecasts are produced exactly at 2pm the day before. You can check the `cutoff` column in the dataframe returned by `cross_validation` to verify that forecasts are produced at 2pm.
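As a hedged illustration of that check (the `cutoff` values below are mocked, not real `cross_validation` output): if each 33-hour forecast should start at 2pm, the cutoff, i.e. the last observed timestamp before the forecast, would land at 1pm every day.

```python
import pandas as pd

# Mocked version of the `cutoff` column one might get back from
# cross_validation: one cutoff per day at 13:00, so each 33-hour
# forecast window starts at 2pm.
cutoffs = pd.Series(pd.date_range("2022-11-30 13:00", periods=31, freq="24h"))

all_at_1pm = bool((cutoffs.dt.hour == 13).all())
print(all_at_1pm)  # True
```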
Alternatively, you can do this process manually by calling the `fit` method once, and then defining a for loop of `predict` calls, adding more data each time to produce the next forecast.
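That manual loop can be sketched as follows; the naive last-value "model" is only a stand-in for a fitted NeuralForecast object, since the real loop would call `nf.predict(df=...)` with the growing dataframe:

```python
import pandas as pd

horizon = 33  # hours to forecast in each window

# Full December of hourly data, revealed one day at a time.
dates = pd.date_range("2022-12-01", "2022-12-31 23:00", freq="h")
full_df = pd.DataFrame({"unique_id": "series_1", "ds": dates, "y": range(len(dates))})

forecasts = []
for day in range(1, 31):                       # each iteration reveals one more day
    seen = full_df.iloc[: day * 24]            # data observed so far
    last_value = seen["y"].iloc[-1]            # naive stand-in for nf.predict(df=seen)
    future = pd.date_range(seen["ds"].iloc[-1], periods=horizon + 1, freq="h")[1:]
    forecasts.append(pd.DataFrame({"ds": future, "y_hat": last_value}))

fcst_df = pd.concat(forecasts, ignore_index=True)
print(len(fcst_df))  # 30 windows * 33 hours = 990
```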
**Leonie:**
Ah @Cristian (Nixtla), I'm starting to understand, this is amazing! Thank you very much! The only thing missing right now: can I limit the number of epochs for AutoLSTM? It is fitting the 700th epoch, but I don't need that many.
**Cristian (Nixtla):**
Use the `max_steps` hyperparameter to limit training.
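A hedged sketch of what that could look like: Auto models accept a `config` search space, and fixing `max_steps` there caps training per trial. The other keys and values below are purely illustrative, not recommended settings.

```python
# Hypothetical hyperparameter configuration for AutoLSTM; fixing max_steps
# caps the number of training steps instead of letting training run
# for hundreds of epochs.
config = {
    "max_steps": 500,      # hard cap on training steps
    "learning_rate": 1e-3, # illustrative: other hyperparameters unchanged
}

# AutoLSTM(h=33, config=config)  # hypothetical usage with neuralforecast
print(config["max_steps"])  # 500
```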