# neural-forecast
j
hi team, is there a way to have a bigger validation subset for the validation loss to be calculated on? Let's say my horizon is 28 days; is there a possibility to evaluate the validation loss at multiple points in history, each over the full horizon length of 28? So one validation-loss subset would be calculated on the 28 days up to the present day, another on the 28 days up to yesterday, and so on, until you have multiple validation-loss subsets?
s
@Ask Inkeep Please help to answer the original question by @Jelte Bottema
a
Thanks for tagging me @Saúl Caballero. I'll look through relevant sources and get back to you shortly.
Based on the documentation, I need to clarify which specific *NeuralForecast* library component you're referring to. The available sources show several AutoModel classes that handle hyperparameter optimization with validation subsets (1) (2) (3). Could you clarify which specific model or approach you're using? Are you working with:
1. One of the AutoModel classes (like `AutoRNN`, `AutoLSTM`, `AutoNBEATS`, etc.)
2. Regular NeuralForecast models with cross-validation (4)
3. A custom training setup

The AutoModel classes use Ray Tune or Optuna for hyperparameter optimization and have parameters like `num_samples` for optimization steps (1) (2) (3). For cross-validation approaches, NeuralForecast has a `cross_validation` method that can create multiple validation windows (4). Once I understand your specific setup, I can provide more targeted guidance on expanding your validation approach for your 28-day horizon scenario.
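For example, here's a minimal sketch of the `cross_validation` approach (the model choice, hyperparameters, and synthetic data below are placeholders, not details from your setup):

```python
import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# Synthetic daily series standing in for real data (columns: unique_id, ds, y).
df = pd.DataFrame({
    'unique_id': 'series_1',
    'ds': pd.date_range('2020-01-01', periods=500, freq='D'),
    'y': np.sin(np.arange(500) / 7.0) + np.random.rand(500),
})

nf = NeuralForecast(models=[NHITS(h=28, input_size=56, max_steps=50)], freq='D')

# Each window moves the forecast cutoff back in time, so the 28-day
# horizon is evaluated at several points in history.
cv_df = nf.cross_validation(df=df, n_windows=3, step_size=1)
print(cv_df.groupby('cutoff').size())  # one block of 28 forecasts per cutoff
```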
o
@Jelte Bottema I don't think the bot gave a meaningful answer; I'm not sure what you mean. Can you give a small code example? I don't think NF can do what you want, but that's from my (very limited) understanding of your question.
j
Hi, thanks for the reply. I am using an AutoNHITS model. Basically, I think my question boils down to whether the val_size parameter works like I want it to. In the documentation I find this short description: "val_size: int, validation size for temporal cross-validation." Does this mean that if I increase val_size, I increase the data the validation loss is calculated on? How I understand it: you have a loss function for training the model, which is then evaluated against a validation subset of the data (generally the last horizon before the present day), and this gives your validation loss metric. Can I expand the validation data in a certain way? Does this make more sense?
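My setup looks roughly like this (synthetic data as a stand-in, tuning settings simplified):

```python
import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.auto import AutoNHITS
from neuralforecast.losses.pytorch import MAE

# Synthetic daily series as a stand-in for my real data.
df = pd.DataFrame({
    'unique_id': 'series_1',
    'ds': pd.date_range('2020-01-01', periods=500, freq='D'),
    'y': np.sin(np.arange(500) / 7.0) + np.random.rand(500),
})

horizon = 28
# valid_loss is the metric reported on the held-out validation set during tuning.
model = AutoNHITS(h=horizon, loss=MAE(), valid_loss=MAE(), num_samples=2)

nf = NeuralForecast(models=[model], freq='D')
# val_size reserves the last val_size points of each series for validation.
nf.fit(df=df, val_size=horizon)
```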
o
Yes, increasing val_size increases the size of the validation set, at the expense of fewer training data points.
j
Then my question is: how is the validation set expanded? Take the example where my forecast horizon is 24 days and the val_size is 48. Is the val_size split up into two horizons of 24 each and the loss calculated on those, or how does it work?
o
val_size should be at least the size of the horizon. Beyond that, it constructs the maximum number of possible windows: for a val_size of 48 and a horizon of 24, you'll have 25 possible evaluation windows.
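The count is simple rolling-window arithmetic with a step size of 1:

```python
horizon, val_size = 24, 48

# A horizon-length window can start at offsets 0 .. val_size - horizon
# inside the validation set, so with a step size of 1:
n_windows = val_size - horizon + 1
print(n_windows)  # 25
```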
j
Great! That was indeed my question. Is the valid_loss that is output then the average over the 25 possible evaluations, or the sum?
o
That depends on the valid_loss function you use. If it's MAE, it's the M(ean) Absolute Error; mean indicates average.
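As a toy illustration (not the library's internal code), the absolute errors from all windows are pooled into a single mean:

```python
import numpy as np

horizon, n_windows = 24, 25  # the 25 windows from val_size=48 above

# MAE pools the absolute errors from every validation window and takes
# one overall mean: an average over all windows, not a sum.
abs_errors = [np.abs(np.random.randn(horizon)) for _ in range(n_windows)]
valid_loss = np.mean(np.concatenate(abs_errors))
print(valid_loss)
```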