# mlforecast
**w:** Hi Team, I have been learning a lot from using the mlforecast library. It's been a great resource so far! I've been experimenting with the `cross_validation` function and I had a question about it. I understand that in time series analysis, when we create a validation set, it should only include information that would be available at the time of prediction. This means that lagged features for the validation set should be computed only from data up to the last point in the training set for each window. I was wondering how the `cross_validation` function in mlforecast handles this: does it ensure that lagged features for the validation set are computed only from data up to the last point in the training set for each window? I hope my question makes sense. Any guidance would be really helpful. Thank you so much!
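To make the question concrete, here is a tiny illustration of the behaviour I'm hoping for, with plain pandas and made-up numbers (this is not mlforecast code, just a sketch of what I mean by "no leakage"):

```python
# Toy illustration (plain pandas, made-up numbers), not mlforecast code.
import pandas as pd

y = pd.Series([10, 11, 12, 13, 14, 15])   # six observations of the target
train, valid = y.iloc[:4], y.iloc[4:]      # hold out the last two points

# When predicting the first validation step, the lag-1 feature should be the
# last *training* value (13). For the second step it should be the model's own
# prediction for the first step -- never the actual held-out values 14 or 15.
lag_1_for_first_valid_step = train.iloc[-1]
print(lag_1_for_first_valid_step)  # 13
```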
**j:** Hey. The `cross_validation` method is basically a for loop over the train and validation splits: the training split is used to fit the model, the predictions are made independently of the validation data, and the validation set is only used to attach the actual target values after the predictions have been computed, so they can serve as a reference. The model never sees the validation set. You can see the training code here and the validation set is added here.
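Here is a rough sketch of that loop for a single series, just to illustrate the idea. This is not the actual implementation: `manual_cross_validation`, `LinearRegression`, `lags=[1, 7]` and daily frequency are all placeholders, and it assumes a recent mlforecast version where `fit` takes a dataframe and `predict` takes the horizon.

```python
# Rough sketch of the loop described above, for a single series with columns
# unique_id, ds (daily dates) and y. NOT the actual implementation -- the model,
# lags and helper name are placeholders.
import pandas as pd
from sklearn.linear_model import LinearRegression
from mlforecast import MLForecast

def manual_cross_validation(df: pd.DataFrame, n_windows: int, h: int) -> pd.DataFrame:
    results = []
    n = len(df)
    for i in range(n_windows, 0, -1):
        cutoff = n - i * h
        train = df.iloc[:cutoff]              # only data available at prediction time
        valid = df.iloc[cutoff:cutoff + h]    # the held-out future values
        fcst = MLForecast(models=[LinearRegression()], freq="D", lags=[1, 7])
        fcst.fit(train)          # lag features are built from `train` only
        preds = fcst.predict(h)  # forecasts the h timestamps after the cutoff
        # the actual target is merged in afterwards, purely as a reference
        results.append(preds.merge(valid[["unique_id", "ds", "y"]], on=["unique_id", "ds"]))
    return pd.concat(results, ignore_index=True)
```

In practice you get essentially the same kind of output by calling the method directly, e.g. `fcst.cross_validation(df, n_windows=2, h=7)` in recent versions, which returns the per-window predictions already merged with the actual target.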
**w:** Thanks for your detailed explanation. I can only understand part of the code in `forecast.py` 🥲, so I wasn't sure about this from reading the code on my own. Your clarification helps me understand it better and is very helpful! Much appreciated.