I don't understand the new_data argument in predict. If it is given, does that mean the model is trained again on that new_data and then predicts the values for the horizon? For example, I want a one-hour prediction on training data that has a 5-minute frequency, so I am using a multi-step forecast. My training data ends at, for example, 31-12-2022 10:00. When I call predict with new_data containing the values for 01-01-2023 00:00 and 01-01-2023 00:05, the predictions start at 01-01-2023 00:10 and run one hour further, to 01-01-2023 01:10. I am only interested in the value at 01-01-2023 01:10, so the earlier values would be discarded. But I also want to know the value at 01-01-2023 01:15, so I now add values for 01-01-2023 00:00, 01-01-2023 00:05 and 01-01-2023 00:10 to new_data and get predictions up to 01-01-2023 01:15. Does increasing the size of new_data influence the quality of the predictions?
Can someone explain it to me please?
And am I allowed to add training samples to new_data, or would the results be overfitted?
10/18/2023, 4:30 PM
Hey. Sorry for the late reply. The contents of new_data are treated as if they were the inputs of your original training set: the features you defined are computed from them, and the already trained model is used to produce the forecasts that follow them. So no retraining is performed; it's meant to be used in a transfer learning scenario.
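To make the "no retraining" point concrete, here is a minimal sketch in plain NumPy. It is not the library's implementation: the lag features, the least-squares model and the helper names (`make_features`, `fit`, `predict`) are all assumptions chosen for illustration. It fits a model once on one series, then forecasts 12 steps (one hour at 5-minute frequency) continuing from a different series that plays the role of new_data.

```python
import numpy as np

LAGS = (1, 2)  # hypothetical feature set: the two previous observations


def make_features(series, lags=LAGS):
    """Build a lag-feature matrix X and a target vector y from a 1-D series."""
    max_lag = max(lags)
    X = np.column_stack(
        [series[max_lag - lag: len(series) - lag] for lag in lags]
    )
    y = series[max_lag:]
    return X, y


def fit(series, lags=LAGS):
    """Fit a linear autoregressive model once (least squares with intercept)."""
    X, y = make_features(series, lags)
    X = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict(coef, history, horizon, lags=LAGS):
    """Recursive multi-step forecast continuing from the end of `history`.

    The coefficients are only read here; nothing is refitted.
    """
    window = list(history[-max(lags):])
    out = []
    for _ in range(horizon):
        feats = [1.0] + [window[-lag] for lag in lags]
        nxt = float(np.dot(coef, feats))
        out.append(nxt)
        window.append(nxt)  # feed the prediction back in as the newest lag
    return np.array(out)


step = 2 * np.pi / 24
train = np.sin(step * np.arange(200))  # series the model is fitted on
coef = fit(train)
coef_before = coef.copy()

# A different series with the same dynamics plays the role of new_data.
other = 3.0 * np.sin(step * np.arange(40) + 1.0)
new_data, truth = other[:28], other[28:]  # hold out the next 12 steps

forecast = predict(coef, new_data, horizon=12)
print("forecast matches held-out values:", np.allclose(forecast, truth, atol=1e-6))
print("coefficients unchanged:", np.array_equal(coef, coef_before))
```

In this sketch only the last `max(LAGS)` values of new_data feed the forecast, so passing more rows only matters to the extent that the features need them. With longer lags or rolling-window features, new_data has to contain enough history to compute those features; beyond that, extra rows would not change the result here. Whether the same holds for your setup depends on the features you configured.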