# general
m
Hello team, first of all thanks for all the amazing (and publicly available) work. Quick question about statsforecast (and potentially the other libraries) that I can’t quite figure out from the documentation: once I have trained my models, what is the best way to make predictions on the test set for 3-day horizons, as opposed to forecasting the whole test set starting from the last training date?
• Would one have to apply the .forecast method iteratively, one day at a time, setting df (or y) to the data up to that day and h to 3?
• Or should one reuse the cross_validation method, setting df and input_size/test_size/n_windows smartly, to do this more efficiently?
👀 1
m
If I understand the question correctly, the second option is more efficient.
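A minimal sketch of the cross_validation approach being recommended here, assuming daily data in the usual unique_id/ds/y long format and an AutoARIMA model (the file name, frequency, and model choice are placeholders; exact argument behaviour may vary between statsforecast versions):
```python
import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

# Long-format input with columns unique_id, ds, y (file name is illustrative).
df = pd.read_csv('series.csv', parse_dates=['ds'])

sf = StatsForecast(models=[AutoARIMA(season_length=7)], freq='D', n_jobs=-1)

# Rolling evaluation: 100 cutoffs spaced one day apart, each forecasting 3 days ahead.
cv_df = sf.cross_validation(
    df=df,
    h=3,            # 3-day horizon per window
    n_windows=100,  # number of cutoffs (one model refit per cutoff)
    step_size=1,    # advance the cutoff one day at a time
)
# cv_df has one row per unique_id/cutoff/ds, with y and one column per model.
```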
m
I'll give it a try and come back if something seems off. Thanks for the quick reply!
Hello again. So I have a bit of an issue with using both the .forecast and .cross_validation methods in testing. In both cases, they seem to refit the model to the new data, which during testing might not be desirable or efficient. Is there a way to force them to use the already trained parameters instead of retraining? Am I missing something in my approach?
m
Is this what you are doing?
And do you want to do this?
(Credit to joaquinAmatRodrigo for the second image)
Currently, the cross-validation class does the first thing. If I understand correctly, you want to train a model on a subset of the data and then forecast the future without retraining on newly available data, perhaps to save computing resources, and because the model’s accuracy would not degrade much in the meantime. Right?

If that’s the case, that functionality is not currently available, but we are working on it. However, in our experience it makes sense to retrain “every time” you have new relevant data. Since statsforecast is highly efficient, even thousands of series take a couple of minutes.

If you are using some of the Auto models and want to improve the training speed further, you can restrict the number of candidate models. For example,
ETS(season_length=1, model='ANN')
will explore only a simple exponential smoothing model.
Please also check this issue and like it if it is relevant to your use case: https://github.com/Nixtla/statsforecast/issues/287.
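A short, hedged sketch of that tip, assuming the statsforecast 1.x ETS model ('ANN' means additive error, no trend, no seasonality, i.e. simple exponential smoothing; the freq and n_jobs values are illustrative):
```python
from statsforecast import StatsForecast
from statsforecast.models import ETS

models = [
    # 'ANN' pins the model form, so no search over ETS variants is performed
    # (the default 'ZZZ' would auto-select each component instead).
    ETS(season_length=1, model='ANN'),
]

sf = StatsForecast(models=models, freq='D', n_jobs=-1)
# Subsequent forecast/cross_validation calls fit only this one specification per window.
```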
m
That's exactly it. So cv is doing what the first image shows, but with overlapping windows (unless one specifies step_size=horizon, in which case we get exactly the first image), right? With the overlapping windows, then, if I had a test set of 100 points, it would effectively train 100 new models. This test would give me an estimate of what my model will be like in production if I retrain it every single day. Is that accurate?
I'm glad to hear you are working on the backtesting without refit. It would allow one to also measure how often the models need to be retrained or if the trained model is relatively stable through time. I'll be keeping an eye out for that functionality.
Thank you also for the tip on making the training more efficient. Is there anything else I should keep in mind for speeding things up if I start doing a large number of refits? Currently I only use n_jobs=-1. Should I think of installing any external libraries, and do I need to set up GPU usage with neuralforecast models, or does all of that work out of the box already?
m
Yes!
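A tiny sketch of the arithmetic being confirmed here (approx_refits is a hypothetical helper, and the exact off-by-one count depends on how the library defines windows):
```python
def approx_refits(test_size: int, h: int, step_size: int) -> int:
    """Approximate number of rolling windows (= model refits) in a backtest."""
    return (test_size - h) // step_size + 1

# 100-point test set, 3-day horizon
print(approx_refits(test_size=100, h=3, step_size=1))  # 98 -> roughly one refit per test day
print(approx_refits(test_size=100, h=3, step_size=3))  # 33 -> non-overlapping windows (step_size = horizon)
```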
m
Great. Thank you so much for the help!
🙌 1