#general

Akash Verma

08/18/2023, 7:42 AM
Hi all, I am new to the group, so I may ask some very basic questions; please bear with me. I am seeking help with the following: 1. How can I use an already trained model from the statsforecast library (I am currently using HoltWinters) to forecast a new series? 2. I am currently using Python's joblib package to save my trained model. @Mariana Menchero any suggestions would be highly appreciated.

Mariana Menchero

08/19/2023, 9:36 PM
Hi @Akash Verma, no question is silly and we're happy to help you. 1. Currently you need to train the statsforecast models on the data that you have. You can use any of the models, as you've been doing with Holt-Winters. 2. It depends on the size of your data, but the parquet format will probably be a good option. Just use
df.to_parquet(your_path): https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_parquet.html
If you have any more questions, please let me know.

Leonie

08/21/2023, 6:41 PM
Regarding 1: @Mariana Menchero thanks for your answer! So right now it is not possible to use statsforecast in production, i.e., train on historical data and then use the fitted model to predict the next day, each day at midnight? Is that possible with mlforecast?

Mariana Menchero

08/22/2023, 6:12 PM
Hi @Leonie you can use statsforecast in production. I'm sorry if my previous answer was confusing, but you can indeed train on historical data and then use the fitted model to predict the next h steps ahead. What you can't do is predict for unseen data. mlforecast works the same as statsforecast.

Leonie

08/22/2023, 11:24 PM
That sounds great @Mariana Menchero. So if I have a model fitted with
models = [AutoARIMA(season_length=24)]
sf = StatsForecast(models=models, freq='H')
sf.fit(df=df_train)
Now I want to test the production case: I have a test dataframe, and at 2 pm every day the next day should be forecast from the test data, but without refitting the model on the test data. Something like this:
sf.predict(df = df_test, h=horizon)
How does that work?

Akash Verma

08/23/2023, 5:45 AM
Hi @Mariana Menchero, thanks for your reply. My question was a little different: I have already trained the model using one of the algorithms (HoltWinters in this case) and saved it using Python's joblib library. Now I want to load this model and start making predictions on new time-series data. I am currently struggling with this part, so any help would be highly appreciated.
@Max (Nixtla) I would appreciate your input as well 🙂

Mariana Menchero

08/24/2023, 6:23 AM
I think the main issue is that you can't use a model on previously unseen data. If you fit a model with the airpassengers dataset (this is just an example), then the predict function only works on that dataset. I think I might have misunderstood your previous question, so please let me know if this is what you're asking.

Akash Verma

08/24/2023, 6:35 AM
@Mariana Menchero so you mean that I should not save my model, and that I should instead train the model every time and then make the predictions for the new series? Please correct my understanding here.

Mariana Menchero

08/24/2023, 6:41 AM
Yes, saving is only needed if you can't do everything on the same day. But the key idea is that you can only fit and predict on the same dataset. If for some reason your dataset changes (for example, more observations are added), then you need to retrain the model.
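To illustrate why a saved model goes stale once more observations are added, here is a toy sketch using only the standard library. It is not statsforecast code: `fit_ses` and `predict` are hypothetical helpers implementing simple exponential smoothing as a stand-in for Holt-Winters, and the numbers are invented. `pickle` stands in for joblib.

```python
import pickle

def fit_ses(y, alpha=0.5):
    """Fit a toy simple exponential smoothing model (hypothetical stand-in)."""
    level = y[0]
    for value in y[1:]:
        level = alpha * value + (1 - alpha) * level
    # The fitted state summarizes exactly the series it was trained on.
    return {"alpha": alpha, "level": level, "n_obs": len(y)}

def predict(model, h):
    """Forecast h steps ahead; SES forecasts repeat the last fitted level."""
    return [model["level"]] * h

history = [10.0, 12.0, 11.0, 13.0]
model = fit_ses(history)

# Saving and reloading reproduces the same forecasts for the same data.
restored = pickle.loads(pickle.dumps(model))
assert predict(restored, 2) == predict(model, 2)

# Two new observations arrive. The restored model knows nothing about them.
history_new = history + [20.0, 22.0]
stale = predict(restored, 2)

# Retraining on the extended series picks up the new information.
refit = fit_ses(history_new)
fresh = predict(refit, 2)

print("stale forecast:", stale)
print("fresh forecast:", fresh)
```

The reloaded model still forecasts from the end of the original four observations, while the refit model forecasts from the end of the extended series.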