# mlforecast
a
Hey team, I have a question about the `.update` method and the `new_df` parameter of `.predict`: what is the difference between the two? I have an mlforecast model that I save, and later I want to load it and make predictions using the newest data I have available.
j
Hey.
• The `new_df` argument of `predict` can take different series than the ones used for training. It applies all the processing and predicts, but doesn't save anything. The goal is transfer learning.
• The `update` method needs to take the same series and modifies the original object, so if you then call `predict` it'll use those updated values. The goal is predicting beyond the original horizon without retraining the models (just updating the target values).
a
Makes sense, sounds like `.update` is the right way for me. If I have external features such as a price feature, should these also be passed to `.update`?
j
No, we only need to save the target and the dates. The future exogenous features should still be provided through `X_df`.
a
@José Morales I have implemented this but I'm having some issues. I have a time series with only target, unique_id and ds columns. The only other features are generated through the "lag" and "date_features" parameters in `MLForecast().fit()`. I have trained a model on data up to 2024-06-20. Today I want to use it, so I need to give it the latest data (so that the correct lags are used). Should I then provide a dataframe with target, unique_id and ds columns to `.update` before I forecast as usual? And if I have trained the model on "A", "B" and "C" and only forecast for "A", should I update the model with data for all three or just that one?
j
> Should I then provide a dataframe with target, unique_id and ds column to the `.update` function before I forecast as normally?

Yes, but only for the dates that the model hasn't seen, since we currently don't validate this (I'll open an issue for it). You can get the last dates the model saw with `MLForecast.ts.last_dates`, which holds the last date by id. When you call `update`, it overrides those values with the last dates of the dataframe you provide and appends the series values to the stored ones.
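Since `update` does not validate dates, one way to stay safe is to drop already-seen rows before calling it. The sketch below uses only pandas; `keep_unseen_rows` and the sample data are hypothetical. In practice `last_seen` might be built from the fitted object, for example as `pd.Series(fcst.ts.last_dates, index=fcst.ts.uids)`, which assumes the two attributes are aligned by position.

```python
import pandas as pd


def keep_unseen_rows(new_data, last_seen, id_col="unique_id", time_col="ds"):
    """Return the rows of new_data dated strictly after the last date seen for their id."""
    cutoff = new_data[id_col].map(last_seen)
    # Ids missing from last_seen map to NaT; comparisons against NaT are False,
    # so rows for unknown ids are dropped as well.
    return new_data[new_data[time_col] > cutoff].reset_index(drop=True)


# Hypothetical last dates per id, as they would look after training until 2024-06-20.
last_seen = pd.Series(pd.to_datetime(["2024-06-20", "2024-06-20"]), index=["A", "B"])

new_data = pd.DataFrame(
    {
        "unique_id": ["A", "A", "A", "B", "B"],
        "ds": pd.to_datetime(
            ["2024-06-19", "2024-06-20", "2024-06-21", "2024-06-20", "2024-06-22"]
        ),
        "y": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
)

unseen = keep_unseen_rows(new_data, last_seen)  # keeps only the 06-21 and 06-22 rows
```

The filtered frame is what would then be passed to `update`.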
> And if I have trained model on "A", "B" and "C" and only forecast for "A" - should I then update model with data for all three or just the one?

That depends on whether you have target transformations. If you don't, you can update just "A" and use `predict(..., ids=["A"])` to forecast only that one. If you do have target transformations, you'll need to update all series (but you can still use the `ids` argument of `predict` to predict just for "A").