Slackbot
01/19/2024, 9:04 PM
Chad Parmet
01/19/2024, 9:06 PM
ds
), we could do…
fcst = MLForecast(models=models, freq=1, lags=[1, 2])
fcst.fit(X, id_col='unique_id', time_col='ds', target_col='y', static_features=static_features, max_horizon=2, dropna=False)
Adding a lagged exogenous feature
Now let’s say we want to add a feature that counts inbound inquiries received by sales in a month.
We won’t know this variable for future months and couldn’t calculate it without its own forecast, unlike (I think) the price catalog or Fourier examples. But we will know it for all historical months (and imagine we’re predicting sales of a product that can take several months to close), so the lagged inquiries from prior months could be informative of future sales. Let’s add it.
Prediction
I think I know how to add inquiries to `X_df`: use `transform_exog()` to generate `inquiries_lag1` and `inquiries_lag2`. Model1 and model2 will each get “sent” the right rows of `X_df` to make their predictions.
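To make the two lag columns concrete, here is a minimal sketch that builds them by hand with pandas. It only mimics the result the message expects from `transform_exog()`; it does not call mlforecast. The store names and inquiry counts are made up, and `add_lags` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical monthly inquiry counts for two stores (made-up numbers).
inquiries = pd.DataFrame({
    "unique_id": ["store_a"] * 4 + ["store_b"] * 4,
    "ds": list(pd.date_range("2023-09-01", periods=4, freq="MS")) * 2,
    "inquiries": [10, 12, 15, 11, 20, 18, 25, 22],
})


def add_lags(df, col, lags):
    """Add `<col>_lag<k>` columns, shifting within each series (hypothetical helper)."""
    out = df.sort_values(["unique_id", "ds"]).copy()
    for k in lags:
        # shift(k) moves the value from k months earlier onto the current row
        out[f"{col}_lag{k}"] = out.groupby("unique_id")[col].shift(k)
    return out


lagged = add_lags(inquiries, "inquiries", lags=[1, 2])
print(lagged[lagged["unique_id"] == "store_a"])
```

On the December row, `inquiries_lag1` holds November's count and `inquiries_lag2` holds October's; the earliest rows of each series have missing lags.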
Training?
I’m unsure how to set up training without leakage using built-ins.
Take the training row that has the features and target for one store in December 2023. I believe this row will be used twice in training:
• model1 will be trained to fit December’s target from one month prior (~November anchor date). It can “see” the real values of `inquiries_lag1` and `inquiries_lag2` (data from November and October, respectively).
• model2 will be trained to fit December’s target from two months prior (~October anchor date). Now it shouldn’t be able to see the real value of `inquiries_lag1` on the December row in training data. That value refers to inquiries in November, which is after the October anchor date.
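The concern in the two bullets can be written as date arithmetic. This is only a sketch of the reasoning, assuming (as the bullets do) that both models read their features from the December row; `months_before` and `leaks` are hypothetical helpers, not mlforecast functions.

```python
from datetime import date


def months_before(d, n):
    """Return the first day of the month `n` months before `d`."""
    idx = d.year * 12 + (d.month - 1) - n
    return date(idx // 12, idx % 12 + 1, 1)


target_month = date(2023, 12, 1)


def leaks(horizon, lag):
    """True if a lag-`lag` feature on the target row postdates the anchor month."""
    anchor = months_before(target_month, horizon)      # last month observed when forecasting
    feature_month = months_before(target_month, lag)   # month the lagged value comes from
    return feature_month > anchor


for horizon in (1, 2):
    for lag in (1, 2):
        print(f"horizon={horizon} lag={lag} leaks={leaks(horizon, lag)}")
```

Under that assumption, only the combination of horizon 2 with lag 1 is flagged: November's inquiries fall after the October anchor.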
How do I set up the model training so that when model2 is fitting the December data, it can’t “see” the December row’s value of `inquiries_lag1`?
Or am I misunderstanding a piece?
Thanks again!
José Morales
01/19/2024, 11:14 PM
from mlforecast import MLForecast
from mlforecast.feature_engineering import transform_exog
from mlforecast.utils import generate_series, generate_prices_for_series
# Two synthetic monthly series plus a synthetic price for each
series = generate_series(2, freq='M', equal_ends=True)
prices = generate_prices_for_series(series)
# Lag the exogenous price by 1 and 2 periods
prices_lags = transform_exog(prices, lags=[1, 2])
series_wp = series.merge(prices_lags, on=['unique_id', 'ds'])
# No models are needed just to inspect the training frame
fcst = MLForecast(models=[], freq='M')
series_wp.head()
# Inspect how features and targets are aligned when max_horizon=2
fcst.preprocess(series_wp, max_horizon=2).head()
In this example, the second model uses April's row to predict May.
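A rough sketch of that alignment, built by hand with pandas rather than taken from the output of `preprocess()`: each row keeps its own features, and the target for the second horizon is pulled back from the following month. The column names `target_h1` and `target_h2` and all numbers are hypothetical.

```python
import pandas as pd

# Hypothetical single series with an already lagged exogenous column.
df = pd.DataFrame({
    "unique_id": ["store_a"] * 5,
    "ds": pd.date_range("2023-01-01", periods=5, freq="MS"),
    "y": [100, 110, 120, 130, 140],
    "price_lag1": [None, 9.0, 9.5, 10.0, 10.5],
})

max_horizon = 2
aligned = df.copy()
for h in range(max_horizon):
    # target_h{h+1} on a row is y taken h months after that row's own month
    aligned[f"target_h{h + 1}"] = aligned.groupby("unique_id")["y"].shift(-h)

april = aligned[aligned["ds"] == pd.Timestamp("2023-04-01")].iloc[0]
print(april[["ds", "price_lag1", "target_h1", "target_h2"]])
```

If the training frame is aligned this way, the April row pairs March's price (`price_lag1`) with April's target for the first model and May's target for the second, so neither model is given a feature value from after the month preceding its first forecast step.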
Please let us know if this helps.
Chad Parmet
01/22/2024, 5:31 PM
preprocess()
more
José Morales
01/22/2024, 7:09 PM