This message was deleted Nixtla Community #mlforecast


# mlforecast

Slackbot

01/19/2024, 9:04 PM

This message was deleted.

Chad Parmet

01/19/2024, 9:06 PM

Problem setup

Say we are predicting sales per store per month.
• We want to predict sales 1 and 2 months into the future.
• We’re using a one-model-per-step approach. We’ll train two models: one will predict sales for February, the other will predict sales for March.

With just “regular” static features (and dynamic features that are known for future `ds`), we could do…

fcst = MLForecast(models=models, freq=1, lags=[1, 2])

fcst.fit(X, id_col='unique_id', time_col='ds', target_col='y', static_features=static_features, max_horizon=2, dropna=False)

Adding a lagged exogenous feature

Now let’s say we want to add a feature that counts inbound `inquiries` received by sales in a month. We won’t know this variable for future months and couldn’t calculate it without its own forecast, unlike (I think) the examples of the price catalog or Fourier terms. But we will know it for all historical months (and imagine we’re predicting sales of a product that can take several months to close). So the lagged `inquiries` in prior months could be informative of future sales. Let’s add it.

Prediction

I think I know how to add `inquiries` to `X_df`: use transform_exog() to generate `inquiries_lag1` and `inquiries_lag2`. Model1 and model2 will each get “sent” the right rows of `X_df` to make their predictions.

Training?

I’m unsure how to set up training without leakage using built-ins. Take the training row that has the features and target for one store in December 2023. I believe this row will be used twice in training:
• model1 will be trained to fit December’s target from one month prior (~November anchor date). It can “see” the real values of `inquiries_lag1` and `inquiries_lag2` (data from November and October).
• model2 will be trained to fit December’s target from two months prior (~October anchor date). Now it shouldn’t be able to see the real value of `inquiries_lag1` on the December row in training data. That data refers to inquiries in November, which is after the October anchor date.

How do I set up the model training so that when model2 is fitting the December data, it can’t “see” the December row’s value of `inquiries_lag1`? Or am I misunderstanding a piece? Thanks again!

José Morales

01/19/2024, 11:14 PM

Hey. Thanks for using mlforecast. When we build the target we take each row and add the future targets, not the past ones. So I believe the statement is:
• model1 uses December's row to predict December
• model2 uses December's row to predict January of the next year

You can see what the training set looks like by running preprocess:

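from mlforecast import MLForecast
from mlforecast.feature_engineering import transform_exog
from mlforecast.utils import generate_series, generate_prices_for_series

# two synthetic monthly series that end on the same date
series = generate_series(2, freq='M', equal_ends=True)
# synthetic prices to act as the exogenous feature
prices = generate_prices_for_series(series)
# add lag-1 and lag-2 versions of the exogenous columns
prices_lags = transform_exog(prices, lags=[1, 2])
series_wp = series.merge(prices_lags, on=['unique_id', 'ds'])
# no models are needed just to inspect the training frame
fcst = MLForecast(models=[], freq='M')
series_wp.head()
# with max_horizon=2 the result has one target per forecast step
fcst.preprocess(series_wp, max_horizon=2).head()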

In this example the second model uses April's row to predict May. Please let us know if this helps.
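To make the row-to-target mapping above concrete, here is a minimal plain-Python sketch (it is not mlforecast's implementation). The store data, the `build_training_rows` helper and the `y0`/`y1` column names are made up for illustration; the only thing taken from the explanation above is that each row is paired with its own target and the targets that follow it.

```python
# Hypothetical monthly data for one store; all values are made up.
months = ["2023-09", "2023-10", "2023-11", "2023-12", "2024-01"]
sales = [10, 12, 11, 15, 14]
inquiries = [3, 4, 6, 5, 7]


def build_training_rows(months, sales, inquiries, max_horizon=2):
    """Pair each row's lagged features with the targets at that row and later."""
    rows = []
    # start at index 2 because lag1 and lag2 need two earlier months
    for t in range(2, len(months)):
        row = {
            "ds": months[t],
            "inquiries_lag1": inquiries[t - 1],
            "inquiries_lag2": inquiries[t - 2],
        }
        for h in range(max_horizon):
            # y0 is the target for the row's own month, y1 for the month after;
            # None marks a target that falls beyond the end of the data
            row[f"y{h}"] = sales[t + h] if t + h < len(sales) else None
        rows.append(row)
    return rows


rows = build_training_rows(months, sales, inquiries)
december = next(r for r in rows if r["ds"] == "2023-12")
print(december)
```

In this toy data the December 2023 row carries November and October inquiries as features, with December sales as the first model's target and January 2024 sales as the second model's target.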

Chad Parmet

01/22/2024, 5:31 PM

That's awesome! Thanks a bunch @José Morales, this is really helpful! And thanks for correcting my description of how dates map to targets, that was key.

I ran this and it makes sense (with a slight edit for my example to get prices on monthly vs daily frequency). I can see I don't need to mask any of the lagged features during training of a one-model-per-step forecast, because each model's target col gets shifted to line up the model's target with the row of features it should be able to see. And that of course masks all "future" lag features.

That's fantastic. Thank you again for talking me through it! And a great reminder I should use `preprocess()`.
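The conclusion above can be checked with a small plain-Python sketch. It assumes months are represented as integer indices and uses hypothetical helper names (`feature_months`, `target_month`, `has_leakage`); none of these come from mlforecast. "Leakage" here means a lagged feature drawn from the target month or later.

```python
def feature_months(row_month, lags):
    """Months (as integer indices) that the lagged features on a row draw from."""
    return [row_month - lag for lag in lags]


def target_month(row_month, step):
    """Month predicted by the model for a given step; step 0 is the row's own month."""
    return row_month + step


def has_leakage(row_month, lags, step):
    """True if any lagged feature comes from the target month or later."""
    return max(feature_months(row_month, lags)) >= target_month(row_month, step)


DECEMBER = 12  # index of the December 2023 row; 13 would be January 2024
lags = [1, 2]

# step 0 is the first model (December target), step 1 the second (January target)
checks = {step: has_leakage(DECEMBER, lags, step) for step in (0, 1)}
print(checks)
```

Because targets are only ever shifted forward from the row and lags only look backward, neither model sees a feature from its target month or later in this sketch.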

José Morales

01/22/2024, 7:09 PM

Glad to be of help. Let us know if you have any more questions or run into any issues
