# general
b
Hi, is anybody aware of models that are built to handle exogenous features that should have different weights depending on where in the forecast horizon we are forecasting? For example, take a simple demand forecasting model where I have an autoregressive component and one regressor for orders already booked. If I train an ARIMAX model with a horizon of 6 months, the amount already booked for h=1 might carry a lot of signal, while the amount already booked for h=6 might carry no/low signal, so just passing a booked_orders regressor does not work well. Are there any modeling frameworks that account for this? The best middle ground I've been able to hack together on my own involves recursively retraining models with a one-step horizon and updating the regressor along the way (booked_1_month_ahead, booked_2_month_ahead, etc.), but this feels pretty inefficient. Interested in how others may have dealt with similar problems in the past, and open to stats, ML, or deep learning solutions
o
You could train a neuralforecast model like NHITS, TSMixerx, BiTCN - basically anything that can handle exogenous features (see here). I don't think you'd need to do anything special to handle your situation. So just prepare your data, and make a forecast for horizon=6, for example. Try it out using the Quickstart
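A minimal sketch of that suggestion, assuming a long-format DataFrame with the standard NeuralForecast columns (unique_id, ds, y) plus the booked_orders regressor from the question; the file name, input_size, and max_steps are illustrative assumptions, not values from the thread:

```python
# Sketch: NHITS with a historical exogenous feature (assumed monthly data).
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

df = pd.read_csv("demand.csv", parse_dates=["ds"])  # hypothetical file

model = NHITS(
    h=6,                                 # 6-month forecast horizon
    input_size=24,                       # 24 months of history per window (assumed)
    hist_exog_list=["booked_orders"],    # exogenous observed only up to the cutoff
    max_steps=500,
)
nf = NeuralForecast(models=[model], freq="MS")
nf.fit(df=df)
forecast = nf.predict()
```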
b
So if the data looks like this (red is the forecast horizon), would I pass all 6 exogenous features? My concern is that the models might fit strongly on `booked1`, but then not know what to do when h=2 and it is null
o
Ok, thanks for sharing the picture, I better understand what you're trying to achieve now. So the issue is that some of the future values of these exogenous variables are not available at the time of prediction. Some options to create a forecasting model in NF with this data (in order of easy -> hard to implement):
1. Consider these exogenous purely 'historical' at first, i.e. ignore that you have some knowledge of them in the future (effectively ignoring the values in the red). You can do this by setting `hist_exog_list = ['booked1', 'booked2', ...]` in a NF model that supports historical exogenous variables (see the first link I shared above).
2. Consider the future values known by treating them as future exogenous variables, setting `futr_exog_list = ['booked1', 'booked2', ...]` in a NF model that supports future exogenous variables. For the prediction phase, you will need to make predictions for the unknown values of these features, i.e. fill up the blank spots of the red part in your picture. Just start with something simple, e.g. a (seasonal) naive prediction (I'd start with forward-filling the unknown values, for example; see the sketch below).
3. For each exogenous column, randomly set values to Null with probability n / 6, where n is the number of values missing in the prediction horizon. So for `booked1`, you'd set 5/6 of the values in the training data to Null. Next, add an indicator column `booked1_missing` that indicates whether the value in `booked1` is Null or not. Do the same for all exogenous columns, but change the missingness amount based on the probability of the value appearing in the prediction horizon. Now assign a dummy value to the Null values, e.g. -1. Add all the exogenous variables (so both `booked1` and `booked1_missing`) as future exogenous (i.e. `futr_exog_list = [ ]`). During prediction, use the values you have available, and don't forget to convert the prediction inputs in the same way (i.e. set Null to -1, and add the missingness column). This approach simulates, during training, the availability of these exogenous variables at prediction time. Note that `booked6` seems always available, so you don't have to do anything for that variable.
Hope this helps
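A rough sketch of option 2, assuming the six columns are named booked1..booked6, the training frame is `df`, and the booked values already known for the horizon months sit in a frame `known_future` with unique_id and ds columns (all names and hyperparameters are assumptions):

```python
# Sketch: treat the booked columns as future exogenous and forward-fill
# the values that are not yet observed in the horizon.
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import TSMixerx

booked_cols = [f"booked{i}" for i in range(1, 7)]

model = TSMixerx(
    h=6,
    input_size=24,                 # assumed window length
    n_series=1,                    # single demand series
    futr_exog_list=booked_cols,    # all booked columns as future exogenous
    max_steps=500,
)
nf = NeuralForecast(models=[model], freq="MS")
nf.fit(df=df)

# Future frame for the 6 horizon months: attach the booked values that are
# already known (the red area) and forward-fill the gaps as a naive baseline.
futr_ds = pd.date_range(df["ds"].max(), periods=7, freq="MS")[1:]
futr_df = pd.DataFrame({"unique_id": df["unique_id"].iloc[0], "ds": futr_ds})
futr_df = futr_df.merge(known_future, on=["unique_id", "ds"], how="left")
futr_df[booked_cols] = futr_df[booked_cols].ffill()

forecast = nf.predict(futr_df=futr_df)
```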
b
All interesting approaches, but I was hoping for something where I could use all of these datapoints for both training & predicting. The best approach I have so far is:
1. Train a model (ARIMAX) with `booked1` as the regressor, fit for h=1
2. Add the forecast from step 1 to the training data as y
3. Train a model (ARIMAX) with `booked2` as the regressor, fit for h=1
...and continue until I get through the whole horizon. This approach lets me utilize every booked value and its relationship to the eventual y value. It just feels very inefficient.
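For concreteness, a rough sketch of that recursive scheme, using statsmodels' SARIMAX as the ARIMAX implementation. The frames `train` (observed months, columns y and booked1..booked6) and `future` (the six horizon months with their booked values), as well as the (1, 0, 0) order, are assumptions for illustration:

```python
# Sketch: recursive one-step ARIMAX, swapping in the booked{h} regressor at each step
# and feeding the previous forecast back into the history as pseudo-y.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

history = train["y"].to_numpy()
forecasts = []
for h in range(1, 7):
    col = f"booked{h}"                                      # regressor for this step
    exog_hist = np.concatenate([train[col].to_numpy(),
                                future[col].to_numpy()[: h - 1]])
    res = SARIMAX(history, exog=exog_hist, order=(1, 0, 0)).fit(disp=False)
    next_exog = future[col].to_numpy()[h - 1 : h]           # booked value for the target month
    step = res.forecast(steps=1, exog=next_exog)
    forecasts.append(float(step[0]))
    history = np.append(history, step[0])                   # append the forecast as pseudo-y
```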
o
Option 2 lets you use all the datapoints, likely gives better results and is a lot easier than what you are suggesting. But all the options I provided will very likely result in a better forecast than what you are proposing (whilst being substantially easier to implement).