Akmal Soliev
11/20/2023, 4:42 PM
... unique_id does not play any role in the following model and can be simply set as "shirts", with overlapping/repeating dates (ds) from previous shirt sales?
NOTE: There are exogenous variables available (for the sake of an example), such as weather, humidity and number of people in the city.
Example application using AirPassenger dataset:
import pandas as pd
import pytorch_lightning as pl
import matplotlib.pyplot as plt
from neuralforecast import NeuralForecast
from neuralforecast.models import TFT  # needed: TFT is instantiated in the modeling step
from neuralforecast.losses.pytorch import MQLoss, DistributionLoss, GMM, PMM
from neuralforecast.tsdataset import TimeSeriesDataset
from neuralforecast.utils import AirPassengers, AirPassengersPanel, AirPassengersStatic
#AirPassengersPanel['y'] = AirPassengersPanel['y'] + 10
Y_train_df = AirPassengersPanel[AirPassengersPanel.ds<AirPassengersPanel['ds'].values[-12]] # 132 train
Y_test_df = AirPassengersPanel[AirPassengersPanel.ds>=AirPassengersPanel['ds'].values[-12]].reset_index(drop=True)
Modification:
Y_train_df["unique_id"] = "Airline1"
Y_test_df = Y_test_df[lambda x: x["unique_id"] == "Airline1"]
Modeling:
nf = NeuralForecast(
    models=[TFT(h=12, input_size=48,
                hidden_size=20,
                #loss=DistributionLoss(distribution='Poisson', level=[80, 90]),
                #loss=DistributionLoss(distribution='Normal', level=[80, 90]),
                loss=DistributionLoss(distribution='StudentT', level=[80, 90]),
                learning_rate=0.005,
                stat_exog_list=['airline1'],
                #futr_exog_list=['y_[lag12]'],
                hist_exog_list=['trend'],
                max_steps=500,
                val_check_steps=10,
                early_stop_patience_steps=10,
                scaler_type='robust',
                windows_batch_size=None,
                enable_progress_bar=True),
    ],
    freq='M'
)
nf.fit(df=Y_train_df, static_df=AirPassengersStatic, val_size=12)
Y_hat_df = nf.predict(futr_df=Y_test_df)
Cristian (Nixtla)
11/21/2023, 4:25 PM
Yes, it should be possible to use the sales of a different unique_id to predict a new article (in fact, any model would work). However, the change should only be done during the predict step. My recommendations are:
1. Fit your model on the historical values of your existing shirts, as you would normally do. There is no need to change any unique_id. Do not group the different products under the same unique_id. Each unique_id should only have one value for each ds.
2. When using the predict method, pass a new df using the appropriate values for y from a different article. For instance, set the dates for the current year but take the values for y of the existing similar shirts from the previous year. (Of course, this approach makes many assumptions in order to work properly.)
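A minimal sketch of step 2, using pandas only. The helper name build_new_article_df, the shirt names and the toy numbers are all hypothetical; the idea is simply to copy last year's rows of a similar shirt, move the dates forward one year, and relabel them with the new unique_id. The commented-out predict call assumes nf is the already fitted NeuralForecast object from the example above.

```python
import pandas as pd

def build_new_article_df(history, similar_id, new_id, shift_years=1):
    """Copy the rows of `similar_id`, shift `ds` forward, and relabel them as `new_id`."""
    proxy = history.loc[history["unique_id"] == similar_id].copy()
    if proxy.empty:
        raise ValueError(f"no rows found for unique_id={similar_id!r}")
    # Move last year's dates into the current year; y values stay untouched.
    proxy["ds"] = proxy["ds"] + pd.DateOffset(years=shift_years)
    proxy["unique_id"] = new_id
    return proxy.sort_values("ds").reset_index(drop=True)

# Toy history with made-up sales figures (illustration only).
history = pd.DataFrame({
    "unique_id": ["blue_shirt"] * 3 + ["red_shirt"] * 3,
    "ds": list(pd.to_datetime(["2022-01-31", "2022-02-28", "2022-03-31"])) * 2,
    "y": [10.0, 12.0, 15.0, 7.0, 9.0, 11.0],
})

new_df = build_new_article_df(history, "blue_shirt", "yellow_shirt")

# Assumption: `nf` was fitted on the unmodified history, one unique_id per shirt.
# Y_hat_df = nf.predict(df=new_df)
```

The training data is left as-is; only the frame handed to predict carries the new unique_id.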
Let me know if this is clear.
Akmal Soliev
11/21/2023, 4:31 PM
> Yes, it should be possible to use the sales of a different unique_id to predict a new article
In this case, if I have passed blue shirt, red shirt, white shirt as train unique id, I'd still be able to predict yellow shirt?
I think a minimal example would help a ton!
Thanks Cristian, I'm struggling to wrap my head around this idea.
Cristian (Nixtla)
11/21/2023, 4:39 PM
... df in the predict method. The unique_id of the df does not have to be contained in the train data.
By training on blue shirt, red shirt, white shirt, etc. the model has learned to forecast multiple types of shirts. You can then construct your df for the predict method with a new unique_id, for example yellow_shirt, and take the y values of a "very similar" shirt. Or maybe produce multiple forecasts for several "very similar" shirts and then take the average/mean.
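The averaging idea could be sketched as follows, again with pandas only. The function average_forecasts is hypothetical, the forecast values are made up, and the column name "TFT" is an assumption about how the point forecast column is labelled in the prediction output.

```python
import pandas as pd

def average_forecasts(forecasts, new_id, value_col="TFT"):
    """Average per-date forecasts of several similar series into one forecast for `new_id`."""
    mean = forecasts.groupby("ds", as_index=False)[value_col].mean()
    mean.insert(0, "unique_id", new_id)
    return mean

# Made-up forecasts for two "very similar" shirts (illustration only).
forecasts = pd.DataFrame({
    "unique_id": ["blue_shirt", "blue_shirt", "red_shirt", "red_shirt"],
    "ds": pd.to_datetime(["2024-01-31", "2024-02-29"] * 2),
    "TFT": [10.0, 20.0, 30.0, 40.0],
})

yellow = average_forecasts(forecasts, "yellow_shirt")
```

A weighted mean (for example, weighting shirts by how similar they are to the new one) would be a natural variation on the same pattern.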
... df dataset in the predict method. It's the same principle.
Akmal Soliev
11/21/2023, 5:14 PM
> 1. When using the predict method, pass a new df using the appropriate values for y from a different article. For instance, set the dates for the current year but take the values for y of the existing similar shirts from the previous year. (Of course, this approach is making many assumptions to work properly).
okay i literally just had to change the parameter 🤦, was so confused on why you said df
Does it work with exogenous variables?
I attempted to run this on the AirPassengers example from above with the following variation to Y_test_df:
         unique_id          ds      y  trend  y_[lag12]
0  TestingAirlines  1960-01-31  417.0    132      360.0
1  TestingAirlines  1960-02-29  391.0    133      342.0
2  TestingAirlines  1960-03-31  419.0    134      406.0
3  TestingAirlines  1960-04-30  461.0    135      396.0
4  TestingAirlines  1960-05-31  472.0    136      420.0
I am getting the following error:
Exception: {'airline1'} static exogenous variables not found in input dataset
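One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: the model was trained with stat_exog_list=['airline1'], so the predict call probably also needs a static_df that has a row for the new unique_id. The sketch below borrows the static features of a similar series; the helper make_static_for_new_id is hypothetical and the static values are toy stand-ins, not the real AirPassengersStatic contents.

```python
import pandas as pd

def make_static_for_new_id(static_df, similar_id, new_id):
    """Return a one-row static frame for `new_id`, copied from `similar_id`."""
    row = static_df.loc[static_df["unique_id"] == similar_id].copy()
    if row.empty:
        raise ValueError(f"no static row found for unique_id={similar_id!r}")
    row["unique_id"] = new_id
    return row.reset_index(drop=True)

# Toy stand-in for the static features (values are made up).
static_df = pd.DataFrame({
    "unique_id": ["Airline1", "Airline2"],
    "airline1": [0, 1],
    "airline2": [1, 0],
})

new_static = make_static_for_new_id(static_df, "Airline1", "TestingAirlines")

# Assumption: `nf` is the fitted model and Y_test_df is the frame shown above.
# Y_hat_df = nf.predict(df=Y_test_df, static_df=new_static)
```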
Cristian (Nixtla)
11/21/2023, 6:48 PM
Akmal Soliev
11/21/2023, 8:29 PM
... unique_id
Cristian (Nixtla)
11/21/2023, 10:22 PM
... yellow_shirt will be identical to the blue_shirt.
However, depending on your exogenous features you might have different behaviour (such as weather or number of people for the new year).
Another option, as I mentioned, is to take some form of aggregation of multiple forecasts for different similar products (it has a k-nearest-neighbor flavor).
Akmal Soliev
11/22/2023, 11:50 AM
Cristian (Nixtla)
11/23/2023, 7:15 PM