# neural-forecast
a
Hello everyone,
First of all, thank you for this great package; it's nice to see an active developer community around forecasting issues! I've been struggling for a few months with my forecasting problem and I'd like to know if you have any feedback or experience to share.
I'm in the fairly classic situation of forecasting weekly sales for 25k to 35k items (depending on the time of year), and I need to forecast the next 52 weeks. I'm working at a sufficiently aggregated level that I don't have to worry too much about intermittency, but most of the series are less than 2 years old. As is often the case in retail, most series have a strong annual seasonality while also being highly correlated with the latest observed sales and trends. This is why auto-regressive algorithms (such as ARIMA or DeepAR) are often the best performers on the first time steps, but through error accumulation they can produce "worrying" results over the long term.
So I've been concentrating more on "windows-based" algorithms, as you call them, and this NeuralForecast package seems just the thing for my experiments! I started by benchmarking TFT, NBEATSx and NHITS, but I'm finding it hard to stabilize their performance throughout the year. As the product offer can change drastically from one period to the next, and my series are often too short, it's hard for me to tune by cutting series into train/val/test. I prefer to re-simulate my forecasts by positioning myself at several points in the past, loading the product references and the historical data available at that time, and re-running a fit and predict.
Have you ever had to deal with a case like this? Do you have any tips on feature engineering, algorithmic choices or HPO to share? (For example, for the NHITS `n_freq_downsample`, with annual seasonality in weekly data: `[52, 26, 1]` or something like that?)
Thank you very much 🙂
r
I think a number of the base transformer models in this library support holidays and such as part of their innate function. I've used some of the transformers provided in this library standalone, and I distinctly remember their seasonality features.
k
Hey @Antoine SCHWARTZ -CROIX-,
1. Your challenge with short series is new to us, but your solution of simulating forecasts over time is reasonable.
2. Regarding seasonalities, I suggest using calendar variables for the month, or week promotions. Here is an example: https://nixtla.github.io/neuralforecast/examples/hierarchicalnetworks.html
3. If you can't use the AutoModels, the most important things to tune are the learning rate, the random seed, and the size of the model (layers and hidden units).
r
Hi @Antoine SCHWARTZ -CROIX-! How are you? I'm working on a similar problem (about 60k items for 12 weeks), mostly with NBEATSx, but at a very fine level of granularity (almost SKU level), and my experience is somewhat similar.
1. Unstable predictions: This has happened a lot, and I have included fit/predict loops over several points in time to validate and select a model. There's a `cross_validation` method, but sadly it doesn't work for this at all because it does not retrain the model over several points in time. I resorted to a simple for loop to do that.
2. Short time series: My data has both short-lived products and everlasting ones. I gained accuracy by adding past, discontinued products to the model training, and the key was to share product characteristics as static features (such as brand and price range). I have about 6 years of data with a product cycle of about a year.
3. Covariates: Price points and discounts have helped improve accuracy for some product categories. However, they completely mess up forecasts for others due to varying elasticity. The solutions I've been testing are: (a) nulling out prices for known non-elastic products; (b) adding a boolean static variable indicating the expected elasticity of each product. Results are mixed so far, but with some improvements.
I have not tested NHITS or TFT at all, but they look promising as well!
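The fit/predict loop over several points in time described above can be sketched as follows. This is a minimal illustration, not the code used in the thread: `rolling_origin_eval` and `seasonal_naive_panel` are hypothetical names, the data is synthetic, and the seasonal-naive function only stands in for whatever model would really be refit at each cutoff (for example an NBEATSx or NHITS fit followed by a predict).

```python
from datetime import date, timedelta
from typing import Callable, Dict, List, Tuple

Series = List[Tuple[date, float]]  # (week start, sales), sorted by date
Forecaster = Callable[[Dict[str, List[float]], int], Dict[str, List[float]]]


def seasonal_naive_panel(histories: Dict[str, List[float]], horizon: int,
                         season: int = 52) -> Dict[str, List[float]]:
    """Hypothetical stand-in model: repeat the value from one season earlier,
    falling back to the last observation when the history is too short."""
    forecasts = {}
    for item_id, hist in histories.items():
        preds = []
        for step in range(horizon):
            idx = len(hist) - season + (step % season)
            preds.append(hist[idx] if idx >= 0 else hist[-1])
        forecasts[item_id] = preds
    return forecasts


def rolling_origin_eval(panel: Dict[str, Series], cutoffs: List[date],
                        horizon: int, fit_predict: Forecaster) -> Dict[date, float]:
    """Refit at every cutoff on the data available at that date; return MAE per cutoff."""
    scores = {}
    for cutoff in cutoffs:
        histories, actuals = {}, {}
        for item_id, series in panel.items():
            past = [y for d, y in series if d <= cutoff]
            future = [y for d, y in series if d > cutoff][:horizon]
            if past and future:  # skip items not yet launched or with nothing to score
                histories[item_id], actuals[item_id] = past, future
        forecasts = fit_predict(histories, horizon)  # a fresh fit for each cutoff
        errors = [abs(p - a)
                  for item_id, truth in actuals.items()
                  for p, a in zip(forecasts[item_id], truth)]
        if errors:
            scores[cutoff] = sum(errors) / len(errors)
    return scores


# Synthetic panel: one long-lived item with a 52-week cycle, one recent launch.
start = date(2021, 1, 4)
weeks = [start + timedelta(weeks=i) for i in range(156)]
panel = {
    "old_item": [(d, float(i % 52)) for i, d in enumerate(weeks)],
    "new_item": [(d, 10.0) for d in weeks[120:]],
}
cutoffs = [weeks[103], weeks[129]]
scores = rolling_origin_eval(panel, cutoffs, horizon=26,
                             fit_predict=seasonal_naive_panel)
print(scores)
```

Items that do not exist yet at a given cutoff are simply left out of that round, which mirrors loading only the products and history available at the time.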
c
Thanks for using our library! Thanks @Rafael Correia Da Silva for your answer; these are very useful tips, which we have also followed in several cases.
@Antoine SCHWARTZ -CROIX- do the users of your forecasts really use all 52 weekly forecasts? Would it be possible to aggregate them into monthly forecasts? We have seen that in many cases organizations only need a more aggregated level to make decisions (e.g. production, distribution, etc.). One problem with weekly data is that the seasonality is not very well defined because it shifts slightly every year (for example, many holidays do not fall in the same week). I do not suggest using that `n_freq_downsample` for NHITS.
Regarding tips, the previous answers already have some very good insights. Simulating historical forecasts at different points in time is a very common practice; however, with your short data it will further limit the available length, especially at the earlier cutoffs. A forecast horizon of 1 year already seems long enough to properly evaluate your models, so one cutoff might be enough. Alternatively, use multiple cutoffs but leave enough history for an `input_size` of at least 1 * horizon (52).
For HPO, as Kin mentioned, all models have an `auto` version with a default search space. I recommend using them initially.
Finally, adding exogenous variables can be critical in this case. You can add calendar variables and holiday events as `future` variables, and characteristics of products/categories as `static` variables.
m
@Cristian (Nixtla) With a weekly frequency, since, as you say, a year is not exactly 52 weeks (but between 52 and 53), I've found that it is generally a good idea to add 1 week to the input_size. For example, I have a horizon of `52` timesteps and an input_size of 2 years, which I define as `(2 * 52) + 1`. That `+ 1` often makes a big difference. (The +1 is valid for 1 or 2 years of input_size; with a longer input_size you may need to add more weeks, so check how many weeks there are in `n` years.)
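The arithmetic behind the `+ 1` can be checked quickly. This is a minimal sketch, assuming an average Gregorian year of 365.2425 days; the helper name `weeks_to_cover_years` is hypothetical and not part of any library.

```python
import math

AVG_DAYS_PER_YEAR = 365.2425  # assumption: mean Gregorian year length


def weeks_to_cover_years(n_years: int) -> int:
    """Smallest number of weekly steps spanning at least n_years."""
    return math.ceil(n_years * AVG_DAYS_PER_YEAR / 7)


for n in (1, 2, 3, 6):
    weeks = weeks_to_cover_years(n)
    # years, weeks needed, extra weeks beyond 52 * n
    print(n, weeks, weeks - 52 * n)
```

Under this assumption one extra week is enough up to 5 years of input, and 6 years needs two extra weeks, which is consistent with the advice to recount for longer windows.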
a
Thank you all for your answers. Indeed I didn't specify it, but I do use static features that help categorize the different products, otherwise I imagine it would be very complicated for the models to associate the right seasonality during inference! I've also tried adding dynamic calendar-related features, such as the ISO week number (encoded cyclically), the global temporal index, the local temporal index (i.e. the age of the TS), public holidays, etc., with no noticeable improvement in overall performance.
@Cristian (Nixtla) that's a very good point! Currently, our users would like a single solution that meets both executive needs (4 to 12 weeks ahead) and operational needs (52 weeks ahead). To be honest, until now we've addressed this problem by stacking the forecasts of a DeepAR (reactive and efficient in the short term) and a Holt-Winters (stable and consistent over the long term). But this is not very convenient for a number of use cases, mainly those involving the addition of new covariates.
t
Have you tried boosted trees for this? I find them more stable for short time series if you have good covariates and can do some feature engineering. Long-term forecasts can still be an issue, though combining this with hierarchical methods has helped.
a
I've already tested this for shorter horizons, but not for this use case. I'll think about it for later if I have time, thanks!
m
Regarding weekly forecasts being tricky because the yearly seasonality is not exact (a year has between 52 and 53 weeks): if monthly forecasts are not enough, I suggest a compromise that I've been experimenting with for the past few days: semi-monthly forecasts. In pandas there is the "SMS" frequency (semi-month start), which includes the 1st and 15th day of each month, so we get 2 points per month and 24 points per year.
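A short sketch of the semi-monthly aggregation in pandas. The daily series is made up for illustration, and starting from daily data is an assumption: weekly totals straddle the 1st and 15th, so aggregating them to "SMS" would first need some rule for splitting each week.

```python
import pandas as pd

# Hypothetical daily sales for one item over 2023 (one unit per day).
days = pd.date_range("2023-01-01", "2023-12-31", freq="D")
daily = pd.Series(1.0, index=days)

# "SMS" bins start on the 1st and the 15th of each month.
semi_monthly = daily.resample("SMS").sum()

print(len(semi_monthly))     # number of points in the year
print(semi_monthly.head(4))  # the first bins: Jan 1, Jan 15, Feb 1, Feb 15
```

Every year then has the same number of points and the bins always fall on the same calendar dates, at the cost of bins with slightly different lengths.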
k
It might be better to capture the yearly seasonalities through calendar variables and the `futr_exog` capabilities.
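As an illustration of such calendar variables, here is a minimal sketch that builds them for the 52 future weeks using only the standard library. The function name `calendar_features`, the column names and the dates are assumptions. Encoding the position in the year from the day of the year, not the week number, is one possible way to sidestep the 52 versus 53 week issue. In NeuralForecast, columns like these would typically be declared in a model's `futr_exog_list` and supplied for the horizon through `futr_df`.

```python
import calendar
import math
from datetime import date, timedelta


def calendar_features(ds: date) -> dict:
    """Calendar covariates that are known in advance for any date."""
    days_in_year = 366 if calendar.isleap(ds.year) else 365
    # Position within the year, mapped onto a circle so that late December
    # and early January end up close together.
    angle = 2 * math.pi * (ds.timetuple().tm_yday - 1) / days_in_year
    return {
        "ds": ds,
        "month": ds.month,
        "iso_week": ds.isocalendar()[1],
        "year_sin": math.sin(angle),
        "year_cos": math.cos(angle),
    }


# Hypothetical last observed week, followed by a 52-week horizon.
last_observed = date(2023, 12, 25)
future_rows = [calendar_features(last_observed + timedelta(weeks=h))
               for h in range(1, 53)]

print(future_rows[0])
```

The same function can be applied to the historical dates so that training and forecast rows share identical columns.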