# neural-forecast
I've been testing neuralforecast for a few months now, and I think it's a great library. The thing that bothers me most is that the algorithms are not very "robust" (at least on my data): I spent literally days searching for a good set of hyperparameters for the various algorithms, but after retraining the model on one week of new data with the same hyperparameters, the predictions became very poor. Moreover, just changing the random seed (even keeping the same data) is enough to go from very good predictions to very poor ones. In practice, I'm forced to search for a new set of hyperparameters (including the random seed) every time I add new data.

In general, with ML algorithms for tabular datasets (not time-series forecasting), I've always set a seed to make results reproducible, but I've never found the choice of seed to impact the final predictions this much: although the networks started from different initializations, training still led to models that generated similar predictions. In contrast, with the time-series forecasting algorithms implemented in neuralforecast, there seems to be a kind of local-minima nightmare where a different random seed or a small change in the training data can lead to very different predictions for the same hyperparameters. Even a few more or fewer training steps can make a huge difference... it all seems very unstable and erratic.

Do you have any suggestions on this (I'm currently using TFT), or do you think it is a difficult problem to solve? Thanks
Hey Manuel, if you set the `random_seed` AND use the pl.Trainer option `deterministic` (set to `True` on CPU, and `"warn"` on GPU), you will be able to get deterministic predictions! 😉
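(The `random_seed` and `deterministic` options above are neuralforecast/PyTorch Lightning settings; as a library-agnostic sketch, the determinism they buy you is just this: the same seed reproduces the same run, while a different seed gives a different one.)

```python
import random

def train_run(seed):
    # Stand-in for one training run: the seeded RNG drives
    # weight initialization, batch shuffling, dropout, etc.
    rng = random.Random(seed)
    return [round(rng.gauss(0, 1), 6) for _ in range(5)]

a = train_run(42)
b = train_run(42)  # same seed -> bit-identical "predictions"
c = train_run(7)   # different seed -> different "predictions"
assert a == b
assert a != c
```

This guarantees reproducibility of a single configuration, but (as Manuel notes below) not stability of the predictions across seeds or across small data changes.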
@Antoine SCHWARTZ -CROIX- Yes, but the problem is not that the runs aren't deterministic. The problem is that slightly changing the training data, the random seed, or the number of steps can lead to very different predictions.
Hey @Manuel, some ideas that can help robustify the training procedure:
- Early stopping is one of the strongest regularization techniques for regression problems. Are you using it?
- Pretraining on a large dataset and fine-tuning on your data.
- Exploring the complexity of the model: if you have limited data, you might want to make the model smaller.
- Using HuberLoss/HuberMQLoss, which helps the model's convergence and clips the gradients.
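(The core of the early-stopping suggestion above is just patience logic over the validation loss; this is a generic sketch of that logic, not neuralforecast's own implementation.)

```python
def early_stop_step(val_losses, patience=3):
    """Return the index at which training stops: after `patience`
    consecutive evaluations without a new best validation loss."""
    best = float("inf")
    bad_steps = 0
    for step, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            bad_steps = 0
        else:
            bad_steps += 1
            if bad_steps >= patience:
                return step  # stop training here
    return len(val_losses) - 1  # never triggered

# Validation loss bottoms out at step 2, then creeps up:
# training halts 3 evaluations later, before overfitting worsens.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.74]
assert early_stop_step(losses, patience=3) == 5
```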
@Kin Gtz. Olivares Thanks! I'm already using HuberLoss, and I'm using a custom early-stopping criterion where the validation set is made of some cherry-picked AutoARIMA forecasts for the most important time series. The reason is that many of the time series are quite short, and I can't sacrifice a full horizon of timesteps for a validation set, otherwise the remaining data is not enough to produce meaningful forecasts (I'm using
to incrementally fit the model while checking the custom early-stopping criterion).
The two remaining ideas for improving stability:
- Pre-training on a larger dataset may help, as long as those series are similar to your problem.
- Using very simple predictions (SeasonalNaive/Naive) through
that anchor the NeuralForecast model to learn residuals. Learning residual predictions is a much simpler problem.
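(A NumPy sketch of the residual-anchoring idea on synthetic data: the seasonal-naive baseline here is hand-rolled for illustration, not the statsforecast implementation, but it shows why the residual target is much easier to learn than the raw series.)

```python
import numpy as np

def seasonal_naive(y, season_length, horizon):
    """Forecast by repeating the last fully observed season."""
    last_season = y[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_season, reps)[:horizon]

# Synthetic monthly-style series: strong yearly seasonality + small noise.
rng = np.random.default_rng(0)
t = np.arange(120)
y = 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 0.5, size=t.size)

baseline = seasonal_naive(y[:-12], season_length=12, horizon=12)
residual_target = y[-12:] - baseline  # what the neural model would learn

# The residual target is far "flatter" than the raw series,
# so small data/seed perturbations move the final forecast less.
assert np.std(residual_target) < np.std(y[-12:])
```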
Hi @Manuel. Kin already suggested a lot of great tips. Bad local minima and "chaotic" behavior when changing hyperparameters or the data are among the main drawbacks of deep-learning methods. In our experience, though, some methods are quite robust to these changes. For example, in the main table of the NHITS paper, the std of the performance across different runs (with the Auto models that change hyperparameters) is shown in parentheses. As you can see, the performance of many models, such as NHITS, is very stable (on average). Another common practice for more stable predictions is ensembling multiple models, for example initializing each with a different random seed.
But in these cases we have a rather long validation set, so the early-stopping regularization and the validation signal are very stable, leading to more stable results and forecasts.
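(The seed-ensembling tip can be sketched with NumPy: averaging forecasts from several independently seeded "models" cancels much of the seed-dependent error, roughly by a factor of the square root of the ensemble size. The `noisy_forecast` function is a hypothetical stand-in for one trained model, not a neuralforecast API.)

```python
import numpy as np

truth = np.sin(np.linspace(0, 4 * np.pi, 24))  # the series we want to predict

def noisy_forecast(seed):
    """Stand-in for one trained model: truth plus seed-dependent error."""
    rng = np.random.default_rng(seed)
    return truth + rng.normal(0, 1.0, size=truth.size)

single = noisy_forecast(0)
ensemble = np.mean([noisy_forecast(s) for s in range(10)], axis=0)

def mae(forecast):
    return float(np.mean(np.abs(forecast - truth)))

# Averaging 10 independently seeded members shrinks the error markedly.
assert mae(ensemble) < mae(single)
```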
@Cristian (Nixtla) Thanks! Unfortunately, NHITS does not perform well on my specific dataset. The one that seems to perform best is TFT, but it is not very "stable" (with NBEATSx I also found little "stability"). With TFT, hyperparameter tuning is also problematic because training is very slow (even on 1 GPU), and optimizing the hyperparameters takes me days each time.
> Early stopping is one of the strongest regularization techniques for regression problems. Are you using it? Pretraining on a large dataset and fine-tuning on your data.

@Kin Gtz. Olivares Could you please tell me how to use early stopping? I tried changing the parameter and setting it to 2, for example, but I don't know if that is correct. Also, about pretraining: do you mean I should train my model on another dataset and transfer the learning to my data?