Hi everyone, I'm hoping to pick your brains on this...
# neural-forecast
p
Hi everyone, I'm hoping to pick your brains on a business problem I am trying to solve, and on the best modeling approach given the tools you have all developed. I have a hierarchical system with roughly 130 daily time series. Although these time series follow a hierarchical structure, we are not interested in their mean forecast; we are more interested in their 95th quantile forecast. To add to the difficulty, we only have about 2.5 years (~914 days) of historical data, yet we want to forecast 18 months into the future: horizon = 18 * 30 = 540 days.
1. Currently, I am trying to use an NHITS base network, where the validation loss is `sCRPS` with quantiles = [0.95], with a HINT class. My hope is that reconciling the distributions across the hierarchy might result in better estimates of the 95th quantiles.
2. The 95th quantiles themselves can't be reconciled, i.e. the sum of the 95th quantiles of the bottom nodes is not equal to the 95th quantile of the parent node. Am I better off getting rid of the HINT layer altogether?
3. Can anyone provide a heuristic guide to tuning NHITS? I want to get a feel for the hyperparameter space before I use Ray to fine-tune a model.
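To make the second point concrete, here is a minimal sketch on synthetic data. The two child series, their distributions, and the sample size are assumptions chosen purely for illustration; they are not taken from the real hierarchy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical, independent bottom-level series (synthetic samples).
a = rng.normal(loc=100.0, scale=10.0, size=100_000)
b = rng.normal(loc=50.0, scale=20.0, size=100_000)
parent = a + b  # the parent node is the sum of its children

# Sum of the children's 95th quantiles vs. the 95th quantile of the parent.
q95_children_sum = np.quantile(a, 0.95) + np.quantile(b, 0.95)
q95_parent = np.quantile(parent, 0.95)

print(f"sum of child p95s: {q95_children_sum:.1f}")
print(f"p95 of parent:     {q95_parent:.1f}")
```

For independent children like these, the sum of the child quantiles overshoots the parent quantile, because both children rarely peak at the same time.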
k
Hey @Phil, thanks for using the HINT model.

Regarding quantiles and HINT coherence:
• HINT model samples are coherent by construction.
• The 95th quantiles of the marginal distributions (each individual series on its own) are not guaranteed to satisfy coherence.
• Quantiles are usually univariate by definition. I suspect one could try to build elliptical sets based on covariances to achieve 95% coverage, but that is speculative. Your question/requirement is really interesting.
• An interesting idea would be to condition on the total aggregated series hitting its 95th quantile and look at the samples that achieved that. You would of course need to somehow reduce the variance of those samples.

Regarding the limited data that you have:
• I suggest starting simple; predicting 18 months ahead with such limited data might be challenging. Without enough data, NHITS might only capture a Naive1 prediction and, if you are lucky, a simple trend in your data.
• You might want to check whether there are larger datasets similar in nature to yours and see if you can pre-train the models on them.

On the hyperparameters of HINT/NHITS:
• My recommendation is to always start with random seeds and learning rates (though you may still have a very limited validation signal).
• After that, you can explore the complexity of the model through the depth of the layers and the number of hidden units.
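The coherence and conditioning ideas above can be sketched with plain NumPy. This is not HINT itself: the two-series hierarchy, the summing matrix `S`, the gamma-distributed samples, and the 1% tolerance band are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy hierarchy (hypothetical): total = A + B.
# Rows of S: total, A, B. Columns: bottom series A, B.
S = np.array([[1, 1],
              [1, 0],
              [0, 1]], dtype=float)

# Synthetic bottom-level forecast samples, shape (n_samples, n_bottom).
bottom = rng.gamma(shape=[4.0, 2.0], scale=[10.0, 15.0], size=(50_000, 2))

# Aggregating each sample with S makes every sample coherent by construction.
samples = bottom @ S.T  # shape (n_samples, 3): total, A, B
coherent = np.allclose(samples[:, 0], samples[:, 1] + samples[:, 2])

# Condition on the total landing near its 95th quantile, then inspect
# what the bottom series looked like in those samples.
q_total = np.quantile(samples[:, 0], 0.95)
tol = 0.01 * q_total  # hypothetical band width
near_p95 = np.abs(samples[:, 0] - q_total) <= tol
conditional_mean = samples[near_p95].mean(axis=0)

print("coherent samples:", coherent)
print("samples in band: ", int(near_p95.sum()))
print("conditional mean (total, A, B):", conditional_mean.round(1))
```

Averaging over the band is one simple way to reduce the variance of the selected samples, and the resulting breakdown still adds up to the total because the mean is linear.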
p
Thank you for your response, Kin! I'm definitely observing the "Naive1" prediction. If forecasting the raw signals and reconciling them does not perform well, I might preprocess all the time series in the hierarchy by taking a rolling p95 quantile and forecasting these metrics directly. I won't be able to use HINT in that case, but the signals should be smoother and the model would not have to learn seasonality patterns as much.
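A rolling p95 of this kind could be computed with pandas roughly as follows. The synthetic series and the 28-day trailing window are assumptions; the right window length depends on the data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Synthetic stand-in for one daily series with 914 observations.
dates = pd.date_range("2021-01-01", periods=914, freq="D")
y = pd.Series(rng.gamma(2.0, 10.0, size=914), index=dates)

window = 28  # hypothetical trailing window, in days

# Trailing rolling 95th quantile; the first window - 1 values are NaN and dropped.
rolling_p95 = y.rolling(window=window).quantile(0.95).dropna()

print(rolling_p95.head())
```

Note that the transformed series is shorter than the original by `window - 1` points, which matters when the history is already short.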
k
There is an interesting trend; you might want to try NBEATSx with a linear trend stack.
Predictions will be mostly the trend and the Naive1 level.
But for this amount of data, I think that is a reasonable expectation.
Here is an example of NBEATSx; you can set `n_polynomials=1`.
Another recommendation I can make is to use the HuberMQLoss, so that the spikes in your data won't affect the predictions so much: https://nixtla.github.io/neuralforecast/losses.pytorch.html#huberized-mqloss
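For intuition about what a huberized quantile loss does, here is a minimal NumPy sketch. It is not the neuralforecast implementation: the function name, the default `delta`, and the exact weighting are assumptions that follow one common formulation, in which the absolute error of the pinball loss is replaced by a Huber term.

```python
import numpy as np

def huberized_quantile_loss(y, y_hat, q, delta=1.0):
    """Illustrative huberized pinball loss (assumed form, not the library's code)."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    err = np.abs(y - y_hat)
    # Huber term: quadratic for small errors, linear beyond delta.
    huber = np.where(err <= delta, 0.5 * err**2, delta * (err - 0.5 * delta))
    # Quantile weighting: under-prediction costs q, over-prediction costs 1 - q.
    weight = np.where(y >= y_hat, q, 1.0 - q)
    return float(np.mean(weight * huber))

small_miss = huberized_quantile_loss([10.0], [9.5], q=0.95)
under = huberized_quantile_loss([10.0], [0.0], q=0.95)
over = huberized_quantile_loss([10.0], [20.0], q=0.95)
print(small_miss, under, over)
```

With q = 0.95, under-predicting a spike is penalized far more than over-predicting by the same amount, and the penalty grows only linearly once the error exceeds `delta`.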
n
Hi, for this business problem, could it be a good idea to do temporal hierarchical forecasting: group the data by day, week, and month, forecast each group, and reconcile? I'm not sure if it can help.
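The aggregation step of such a temporal hierarchy might look like the sketch below. It only builds the daily, weekly, and monthly levels on synthetic data; forecasting and reconciliation are not shown, and the choice of sums (rather than, say, maxima) is an assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)

# Synthetic stand-in for one daily series with 914 observations.
dates = pd.date_range("2021-01-01", periods=914, freq="D")
daily = pd.Series(rng.gamma(2.0, 10.0, size=914), index=dates)

# Temporal aggregates of the same series.
weekly = daily.resample("W").sum()    # weeks ending on Sunday
monthly = daily.resample("MS").sum()  # calendar months, labeled by month start

print(len(daily), len(weekly), len(monthly))
```

Calendar weeks straddle month boundaries, so the weekly and monthly levels each aggregate the daily level but do not nest inside one another.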
k
I always recommend starting with simple baselines and building on top of them. Otherwise, it is difficult to tell whether more complicated ideas are helping the forecasting task.
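As one possible starting point, two very simple baselines for a 95th quantile forecast can be scored with the pinball loss. The synthetic series, the 180-day holdout, and the choice of baselines are assumptions made for illustration.

```python
import numpy as np

def pinball_loss(y, y_hat, q):
    """Quantile (pinball) loss averaged over observations."""
    diff = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

rng = np.random.default_rng(3)

# Synthetic stand-in for one daily series; hold out the last 180 days.
history = rng.gamma(2.0, 10.0, size=914)
train, holdout = history[:-180], history[-180:]

q = 0.95
# Baseline 1: Naive1, repeat the last observed value over the horizon.
naive1 = np.full(holdout.shape, train[-1])
# Baseline 2: constant forecast at the empirical 95th quantile of the history.
empirical_q = np.full(holdout.shape, np.quantile(train, q))

scores = {
    "naive1": pinball_loss(holdout, naive1, q),
    "empirical_p95": pinball_loss(holdout, empirical_q, q),
}
print(scores)
```

Any more complex model can then be judged by whether it beats these scores on the same holdout.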
p
Thank you for the tips. I'll try NBEATSx next