# neural-forecast
**Eric Braun:** Greetings! Two questions about the excellent neuralforecast package. 1. Embeddings for categorical variables appear to have been deprecated. Is this temporary? If not, why was this choice made? 2. Are there plans to introduce probabilistic losses for count targets (e.g., negative binomial or Tweedie distributions for count-valued time series)?
**Kin Gtz. Olivares:** Hi @Eric Braun,
1. I am currently working on the latter: adding probabilistic outputs to all the neuralforecast models, in this branch. I might have something working by the end of this week; I will let you know. If you have links to PyTorch implementations of Tweedie and zero-inflated losses, we would appreciate it.
2. Regarding the categorical embeddings: it depends on the model, some have them and others don't. We tried to keep the baseline models (LSTM, RNN, MLP, TCN, ...) as parsimonious as possible. I am considering returning categorical encoders to NBEATSx and NHITS.
3. If you need a model with categorical embeddings, take a look at our Temporal Fusion Transformer (TFT).
**Eric Braun:** @Kin Gtz. Olivares Awesome! Re: Tweedie, the third-place solution to the M5 competition used DeepAR with a custom Tweedie loss: https://github.com/devmofl/M5_Accuracy_3rd
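For context, a minimal PyTorch sketch of the Tweedie negative log-likelihood that this kind of solution builds on, dropping terms constant in the prediction; the power parameter `p` is assumed fixed in (1, 2), and the function name is illustrative:

```python
import torch

def tweedie_nll(y, mu, p=1.5, eps=1e-8):
    """Tweedie negative log-likelihood up to a term constant in mu,
    for the compound Poisson-gamma case 1 < p < 2.
    y: nonnegative targets, mu: positive predicted means."""
    mu = mu.clamp_min(eps)                       # keep the mean strictly positive
    a = y * torch.pow(mu, 1.0 - p) / (1.0 - p)   # data-fit term
    b = torch.pow(mu, 2.0 - p) / (2.0 - p)       # normalization term
    return torch.mean(-a + b)
```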
**Kin Gtz. Olivares:** Cool, thanks for the pointer @Eric Braun.
**Kin Gtz. Olivares:** Hey @Eric Braun, if you want to check, we added to main a first iteration of `DistributionLoss`, here:
• https://github.com/Nixtla/neuralforecast/commit/f6b7102299970c6a2810692557096b4a6975f1fa
• https://github.com/Nixtla/neuralforecast/blob/main/nbs/losses.pytorch.ipynb
We are still polishing the interaction between the data's scale and the DistributionLoss scale parameters. For count data, `DistributionLoss(distribution='Poisson')` in combination with ReLU encoder networks (`MLP`, `NBEATS`, `NHITS`, `TCN`) was already giving me reasonable results.
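As a rough usage sketch of that combination (the signature of this first iteration may have shifted since; the synthetic data and hyperparameters are just placeholders):

```python
import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS
from neuralforecast.losses.pytorch import DistributionLoss

# Synthetic count series in Nixtla's long format (unique_id, ds, y).
rng = np.random.default_rng(0)
ds = pd.date_range('2020-01-01', periods=200, freq='D')
df = pd.DataFrame({'unique_id': 'series_1', 'ds': ds,
                   'y': rng.poisson(lam=5.0, size=len(ds))})

# NHITS (a ReLU encoder network) with a Poisson likelihood;
# level= asks for 80% and 90% prediction intervals.
model = NHITS(h=14, input_size=28,
              loss=DistributionLoss(distribution='Poisson', level=[80, 90]),
              max_steps=200)
nf = NeuralForecast(models=[model], freq='D')
nf.fit(df=df)
forecasts = nf.predict()  # point forecast plus interval columns
```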
**Eric Braun:** @Kin Gtz. Olivares Thanks! I'm looking forward to digging in.
**Kin Gtz. Olivares:** We are still working out some details around the parameters as outputs of the network; it might take some weeks. Early results with the Normal and StudentT distributions seem promising.
**Eric Braun:** Understood. Does that mean you've seen an issue with the Poisson distribution?
**Kin Gtz. Olivares:** I am exploring an optimization technique that decouples scale/location from the parameter estimation. The parameters need to be restricted to reasonable domains (Poisson rates need to be positive, for example). If you combine a scaler with the DistributionLoss, the normalization and the domain constraints interact poorly.
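To make the interaction concrete, here is a minimal sketch of the decoupling idea (the helper name and exact transform are illustrative, not the library's API): the network emits an unconstrained value, a softplus maps it into the Poisson rate's positive domain, and the per-series scale is applied afterwards, outside the constraint.

```python
import torch
import torch.nn.functional as F

def poisson_rate(raw, scale, eps=1e-8):
    """Map an unconstrained network output to a valid Poisson rate.
    Softplus enforces positivity; multiplying by the per-series scale
    afterwards keeps normalization out of the constrained domain."""
    return F.softplus(raw) * scale + eps

# The failure mode: standardizing the *targets* instead, e.g.
# y_norm = (y - mean) / std, can produce negative values, which lie
# outside the Poisson support and break the likelihood.
raw = torch.randn(8)                      # unconstrained network outputs
scale = torch.full((8,), 5.0)             # positive per-series scale
rate = poisson_rate(raw, scale)
y = torch.tensor([3., 1., 0., 4., 2., 6., 5., 3.])
nll = -torch.distributions.Poisson(rate).log_prob(y).mean()
```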
**Eric Braun:** I'll be very interested to see how that work pans out! Are you aware of how other major DL forecasting libraries, like GluonTS, deal with the issue?
**Kin Gtz. Olivares:** GluonTS also has a decoupled optimization strategy that helps its networks estimate the distribution parameters. We are taking some inspiration from it, but I intend to improve some details.
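The pattern GluonTS uses, roughly: the backbone stays unconstrained, and a small projection plus a per-parameter "domain map" owns the job of producing valid distribution parameters. A hand-rolled PyTorch sketch of that pattern (illustrative, not GluonTS's actual classes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoissonOutput(nn.Module):
    """Project a backbone's hidden state to a valid Poisson distribution.
    The domain map is the only place that knows about the rate's
    positivity constraint, keeping the backbone unconstrained."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)  # one raw output per parameter

    def domain_map(self, raw_rate):
        # Constrain the raw projection to the rate's domain (> 0).
        return F.softplus(raw_rate).squeeze(-1)

    def forward(self, hidden):
        rate = self.domain_map(self.proj(hidden))
        return torch.distributions.Poisson(rate)

# Usage on top of any backbone's last hidden state.
hidden = torch.randn(4, 32)
distr = PoissonOutput(32)(hidden)
loss = -distr.log_prob(torch.tensor([2., 0., 5., 1.])).mean()
```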