# neural-forecast
c
Hi, I have a question: is it possible to use implicit quantile networks (IQN) in neuralforecast, or to implement them myself without too much hassle? The idea is that I sample tau, the quantile level, from a (custom) distribution, multiply it with a cosine function, feed it into a linear embedding followed by an activation function, and then concatenate the result to my input sequences. The loss function is a quantile or expectile loss with tau as the quantile level.

Why do I want this? It's much more parameter efficient. Assume I have an LSTM model that forecasts a week of hourly values ahead, i.e. 168 values, and I want 10 to 20 quantiles. With a normal MQ loss this blows up the number of network parameters; with IQN it doesn't. Then at runtime I want to conformalize each of these quantiles using a separate calibration set that I update from the test set. If this is possible with Nixtla I can use your incredibly optimized codebase and don't even need a GPU. Thanks in advance.
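For context, a minimal PyTorch sketch of the IQN-style conditioning described above (not neuralforecast's internal implementation); the layer sizes, the ReLU activation, and the pinball-loss helper are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TauCosineEmbedding(nn.Module):
    """Embed a sampled quantile level tau via cosine features, IQN-style."""

    def __init__(self, n_cos: int = 64, out_dim: int = 32):
        super().__init__()
        # fixed integer frequencies for the cosine basis
        self.register_buffer("freqs", torch.arange(1, n_cos + 1).float())
        self.proj = nn.Linear(n_cos, out_dim)
        self.act = nn.ReLU()

    def forward(self, tau: torch.Tensor) -> torch.Tensor:
        # tau: (batch, 1) quantile levels sampled from a (custom) distribution
        cos_features = torch.cos(torch.pi * tau * self.freqs)  # (batch, n_cos)
        return self.act(self.proj(cos_features))               # (batch, out_dim)

def quantile_loss(y, y_hat, tau):
    """Pinball loss with the sampled tau as the quantile level."""
    err = y - y_hat
    return torch.maximum(tau * err, (tau - 1) * err).mean()

batch, seq_len, n_features = 8, 336, 4
x = torch.randn(batch, seq_len, n_features)
tau = torch.rand(batch, 1)                          # or sample from a custom distribution

emb = TauCosineEmbedding()(tau)                     # (batch, 32)
emb_seq = emb.unsqueeze(1).expand(-1, seq_len, -1)  # broadcast along the sequence
x_conditioned = torch.cat([x, emb_seq], dim=-1)     # feed this into the LSTM
```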
o
It's already there: use `IQLoss` (PyTorch Losses - Nixtla) as the loss function during training.
Note that I'd personally use HuberMQLoss instead because it gives the best performance, but ymmv
With IQLoss, after training you can sample any quantile(s) by simply passing them to predict, e.g.
`nf.predict(...., quantiles=[0.1, 0.2])`
or use the level argument to get any prediction-interval level:
`nf.predict(...., level=[80])`
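A minimal end-to-end sketch of that workflow, assuming a recent neuralforecast version where predict accepts quantiles/level for IQLoss-trained models; the synthetic hourly series and settings such as max_steps are illustrative:

```python
import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.losses.pytorch import IQLoss
from neuralforecast.models import LSTM

# toy long-format hourly series with the columns neuralforecast expects: unique_id, ds, y
ds = pd.date_range("2023-01-01", periods=2000, freq="H")
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": ds,
    "y": np.sin(np.arange(2000) * 2 * np.pi / 24) + np.random.normal(0, 0.1, 2000),
})

model = LSTM(
    h=168,           # one week of hourly values, as in the question
    input_size=336,  # two weeks of history
    loss=IQLoss(),   # quantile level is sampled implicitly during training
    max_steps=200,
)

nf = NeuralForecast(models=[model], freq="H")
nf.fit(df=df)

# after training, arbitrary quantiles or interval levels can be requested at predict time
fcst_quantiles = nf.predict(quantiles=[0.1, 0.2, 0.5, 0.8, 0.9])
fcst_level = nf.predict(level=[80])
```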
I don't think it works jointly with conformal though. It's kind of weird to do it like that: if you want to do conformal there's no point in even using `IQLoss`; just train with `HuberLoss` and conformalize afterwards.
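For reference, a library-agnostic sketch of the "conformalize afterwards" step, i.e. a split-conformal shift of a single quantile forecast using held-out calibration errors; the placeholder arrays and the finite-sample correction are illustrative, and this is not neuralforecast's built-in conformal API:

```python
import numpy as np

def conformalize_quantile(cal_y, cal_q_hat, test_q_hat, tau):
    """Shift a tau-quantile forecast so it reaches ~tau coverage on the calibration set."""
    scores = cal_y - cal_q_hat                     # positive where the raw quantile under-covers
    n = len(scores)
    # finite-sample corrected empirical quantile of the calibration scores
    q_level = min(1.0, np.ceil((n + 1) * tau) / n)
    correction = np.quantile(scores, q_level, method="higher")
    return test_q_hat + correction

# usage with placeholder arrays: cal_* come from a calibration split, test_q_hat from the test window
rng = np.random.default_rng(0)
cal_y = rng.normal(size=500)
cal_q_hat = np.full(500, 0.8)    # a deliberately miscalibrated raw 0.9-quantile forecast
test_q_hat = np.zeros(168)
adjusted = conformalize_quantile(cal_y, cal_q_hat, test_q_hat, tau=0.9)
```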
c
Yeah, but HuberLoss may have more parameters depending on the model architecture. With a few adaptations you could also 'huberize' the IQ loss, which may be interesting.
o
HuberLoss() typically has only one parameter (or q parameters for q quantiles). The issue with IQLoss is that the quantiles tend to come out too narrow, but again ymmv; try out different settings for the Beta distribution. And yes, we can huberize IQLoss 🙂
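To illustrate the Beta-distribution point, a small sketch comparing a uniform tau sampler with a tail-heavy one; that IQLoss samples tau from a Beta distribution comes from this thread, but the library's exact parameter names are not shown here, so this is just the underlying idea:

```python
import torch
from torch.distributions import Beta

def sample_tau(alpha: float, beta: float, n: int = 10_000) -> torch.Tensor:
    """Sample candidate quantile levels tau in (0, 1) from Beta(alpha, beta)."""
    return Beta(alpha, beta).sample((n,))

uniform_tau = sample_tau(1.0, 1.0)      # Beta(1, 1) is Uniform(0, 1)
tail_heavy_tau = sample_tau(0.5, 0.5)   # U-shaped: puts more mass on extreme quantiles

def share_in_tails(tau: torch.Tensor) -> float:
    return ((tau < 0.1) | (tau > 0.9)).float().mean().item()

print(f"tau in tails, uniform:      {share_in_tails(uniform_tau):.2f}")    # ~0.20
print(f"tau in tails, Beta(.5,.5):  {share_in_tails(tail_heavy_tau):.2f}") # ~0.41
```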
c
Maybe an interesting idea: I saw the same thing in my own (non-Nixtla) experiments with IQL, namely that the PIs became a bit too small. Instead of sampling the random quantile from a uniform distribution, I sampled from a mixture of two Gaussians with means near 0 and 1, rejecting anything outside that range. This worked well. Another option I was thinking about would be to take a uniformly sampled value and transform it with a very small adversarially trained network; that value could then be used for the original IQN embedding and loss.
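A small sketch of that mixture-based sampler (the means and standard deviation are illustrative choices, not values from this discussion): draw tau from two Gaussians centered near 0 and 1 and reject anything outside (0, 1), so training sees more extreme quantile levels than a uniform sampler would provide.

```python
import torch

def sample_tau_gaussian_mixture(n, means=(0.05, 0.95), std=0.15):
    """Rejection-sample n quantile levels tau in (0, 1) from a two-component Gaussian mixture."""
    taus = []
    while sum(t.numel() for t in taus) < n:
        # pick a mixture component for each candidate, then draw from that Gaussian
        comp = torch.randint(0, 2, (n,))
        mu = torch.tensor(means)[comp]
        candidate = torch.normal(mu, std)
        # reject samples outside the open unit interval
        candidate = candidate[(candidate > 0) & (candidate < 1)]
        taus.append(candidate)
    return torch.cat(taus)[:n]

tau = sample_tau_gaussian_mixture(10_000)
print(tau.min().item(), tau.max().item(), tau.mean().item())
```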