# neural-forecast
u
Hello. I have a question about the iTransformer in neuralforecast. When I use the transformer with the same data, I always encounter the following error at the same point. Upon checking, it turns out that the values in a 32×90×32 tensor are all NaN. What could be the possible reasons for this? In particular, I'd like to know what in _base_multivariate.py could cause it.
o
It probably means your loss is NaN. From the posted code, I'd guess that the most likely culprit is using 'identity' as scaler_type. I'd suggest using 'robust' or 'standard' and trying again.
But I could help more if you post a piece of code that I can run that reproduces the error.
Secondly, it seems you're using a custom loss function? This loss function, in conjunction with the scaler_type, is highly likely to be the culprit here. Try a different scaler and loss function (e.g. 'robust' and 'MAE') -> probably no errors.
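To make the scaler suggestion concrete, here is a minimal pure-Python sketch of median/MAD scaling. It is only an illustration of the general idea behind a "robust" scaler and is not taken from neuralforecast's code; the helper name `robust_scale`, the `eps` guard, and the sample values are all assumptions made for this example. The thread does not say why 'identity' leads to NaN; one plausible reason is that unscaled, large-magnitude inputs make training numerically unstable.

```python
import statistics

def robust_scale(values, eps=1e-8):
    """Center by the median and divide by the median absolute deviation (MAD).

    Hypothetical helper for illustration; not neuralforecast's implementation.
    """
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    # eps avoids division by zero when the series is constant.
    return [(v - median) / (mad + eps) for v in values]

# Made-up series with a large offset, as 'identity' would pass it to the network.
raw = [1_000_000.0, 1_000_500.0, 999_500.0, 1_002_000.0, 998_000.0]
scaled = robust_scale(raw)
print(scaled)  # values of order 1 instead of order 1e6
```

With 'identity' the network sees `raw` as-is; with a robust-style scaler it sees values like `scaled`, which are centered near zero and of moderate size.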
u
Yes, my loss is NaN! OK, I'll try your suggestion. Thanks!!
We created a custom loss function for business reasons. Instead of forecasting daily values, we aim to predict the total score over 30-day, 45-day, and 60-day periods. From the perspective of the neuralforecast developers, do you think it's feasible to configure the loss function in this way?
o
First, the loss function should inherit from losses.pytorch.BasePointLoss. See also how MAE or RMSE has been defined in losses.pytorch. Second, this loss function doesn't really make sense imho. As you sum all values, there's no real 'incentive' for the network to perform well on individual timestamps. If you want to weight timestamps, I'd use horizon_weight parameter in RMSE, which makes more sense, and then set the weights comparable to your alpha/betta/gamma parameters.
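The point about summed losses can be seen in a small pure-Python sketch. It is not the custom loss from the thread (which is not shown) and not neuralforecast's RMSE; `window_sum_error`, `weighted_rmse`, and the toy numbers are hypothetical, and the weight normalization used here is an assumption that may differ from how horizon_weight is applied in the library.

```python
import math

def window_sum_error(y, y_hat, window):
    """Absolute error between the totals of the first `window` steps."""
    return abs(sum(y[:window]) - sum(y_hat[:window]))

def weighted_rmse(y, y_hat, weights):
    """RMSE where each horizon step contributes according to its weight."""
    total = sum(weights)
    mse = sum(w * (a - b) ** 2 for w, a, b in zip(weights, y, y_hat)) / total
    return math.sqrt(mse)

y = [10.0, 10.0, 10.0, 10.0]
y_hat = [20.0, 0.0, 20.0, 0.0]  # wrong at every step, but the total matches
weights = [1.0, 1.0, 1.0, 1.0]  # hypothetical per-step weights

sum_loss = window_sum_error(y, y_hat, window=4)
rmse_loss = weighted_rmse(y, y_hat, weights)
print(sum_loss)   # 0.0
print(rmse_loss)  # 10.0
```

The summed loss is zero because the per-step errors cancel, so it gives the network no signal to fix individual timestamps, while the per-step weighted RMSE still penalizes them.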
u
Thank you! I'll make the adjustments as per your advice and give it a try. I have one more question. The tensor sizes are as follows:
• insample_y: torch.Size([32, 16, 32])
• outsample_y: torch.Size([32, 90, 32])
• output: torch.Size([32, 90, 32])
In the output tensor, the dimensions are 32 × 90 × 32. I understand that:
• 32 = batch size
• 90 = predict length (h)
• 32 = What is this?
o
That's the number of series: you gave n_series=32 as input.
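Reading the shape as [batch size, h, n_series], a quick way to see which series are affected by NaNs is to count them along the last axis. The sketch below uses plain nested lists instead of torch tensors so it runs without extra dependencies; `label_dims`, `nan_fraction_per_series`, and the toy data are hypothetical names and values made up for this example.

```python
import math

def label_dims(shape):
    """Attach the names discussed in the thread to a 3-D shape tuple."""
    batch_size, horizon, n_series = shape
    return {"batch_size": batch_size, "h": horizon, "n_series": n_series}

def nan_fraction_per_series(tensor):
    """Fraction of NaN entries per series in a [batch][h][n_series] nested list."""
    n_series = len(tensor[0][0])
    counts = [0] * n_series
    total = len(tensor) * len(tensor[0])
    for window in tensor:
        for step in window:
            for s, value in enumerate(step):
                if math.isnan(value):
                    counts[s] += 1
    return [c / total for c in counts]

dims = label_dims((32, 90, 32))
# Tiny stand-in tensor: 2 windows, 3 steps, 2 series; series 1 is all NaN.
toy = [[[1.0, float("nan")] for _ in range(3)] for _ in range(2)]
fractions = nan_fraction_per_series(toy)
print(dims)       # {'batch_size': 32, 'h': 90, 'n_series': 32}
print(fractions)  # [0.0, 1.0]
```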
u
Got it! Thanks for everything, Olivier.