# neural-forecast
Antoine SCHWARTZ -CROIX-:
Hello guys, since the last release I frequently have crashes at the beginning of my trainings:

`RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!`

I feel like it's random; however, in my last 2 HP tuning runs it happened with `scaler_type` set to `identity` (random or root cause, I don't know). Am I the only one? Thanks
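For reference, a minimal sketch of the kind of setup being discussed (the data, horizon and hyperparameters below are placeholders rather than the actual tuning config; the device mismatch only shows up when training on a CUDA machine):

```python
# Hypothetical repro sketch: DeepAR trained with no input scaling (scaler_type="identity").
# All values here are placeholders, not the real dataset or hyperparameters.
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import DeepAR
from neuralforecast.losses.pytorch import DistributionLoss

# toy long-format frame with the unique_id / ds / y columns NeuralForecast expects
df = pd.DataFrame({
    "unique_id": ["series_1"] * 100,
    "ds": pd.date_range("2023-01-01", periods=100, freq="D"),
    "y": [float(i % 10) for i in range(100)],
})

model = DeepAR(
    h=7,
    input_size=28,
    loss=DistributionLoss(distribution="Normal", level=[80, 90]),
    scaler_type="identity",  # no input scaling: the setting under which the crash shows up
    max_steps=100,
)

nf = NeuralForecast(models=[model], freq="D")
nf.fit(df=df)
preds = nf.predict()
```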
It just happened again, this time at the `predict` step, still with the `identity` scaler (on DeepAR):
```
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/neuralforecast/models/deepar.py:487, in DeepAR.forward(self, windows_batch)
    484 output = self.loss.domain_map(output)
    486 # Inverse normalization
--> 487 distr_args = self.loss.scale_decouple(
    488     output=output, loc=y_loc, scale=y_scale
    489 )
    490 # Add horizon (1) dimension
    491 distr_args = list(distr_args)

File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/neuralforecast/losses/pytorch.py:771, in nbinomial_scale_decouple(output, loc, scale)
    769 alpha = F.softplus(alpha) + 1e-8  # alpha = 1/total_counts
    770 if (loc is not None) and (scale is not None):
--> 771     mu *= loc
    772     alpha /= loc + 1.0
    774 # mu = total_count * (probs/(1-probs))
    775 # => probs = mu / (total_count + mu)
    776 # => probs = mu / [total_count * (1 + mu * (1/total_count))]

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
```
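The error itself is just PyTorch refusing to mix tensors from different devices in a single operation; a tiny standalone illustration (not library code):

```python
# Minimal illustration of the underlying error: multiplying a CUDA tensor by a
# CPU tensor raises the same RuntimeError (requires a machine with a GPU).
import torch

if torch.cuda.is_available():
    mu = torch.ones(3, device="cuda")  # like the network output, living on the GPU
    loc = torch.ones(3)                # like the anchoring statistic, left on the CPU
    mu * loc                           # RuntimeError: Expected all tensors to be on the same device...
```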
OK, I think that's the same error you fixed yesterday @Cristian (Nixtla) for the MQLoss.
Cristian (Nixtla):
Hi @Antoine SCHWARTZ -CROIX-! Yes, it should be fixed now
Antoine SCHWARTZ -CROIX-:
I misspoke, it's the same type of error, but your fix from yesterday doesn't apply to `DistributionLoss`. The problem occurs in the `*_scale_decouple` functions of some distributions when no scaler is applied to the input data. I've managed to fix the negative binomial on the fly, but I doubt it's optimal:
```python
def nbinomial_scale_decouple(output, loc=None, scale=None):
    """Negative Binomial Scale Decouple

    Stabilizes model's output optimization, by learning total
    count and logits based on anchoring `loc`, `scale`.
    Also adds Negative Binomial domain protection to the distribution parameters.
    """
    mu, alpha = output
    mu = F.softplus(mu) + 1e-8
    alpha = F.softplus(alpha) + 1e-8  # alpha = 1/total_counts
    if (loc is not None) and (scale is not None):
        mu *= loc.to(mu.device)
        alpha /= loc.to(alpha.device) + 1.0

    # mu = total_count * (probs/(1-probs))
    # => probs = mu / (total_count + mu)
    # => probs = mu / [total_count * (1 + mu * (1/total_count))]
    total_count = 1.0 / alpha
    probs = (mu * alpha / (1.0 + mu * alpha)) + 1e-8
    return (total_count, probs)
```
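A more general variant of the same idea (just a sketch, untested, and not an official patch) would be to align `loc`/`scale` with the output device once, so the rest of the `*_scale_decouple` math can stay unchanged:

```python
# Sketch of a more general fix (illustrative only): move the anchoring statistics
# onto the device of the network outputs before any scale-decoupling arithmetic.
import torch.nn.functional as F


def _align_device(tensor, reference):
    """Return `tensor` on `reference`'s device (no-op if it is None or already there)."""
    if tensor is not None and tensor.device != reference.device:
        return tensor.to(reference.device)
    return tensor


def nbinomial_scale_decouple(output, loc=None, scale=None):
    """Negative Binomial scale decouple with device alignment."""
    mu, alpha = output
    loc = _align_device(loc, mu)
    scale = _align_device(scale, mu)
    mu = F.softplus(mu) + 1e-8
    alpha = F.softplus(alpha) + 1e-8  # alpha = 1/total_counts
    if (loc is not None) and (scale is not None):
        mu *= loc
        alpha /= loc + 1.0
    total_count = 1.0 / alpha
    probs = (mu * alpha / (1.0 + mu * alpha)) + 1e-8
    return (total_count, probs)
```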
Cristian (Nixtla):
@Kin Gtz. Olivares @Antoine SCHWARTZ -CROIX- scaling is (almost) crucial to get good performance with distribution losses. We explain this in our latest paper: https://arxiv.org/abs/2305.07089, and it is also suggested in other papers like DeepAR. What is your experience? Have you compared with/without scaling?
Antoine SCHWARTZ -CROIX-:
You're right, the results aren't very good without scaling for DistributionLoss, but I had left "identity" in the parameter space to be explored for tuning (as for NHITS), and that's when I came across the error, which I didn't understand. By the way, I think "identity" is the default value too. However, NegativeBinomial, which is often recommended for positive count data, doesn't work well when the input data is centered. On the pytorch-forecasting side, they block the option outright: https://pytorch-forecasting.readthedocs.io/en/stable/_modules/pytorch_forecasting/metrics/distributions.html#NegativeBinomialDistributionLoss. Unless I'm mistaken, there is no way to control this in Nixtla? So the only option left is to use the traditional minmax scaler (not minmax1) to hope for satisfying results with the negative binomial?
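One workaround on the tuning side (purely illustrative, the helper below is not part of neuralforecast, and the distribution name string is an assumption) is to drop centering or sign-producing scalers from the search space whenever the loss is a NegativeBinomial `DistributionLoss`:

```python
# Illustrative guard (an assumption, not a neuralforecast API): restrict the
# scaler_type candidates when tuning with a NegativeBinomial distribution loss,
# since scalers that center the data or produce negative values break the
# positive-count assumption.
CENTERING_OR_SIGNED_SCALERS = {"standard", "robust", "minmax1"}

def allowed_scalers(distribution: str, candidates: list[str]) -> list[str]:
    """Filter scaler_type candidates for a given output distribution."""
    if distribution == "NegativeBinomial":
        return [s for s in candidates if s not in CENTERING_OR_SIGNED_SCALERS]
    return candidates

# e.g. when building the HP search space:
scaler_space = allowed_scalers(
    "NegativeBinomial",
    ["identity", "standard", "robust", "minmax", "minmax1"],
)
# -> ["identity", "minmax"]
```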
Kin Gtz. Olivares:
Hey @Antoine SCHWARTZ -CROIX-, we need to figure out the interaction between the scale_decouple technique and the DistributionLoss. For the moment we have the Poisson Mixture working correctly on positive count data. If you would be so kind, could you add an issue on this unresolved scale and distribution interaction? https://github.com/Nixtla/neuralforecast/issues
Antoine SCHWARTZ -CROIX-:
Thanks @Kin Gtz. Olivares, yes I'll do it as soon as I can! Otherwise, for now, it seems that the negative binomial gives poor results on my data, no matter which scaler I choose. I suspect a bad interaction somewhere in the code, as it's the distribution that offers the best performance on the other DeepAR implementations I've been able to test (SageMaker, GluonTS torch & mxnet versions, pytorch-forecasting).
That's it, I've opened 2 issues that summarize the discussions above. Don't hesitate to contact me if you'd like more details!
Cristian (Nixtla):
thanks!