Antoine SCHWARTZ -CROIX-
07/27/2023, 9:07 AM
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
I feel like it's random, but in my last 2 HP tuning runs it happened with scaler_type set to identity (random or root cause, I don't know).
Am I the only one?
Thanks

At the predict step, still with the identity scaler (on DeepAR):
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/neuralforecast/models/deepar.py:487, in DeepAR.forward(self, windows_batch)
484 output = self.loss.domain_map(output)
486 # Inverse normalization
--> 487 distr_args = self.loss.scale_decouple(
488 output=output, loc=y_loc, scale=y_scale
489 )
490 # Add horizon (1) dimension
491 distr_args = list(distr_args)
File /local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/neuralforecast/losses/pytorch.py:771, in nbinomial_scale_decouple(output, loc, scale)
769 alpha = F.softplus(alpha) + 1e-8 # alpha = 1/total_counts
770 if (loc is not None) and (scale is not None):
--> 771 mu *= loc
772 alpha /= loc + 1.0
774 # mu = total_count * (probs/(1-probs))
775 # => probs = mu / (total_count + mu)
776 # => probs = mu / [total_count * (1 + mu * (1/total_count))]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
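The error above boils down to an in-place multiply between a GPU tensor and a CPU tensor inside `nbinomial_scale_decouple`. A minimal sketch of the failure mode and of the `.to(device)` fix, using a toy stand-in class rather than real PyTorch tensors (so it runs without a GPU; `ToyTensor` is not neuralforecast code):

```python
# Toy tensor that tracks its device and, like PyTorch, refuses to
# combine tensors living on different devices.
class ToyTensor:
    def __init__(self, value, device="cpu"):
        self.value = value
        self.device = device

    def to(self, device):
        # Return a copy on the target device (a no-op copy if already there),
        # mirroring the behavior of torch.Tensor.to.
        return ToyTensor(self.value, device)

    def __imul__(self, other):
        if self.device != other.device:
            raise RuntimeError(
                "Expected all tensors to be on the same device, "
                f"but found at least two devices, {self.device} and {other.device}!"
            )
        self.value *= other.value
        return self

mu = ToyTensor(2.0, device="cuda:0")   # model output lives on the GPU
loc = ToyTensor(3.0, device="cpu")     # anchoring loc stayed on the CPU

raised = False
try:
    mu *= loc                          # reproduces the reported error
except RuntimeError as err:
    raised = True

mu *= loc.to(mu.device)                # the fix: move loc to mu's device first
print(mu.value)                        # 6.0
```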
Cristian (Nixtla)
07/27/2023, 2:55 PM

Antoine SCHWARTZ -CROIX-
07/27/2023, 3:06 PM
DistributionLoss. The problem occurs in the *_scale_decouple functions of some distributions when no scaler is applied to the input data.
I've managed to fix the negative binomial on the fly, but I doubt it's optimal:
import torch.nn.functional as F

def nbinomial_scale_decouple(output, loc=None, scale=None):
    """Negative Binomial Scale Decouple

    Stabilizes model's output optimization, by learning total
    count and logits based on anchoring `loc`, `scale`.
    Also adds Negative Binomial domain protection to the distribution parameters.
    """
    mu, alpha = output
    mu = F.softplus(mu) + 1e-8
    alpha = F.softplus(alpha) + 1e-8  # alpha = 1/total_counts
    if (loc is not None) and (scale is not None):
        mu *= loc.to(mu.device)
        alpha /= loc.to(alpha.device) + 1.0
    # mu = total_count * (probs/(1-probs))
    # => probs = mu / (total_count + mu)
    # => probs = mu / [total_count * (1 + mu * (1/total_count))]
    total_count = 1.0 / alpha
    probs = (mu * alpha / (1.0 + mu * alpha)) + 1e-8
    return (total_count, probs)
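As a quick sanity check that the parameterization in the patch stays self-consistent after the device move (my own check with plain Python numbers, not library code): with total_count = 1/alpha and probs = mu*alpha/(1+mu*alpha), the Negative Binomial mean total_count * probs / (1 - probs) algebraically recovers the scaled mu, so moving loc across devices before the in-place multiply does not change the math:

```python
def nb_params(mu, alpha):
    # total_count and probs exactly as computed in nbinomial_scale_decouple
    # (without the 1e-8 stabilizers, to keep the check exact).
    total_count = 1.0 / alpha
    probs = mu * alpha / (1.0 + mu * alpha)
    return total_count, probs

mu, alpha, loc = 4.0, 0.5, 10.0
mu_scaled = mu * loc                           # the `mu *= loc` step
alpha_scaled = alpha / (loc + 1.0)             # the `alpha /= loc + 1.0` step
total_count, probs = nb_params(mu_scaled, alpha_scaled)

# Mean of NegativeBinomial(total_count, probs) in torch's convention:
mean = total_count * probs / (1.0 - probs)
print(mean)  # ≈ 40.0, i.e. recovers mu_scaled
```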
Cristian (Nixtla)
07/27/2023, 3:11 PM

Antoine SCHWARTZ -CROIX-
07/28/2023, 8:32 AM

Kin Gtz. Olivares
07/28/2023, 1:26 PM

Antoine SCHWARTZ -CROIX-
07/28/2023, 2:06 PM

Cristian (Nixtla)
08/01/2023, 5:18 PM