# neural-forecast
Hi, I am training a TFT model. When I set max_steps=10000, everything seems to go well. However, when I increase max_steps to 15000, after a few training rounds the final val_loss_epoch turns out to be 0.000 and the trained model's predictions are all NaNs. Why could this happen? My model settings are:
```python
from neuralforecast.models import TFT
from neuralforecast.losses.pytorch import HuberLoss

# hist_length, horizon, ex_hist_columns, ex_future_columns are defined elsewhere
model = [TFT(
    input_size=hist_length,
    h=horizon,
    max_steps=12000,
    hist_exog_list=ex_hist_columns,
    futr_exog_list=ex_future_columns,
    batch_size=32,
    loss=HuberLoss(),
    windows_batch_size=64,
    inference_windows_batch_size=64,
    num_workers_loader=12,
    early_stop_patience_steps=20,
    random_seed=1234,
    accelerator='gpu',
    # scaler_type='robust',
    devices=1,
    precision='16-mixed',
)]
```
Your training procedure diverged; you may want to reduce the learning rate. You can monitor training through the `models[0].train_trajectories` attribute: https://github.com/Nixtla/neuralforecast/blob/main/neuralforecast/common/_base_windows.py#L85
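A minimal diagnostic sketch, assuming the models were fit with `nf = NeuralForecast(models=model, freq=...)` and that `train_trajectories` holds `(step, train_loss)` pairs (check your installed version; the linked source shows where it is populated):

```python
import matplotlib.pyplot as plt

# After nf.fit(...), the fitted model objects live in nf.models.
tft = nf.models[0]

# train_trajectories is assumed here to be a list of (step, loss) pairs.
steps, losses = zip(*tft.train_trajectories)

plt.plot(steps, losses)
plt.xlabel("training step")
plt.ylabel("train loss")
plt.title("TFT training trajectory")
plt.show()
# A sudden loss spike followed by NaN/0.0 values is the usual signature
# of divergence, especially under 16-bit mixed precision.
```

If the trajectory confirms divergence, a lower `learning_rate` (a TFT constructor argument, default 1e-3) and gradient clipping are common first remedies; `gradient_clip_val` should be forwarded to the Lightning Trainer the same way your `accelerator`/`devices`/`precision` kwargs are. A sketch, reusing the imports and variables from the config above:

```python
model = [TFT(
    input_size=hist_length,
    h=horizon,
    max_steps=15000,
    loss=HuberLoss(),
    learning_rate=1e-4,      # reduced from the 1e-3 default
    gradient_clip_val=1.0,   # assumed forwarded to the Lightning Trainer
    # ... remaining kwargs as in the original config
)]
```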