# neural-forecast
m
Hi, sometimes when I train AutoModels, after a successful nf.fit(), nf.predict() returns NaN. On rare occasions, re-running the same hyperparameter tuning fixes it for some reason. Is it due to my custom config settings (e.g. a specific tune.choice that doesn't work) or something else? Thanks
j
hey. if you use ray (the default) the optimization process has a seed, so it should always produce the same results, unless there's another source of randomness in your code
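As a toy illustration of the seeding point (plain stdlib `random`, not Ray's actual sampler; `sample_config` is a hypothetical stand-in): a sampler built on a fixed seed always draws the same hyperparameters, so run-to-run variation has to come from another source of randomness.

```python
import random

def sample_config(seed):
    # Hypothetical stand-in for a seeded search-space sampler
    rng = random.Random(seed)
    return {
        "learning_rate": rng.choice([1e-1, 1e-3]),
        "batch_size": rng.choice([16, 32]),
    }

# The same seed always yields the same draws...
assert sample_config(42) == sample_config(42)
# ...so nondeterminism must come from elsewhere (e.g. unseeded model init).
```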
m
I see, well would you know why this AutoLSTM returns NaN? I tried an AutoRNN and it works just fine (I set max_steps = 10 for debugging purposes, it didn't work for other values). Thanks in advance
```python
from ray import tune
from neuralforecast import NeuralForecast
from neuralforecast.auto import AutoLSTM
from neuralforecast.losses.pytorch import MAE

horizon = 1

config = {
    "input_size": -1,
    "inference_input_size": -1,
    "encoder_n_layers": tune.choice([2, 4]),
    "encoder_hidden_size": tune.choice([100, 200]),
    "context_size": tune.choice([12, 24]),
    "decoder_hidden_size": tune.choice([64, 128]),
    "learning_rate": tune.choice([1e-1, 1e-3]),
    "max_steps": tune.choice([10]),
    "batch_size": tune.choice([16, 32]),
    "scaler_type": tune.choice(['standard', 'robust']),
    "random_seed": 42,
}

models = [AutoLSTM(h=horizon, config=config, loss=MAE(), gpus=1)]
nf = NeuralForecast(models=models, freq='h')

nf.fit(X_train_val)
nf.save('AutoLSTM', overwrite=True)
```
j
Can you print the best config and try fitting the non auto model with that?
m
Gave it a try but it still outputs NaN.
j
hmm. are you able to reproduce this with fake data so that I can take a look?
m
sure i can try, afterwards did you want me to send over the code and data csv?
j
if you can reproduce it with just code it'd be great, e.g. using something like:
```python
from utilsforecast.data import generate_series

series = generate_series(5, freq='D')
```
m
Okay, I believe I've reproduced something similar. We have a dataframe with the target variable 'y' and a number of exogenous variables. I train/fit the AutoLSTM and then it fails to produce predictions. Let me know if this works!
j
Thanks for the example. Indeed all of the weights are NaN. @Marco do you have time to investigate this/know what's going on here? Here's a minimal example:
```python
from neuralforecast import NeuralForecast
from neuralforecast.models import LSTM
from neuralforecast.losses.pytorch import MAE
from utilsforecast.data import generate_series

series = generate_series(1, freq='h', min_length=6000, max_length=6000)

cfg = {
    'input_size': -1,
    'inference_input_size': -1,
    'encoder_n_layers': 4,
    'encoder_hidden_size': 200,
    'context_size': 12,
    'decoder_hidden_size': 64,
    'learning_rate': 0.1,
    'max_steps': 10,
    'batch_size': 32,
    'scaler_type': 'standard',
    'random_seed': 42,
    'h': 1,
    'loss': MAE(),
    'valid_loss': MAE(),
}
nf = NeuralForecast(
    models=[LSTM(**cfg)],
    freq='h',
)
nf.fit(series)
for k, v in nf.models[0].state_dict().items():
    print(f'Pct NaN {k}: {v.isnan().float().mean().item()}')
```
m
I checked, and on my end the example above works fine: no values are NaN and I can get predictions. However, when I run hyperparameter optimization with Ray, I get OOM errors. On your end, @Micah Denver, did the parameter tuning process work all the way to the end?
j
Do you have access to a GPU instance? It seems to only happen there, if I run it on CPU the weights are ok
Although the prediction is very bad, the minimum is zero and it predicts -18
I think it's maybe the input_size right? 6k is a lot. @Micah Denver can you try with a smaller number? setting
```python
'input_size': 20,
'inference_input_size': 20,
```
works fine on this toy example. especially since your horizon is 1 you probably don't need a lot of history
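A toy, framework-free illustration (not the actual LSTM internals) of why unrolling a recurrence over thousands of steps can end in NaN weights: a gain even slightly above 1, compounded over ~6000 steps, overflows to inf, and inf minus inf is nan.

```python
# Toy illustration (not real LSTM math): a recurrent gain above 1,
# compounded over thousands of unrolled steps, overflows to inf;
# inf - inf = nan, which is how NaNs typically first appear when
# training explodes.
import math

x = 1.0
gain = 1.2             # stand-in for a too-large recurrent weight / lr
for _ in range(6000):  # roughly the series length in the example above
    x *= gain          # float math saturates to inf, no exception raised

diff = x - x           # inf - inf is nan; it then spreads to every weight
print(math.isinf(x), math.isnan(diff))  # → True True
```

With a bounded `input_size`, the unrolled recurrence is far shorter, which is consistent with the fix suggested above.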
m
Hi, thanks for working on this. @Marco Yes on my side the LSTM works until the end, no visible issues until I try to predict (I use an RTX 3070Ti GPU and it takes 10-20 mins). @José Morales Sounds good, I will give that a try although I’m wondering why all the other AutoModels I tried could handle it.
Update: it worked with input_size = inference_input_size = 24!
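For anyone landing here later, a sketch of the config change that resolved the thread (only the entries mentioned here; everything else stays as in the original config):

```python
config = {
    "input_size": 24,            # bounded lookback instead of -1 (full history)
    "inference_input_size": 24,  # keep the inference window consistent
    # ... remaining keys unchanged from the config above
}
```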
Thank you both 🤝