# neural-forecast
a
Hey! I have a question about hyperparameter tuning with Optuna. I am using multiple (4) GPUs to distribute the training of my AutoTFT, so at the end of each trial I end up with multiple validation loss values (see screenshot). How should I interpret them? Also, in the screenshot we can see different best trials across processes (namely trial 0 and trial 2), which causes an error during the initialization of the model with the best hyperparameter combination:
RuntimeError: [2]: params[0] in this process with sizes [6, 96] appears not to match sizes of the same param in process 0.
Would you have any idea how to tackle this while keeping the parallelization? Thanks a lot!
j
Hey. You need to set the seed of the sampler to ensure that all processes get the same configuration, e.g.
AutoTFT(..., search_alg=optuna.samplers.TPESampler(seed=0))
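For context, here is a minimal plain-Optuna sketch (not neuralforecast code) of what the seed buys you: every process that builds its own sampler with the same seed proposes exactly the same sequence of trial configurations, so all DDP ranks evaluate the same hyperparameters.

```python
# Minimal sketch with plain Optuna (assumption: this mirrors what each GPU
# process does when every rank builds its own study and sampler).
import optuna

def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return x ** 2

for rank in range(2):  # stand-in for two GPU processes
    study = optuna.create_study(sampler=optuna.samplers.TPESampler(seed=0))
    study.optimize(objective, n_trials=3)
    print(rank, [t.params for t in study.trials])
# Both "ranks" print identical parameter sequences because the sampler seed
# is fixed.
```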
a
Hi José, thank you for the answer! I am already using the seed setting in my codebase:
from optuna.samplers import TPESampler

models = [
    AutoTFT(
        h=config.forecast.horizon,
        loss=LOSSES[config.model.loss.name](**config.model.loss.kwargs),
        backend="optuna",
        search_alg=TPESampler(seed=0),
        gpus=config.model.devices,
        config=config_tft,
        num_samples=3,
    )
]
j
Ah, I see. The error happens when training the best model at the end because each process chooses a different one. Are you using v1.7.1?
a
I am using 1.7.0
j
Can you upgrade to 1.7.1? We added sync_dist=True (https://github.com/Nixtla/neuralforecast/blob/9d3f393ce5990603f0da8ed103a67e76a090676c/neuralforecast/common/_base_model.py#L324), which may help all processes get the same score in each trial.
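Roughly, this is the PyTorch Lightning logging pattern involved; the sketch below is illustrative only, not the exact neuralforecast code. With sync_dist=True, Lightning all-reduces the logged value across DDP ranks, so every GPU reports the same averaged validation loss instead of its own shard's value.

```python
# Illustrative sketch of Lightning's sync_dist behaviour, not the exact
# neuralforecast implementation.
import torch
import pytorch_lightning as pl

class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # sync_dist=True averages the logged value across all DDP processes,
        # so every rank reports the same validation loss for the trial.
        self.log("valid_loss", loss, sync_dist=True)
        return loss
```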
a
Will try! Thanks for the support!
It seems to work! Thanks a lot! Also, it's super nice to have the logs during training now.