@Cristian (Nixtla) Yes, it is directly inspired by Hyperband, but applied to the random seed. Unfortunately, even when I decrease the learning rate, the behavior remains similar. In particular, with some seeds the model fails to predict certain seasonal peaks in the time series and produces flatter predictions instead, even when I increase the number of heads or the size of the hidden layers.
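
For readers unfamiliar with the idea, here is a minimal sketch of a Hyperband-style search over random seeds. It shows only a single successive-halving bracket (full Hyperband runs several brackets with different starting budgets), and it is not the actual code used here. `train_and_score`, `successive_halving_over_seeds`, the budget values, and `eta` are all hypothetical names and settings chosen for illustration; the toy scoring function stands in for training a real model.

```python
import random


def train_and_score(seed: int, budget: int) -> float:
    """Hypothetical stand-in for training with `seed` for `budget` epochs.

    Returns a toy validation loss (lower is better). In practice this would
    train the forecasting model and evaluate it on a validation window.
    """
    rng = random.Random(seed)
    base = rng.uniform(0.5, 1.5)      # seed-dependent quality (toy assumption)
    return base / (1 + 0.1 * budget)  # loss shrinks as the budget grows


def successive_halving_over_seeds(seeds, min_budget=1, eta=2):
    """Keep the best 1/eta of the seeds each round, multiplying the budget by eta."""
    survivors = list(seeds)
    budget = min_budget
    history = []  # (budget, number of seeds evaluated) per round
    while len(survivors) > 1:
        history.append((budget, len(survivors)))
        ranked = sorted(survivors, key=lambda s: train_and_score(s, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0], history


if __name__ == "__main__":
    best_seed, history = successive_halving_over_seeds(range(8))
    print("best seed:", best_seed)
    print("rounds (budget, seeds evaluated):", history)
```

One caveat with this kind of scheme: seeds that look good at a small budget are not guaranteed to be the ones that capture the seasonal peaks after full training, so the early-stopping criterion should reflect the behavior that matters.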