@Cristian (Nixtla)
"The performance of the nhits is very stable for different values of hyperparameters."
This is interesting. Is there a reason for this, or are there articles that explain it in more detail?
I was expecting that a "specific" set of parameters would emerge after repeating the N-fold cross-validations.
For instance, in the initial folds, I might discover that using the 'Sigmoid' activation function consistently results in the lowest error.
I extended this approach to other parameters, searching for combinations that minimize error.
Once I had narrowed down the list of parameter values to a smaller subset, I expected to find better models.
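To make the approach above concrete, here is a minimal sketch of a fold-wise search: for each fold, every hyperparameter combination is scored and the winner is recorded, and then we check whether a single configuration dominates across folds. The `fold_error` function is a hypothetical stand-in for actually training NHITS on a fold; the parameter names and values in the grid are illustrative assumptions, not the real search space.

```python
# Hypothetical sketch of fold-wise hyperparameter selection.
# fold_error() is a stand-in for training a model (e.g. NHITS) on one
# fold and returning its validation error.
from itertools import product
from collections import Counter
import random

random.seed(0)

# Illustrative grid; real runs would use the actual NHITS search space.
param_grid = {
    "activation": ["ReLU", "Sigmoid", "Tanh"],
    "n_blocks": [1, 2, 3],
}

def fold_error(config, fold):
    """Stand-in for a real cross-validation routine. Errors are drawn
    close together on purpose, mimicking the stability described above."""
    base = 0.50 if config["activation"] == "Sigmoid" else 0.52
    return base + random.gauss(0, 0.02)

# Enumerate every combination in the grid.
configs = [dict(zip(param_grid, vals)) for vals in product(*param_grid.values())]

# Count how often each configuration wins a fold.
winners = Counter()
for fold in range(10):
    best = min(configs, key=lambda c: fold_error(c, fold))
    winners[(best["activation"], best["n_blocks"])] += 1

print(winners)
```

If the per-fold errors are genuinely close, the "best" configuration tends to change from fold to fold, which is consistent with no single winning set of parameters emerging.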
However, as you pointed out, it's entirely possible that "Models with significantly different configuration can produce very similar good results."
If there are any resources or articles that delve into why this occurs, I would greatly appreciate the chance to learn more about this topic.