# mlforecast
b
Is there a way to access multiple top performing trials for the Auto formulas? auto_mlf.results_['AutoLightGBM'].best_trial.user_attrs['config'] only gives the best one. However, I want to determine per unique id in cross validation which configuration is the best.
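If `auto_mlf.results_['AutoLightGBM']` is an Optuna study (which the `best_trial.user_attrs['config']` access suggests), the remaining trials should be reachable through its `trials` list. A minimal sketch, assuming every completed trial stores its configuration under `user_attrs['config']`; `top_k_configs` is a hypothetical helper, not part of mlforecast:

```python
def top_k_configs(study, k=5):
    """Return (value, config) pairs for the k best completed trials.

    Assumes an Optuna-style study: `study.trials` is a list of trials with
    `.state`, `.value` and `.user_attrs`, and `study.direction` says whether
    lower or higher values are better.
    """
    # Keep only trials that finished and reported an objective value.
    completed = [
        t for t in study.trials
        if t.state.name == "COMPLETE" and t.value is not None
    ]
    # Sort ascending for minimization, descending for maximization.
    maximize = study.direction.name == "MAXIMIZE"
    ranked = sorted(completed, key=lambda t: t.value, reverse=maximize)
    # 'config' is the user attribute referenced in the question above.
    return [(t.value, t.user_attrs["config"]) for t in ranked[:k]]


# Hypothetical usage:
# top5 = top_k_configs(auto_mlf.results_["AutoLightGBM"], k=5)
```

Optuna studies also offer `trials_dataframe()`, which may be more convenient for browsing all trials at once.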
j
This is rather atypical. It is more common to find the best params for 2-3 models and then apply each model to the ids it works best on. But I am also not sure whether this is automatically implemented in mlforecast.
b
I have indeed found the best params for 4-5 different models and applied these models to all ids, as I need that in order to reconcile to a total level later on. Some unique IDs perform worse on the overall best params, so I thought it might be an idea to find the top 5 configs and run CV per unique ID.
o
I believe you're overthinking this. You'll always find a model that performs best on one unique_id but is beaten by another model on a different unique_id. By trying to select the best model from cross-val for every unique_id, you are extremely likely to overfit and perform worse in a true test setting. Bottom line: keep things simple. Just cross-val a few models using hyperparam search, select the best model, create predictions. This is much more likely to lead to good test results than doing a micro-optimization per unique_id.
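One way to check this overfitting concern on your own data is to pick models on the earlier cross-validation windows and then score both selection strategies on the last window. A rough sketch, assuming a cross-validation frame with `unique_id`, `cutoff`, `y` and one prediction column per model, at least two cutoffs, and every id present in the last window; the helper names are hypothetical:

```python
import pandas as pd


def per_id_mae(cv_df: pd.DataFrame, model_cols: list[str]) -> pd.DataFrame:
    """Mean absolute error per unique_id (rows) and model (columns)."""
    abs_err = cv_df[model_cols].sub(cv_df["y"], axis=0).abs()
    return abs_err.groupby(cv_df["unique_id"]).mean()


def compare_selection(cv_df: pd.DataFrame, model_cols: list[str]) -> dict:
    """Select on all but the last cutoff, then evaluate on the last cutoff."""
    cutoffs = sorted(cv_df["cutoff"].unique())
    select = cv_df[cv_df["cutoff"] < cutoffs[-1]]
    holdout = cv_df[cv_df["cutoff"] == cutoffs[-1]]

    sel_mae = per_id_mae(select, model_cols)
    hold_mae = per_id_mae(holdout, model_cols)

    # Strategy 1: one model for all ids (lowest average per-id MAE).
    global_model = sel_mae.mean().idxmin()
    # Strategy 2: the best model for each id separately.
    per_id_models = sel_mae.idxmin(axis=1)

    global_score = float(hold_mae[global_model].mean())
    per_id_score = sum(
        hold_mae.loc[uid, model] for uid, model in per_id_models.items()
    ) / len(per_id_models)

    return {
        "global_model": global_model,
        "per_id_models": per_id_models.to_dict(),
        "global_mae": global_score,
        "per_id_mae": float(per_id_score),
    }
```

If `per_id_mae` on the held-out window is not clearly lower than `global_mae`, the per-id selection is probably fitting noise in the earlier windows.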
j
Very true. I think he struggles atm because his ARIMA is better than all ML models 🙃
b
Hi, thanks for your response. Currently I have one configuration per ML model (I have four different ML models). I thought about using 3-4 configurations per ML model and cross validating them separately from the fit function per unique ID. Like Jan indeed mentioned, I am struggling because my ARIMA is wayyyy better than all models and I am trying to figure out what exactly I can do to improve the current ML models.
j
Have you looked at some plots? Often it helps a lot to understand where the error is happening. If ARIMA is so much better, I'd actually assume that the data is not so hard to forecast and follows some clear seasonal patterns overall. Maybe your ML models need more specific features. Maybe by plotting you'll see that you need more lags or different calendar features or so. Maybe also share some plots here if you want more feedback.
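To make "more lags or different calendar features" concrete, here is a small pandas sketch of what such features look like for a long-format frame with `unique_id`, `ds` (datetime) and `y`. The function name and the default lags are illustrative assumptions; in mlforecast itself the `lags` and `date_features` arguments of `MLForecast` serve this purpose.

```python
import pandas as pd


def add_lag_and_calendar_features(df: pd.DataFrame, lags=(1, 7)) -> pd.DataFrame:
    """Add per-series lag columns and simple calendar columns."""
    out = df.sort_values(["unique_id", "ds"]).copy()
    for lag in lags:
        # Shift within each series so values never leak across ids.
        out[f"lag{lag}"] = out.groupby("unique_id")["y"].shift(lag)
    # Calendar features derived from the timestamp column.
    out["dayofweek"] = out["ds"].dt.dayofweek  # Monday = 0
    out["month"] = out["ds"].dt.month
    return out
```

Choosing lags that match the seasonal period visible in the plots (for example 7 for weekly patterns in daily data) is usually a reasonable first step.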