# mlforecast
f
Hey team, I have a question wrt mlforecast fit and the use of a validation size. Currently the fit() method of the MLForecast class does not support validation size. In neuralforecast there is a val_size parameter for it. Is there any way to achieve a similar behaviour in MLForecast to get early_stopping to work? Or is there a reason why this cannot work at all?
👀 1
j
Hey. The problem is that since we can take any scikit-learn compatible model there's no easy way to do it, because some models don't work on a per-iteration basis (e.g. RandomForest), others do but have different interfaces for providing validation sets and early stopping rounds (e.g. GBDTs). If you want something that works out of the box you can try LightGBMCV, which implements early stopping based on the loss of the full forecasting horizon. If you want to use a different model you can follow this guide to generate the training set with MLForecast and then split it any way you want, train the models with their native APIs and then give them back to mlforecast for the forecasting step.
f
Awesome, thanks for the quick answer! That's what I was already discussing with my coworkers, glad to see I was on the right track, but also sad there's no shortcut 🙂 Would it benefit anyone, and is there a place to share the final code as an example or something?
j
Do you mean you want us to provide an example or do you want to share your code as an example?
f
No, if I manage to do it, should I post it as an example somewhere?
j
I think we could add it as an extra case in the custom training guide, i.e. case 1 using weights and case 2 performing early stopping
you could also define a class that wraps the model you're training, taking the val size on init and performing the early stopping during fit; that'd allow you to just pass the model to MLForecast.fit and have it early stop automatically
👍 1
j
Maybe I'm a bit late here, but we recently implemented early stopping with Optuna. I'm still using manual tuning (when I started building it there was no automl fc), but since I just discovered there's no way to define val_length, we might stick with this. Anyway, here is a simple early stopping logic (based on a percentage improvement threshold):
```python
def early_stopping(self, study, trial):
    # assumes self.best_value was initialized (e.g. to float('inf'))
    # and self.best_step to 0
    if trial.value is None:  # pruned trials have no value
        return
    current_value = trial.value
    # relative improvement over the best value seen so far
    improvement = self.best_value / current_value - 1
    if improvement > self.percentage:
        self.best_value = current_value
        self.best_step = trial.number
    elif trial.number - self.best_step >= self.patience:
        print(f"Early stopping triggered at trial {trial.number}.")
        study.stop()
```
then you can run it like this:
```python
def run_optuna(self, n_trials: int = 20) -> Dict[str, Any]:
    """
    Initiates Optuna optimization to find the best hyperparameters.

    Parameters:
    - n_trials (int, optional): The number of optimization trials. Default is 20.

    Returns:
    - Dict[str, Any]: A dictionary containing the best hyperparameters found.
    """
    study = optuna.create_study(
        direction="minimize", pruner=optuna.pruners.MedianPruner()
    )
    study.optimize(self.objective, n_trials=n_trials, callbacks=[self.early_stopping])
    return study.best_params
```