# mlforecast
s
Hi team, I've used the AutoMLForecast + AutoLightGBM combination, as suggested in the example, to train on a time series. However, when I attempt to reproduce the results using the cross-validation function on a custom LGBM instance with the discovered optimized hyperparameters and lag features, I can't achieve the same performance. My assumption is that the model applies some target transformation based on the season length, but I'm not certain. Could anyone clarify this? Additionally, how can one specify target transforms as part of the my_init_config function, as shown in the example on the website? When I use a simple log-difference combination, as I typically do with cross-validation, the loss function returns NaN. For the loss function I am using MAE, as described here:
```
def custom_loss(df, train_df):
    return mae(df, models=["model"])["model"].mean()
```
Any guidance on these matters would be greatly appreciated. Thank you!
j
Hey. How are you running the custom LGBM instance? You should be able to get the same result with something like this:
```
best_config = auto_mlf.results_['AutoLightGBM'].best_trial.user_attrs['config']
my_lgb = LGBMRegressor(**best_config['model_params'])
my_mlf = MLForecast(models=my_lgb, freq=my_freq, **best_config['mlf_init_params'])
# use the same settings (n_windows, h, refit) as the auto model
my_mlf.cross_validation(df, n_windows=n_windows, h=h, refit=False)
```
How do you specify the log difference? As `[GlobalSklearnTransformer(FunctionTransformer(np.log1p, np.expm1)), Differences(...)]`?
s
Yes I specify the log difference as you suggested:
```
target_transforms=[
    GlobalSklearnTransformer(FunctionTransformer(func=np.log1p, inverse_func=np.expm1)),
    Differences([season]),
]
```
As for the usage, I am using LightGBMCV with the model params and input features, with the same number of windows and the same horizon.
j
The default of the auto model is not to refit. Are you setting `refit=False` in the cross_validation call?
s
Yes I've done so
Okay, so upon further inspection I've realised that if you provide your own config in fit, it won't apply any target transforms. That's now even more confusing, because if I use the code you suggested above I should be able to recreate the performance, but I can't. And I'm still struggling to use my own target transforms, as that simply leads to -inf loss.
j
The target transforms are provided in the init config. Do you have values less than zero in your data? I'm not sure if log1p raises an error or returns NaN for negative values
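For what it's worth, NumPy's log1p doesn't raise on invalid input; a minimal sketch of its edge cases (which would explain a NaN or -inf appearing after the transform):

```python
import math
import numpy as np

with np.errstate(divide='ignore', invalid='ignore'):
    ok = np.log1p(-0.5)            # valid: log(0.5)
    at_minus_one = np.log1p(-1.0)  # -inf, since log(0) diverges
    below = np.log1p(-2.0)         # NaN: log of a negative number

print(ok, at_minus_one, below)
```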
s
If I provide my own init_config, it overrides and skips the self._seasonality_based_config function call.
So now I'm back to square one: given that the best params are correct, why can't I reproduce the result using a simple MLForecast object?
j
Here's an example:
```
import math

import lightgbm as lgb
from mlforecast import MLForecast
from mlforecast.auto import AutoMLForecast, AutoLightGBM
from mlforecast.utils import generate_series
from utilsforecast.losses import smape

series = generate_series(10, min_length=100)
auto = AutoMLForecast(
    models={'lgb': AutoLightGBM()},
    freq="D",
    season_length=7,
)
auto.fit(series, n_windows=2, h=7, num_samples=5)
best_trial = auto.results_['lgb'].best_trial
auto_res = best_trial.value  # the trial's objective value
best_config = best_trial.user_attrs['config']
mlf = MLForecast(
    models={'lgb': lgb.LGBMRegressor(**best_config['model_params'])},
    freq="D",
    **best_config['mlf_init_params'],
)
cv_res = mlf.cross_validation(series, n_windows=2, h=7, refit=False)
# score each (series, cutoff) pair separately, then average, as the auto model does
cv_res['id_cutoff'] = cv_res['unique_id'].astype(str) + '_' + cv_res['cutoff'].astype(str)
manual_res = smape(cv_res, models=['lgb'], id_col='id_cutoff')['lgb'].mean()
assert math.isclose(auto_res, manual_res)
```
Is that what you're trying to reproduce? The trial score?
s
Yes
Okay it is working now! I must've had a logical error somewhere in my code! Thank you!
🙌 1