# mlforecast
s
I also found an issue using lgbmcv. Basically, I have fixed lgbm params that I fed into lgbmcv, and I got a pretty decent forecast (screenshot1), but if I fit another lgbm model from lgbmcv by
MLForecast.from_cv()
I got zig-zaggy predictions (screenshot2). I'm not sure whether this indicates anything, but I can't find a way to explain the discrepancy, as all the boosters in lgbmcv gave me the same "shape".
Following up on this. I ran a few more experiments: using LGBMRegressor without MLForecast, with the same settings (lgbm params) and features, I get a very different outcome. I even excluded all the lag/date features and still see big discrepancies. Any insight into this? All I can think of is that something differs in the defaults, but to my understanding MLForecast is just a wrapper around LGBMRegressor.
j
can you provide an example? I have no idea what you're doing
s
I'm basically doing as follows to compare -
import numpy as np
import pandas as pd
from mlforecast import MLForecast
import lightgbm as lgb

lgb_params = {
    'boosting_type': 'gbdt',
    'num_leaves': 2**8-1,
    'subsample': 0.5,
    'subsample_freq': 1,
    'learning_rate': 0.01,
    'n_estimators': 3000,
    'verbose': -1,
}
model1 = lgb.LGBMRegressor(**lgb_params, verbosity=-1)
model1.fit(x_train, y_train)

model2 = MLForecast(
    models={'avg': lgb.LGBMRegressor(**lgb_params)},
    freq='1h',  # freq is a string/offset, not a list
    lags=[],
    lag_transforms={},  # expects a dict
    date_features=[],
)
model2.fit(df, static_features=[])


model1.predict(x_test)
model2.predict(h=len(x_test), X_df=x_test)
model1 and model2 give me different results
j
if you can provide something that runs I can look into it further, but this passes:
import lightgbm as lgb
import numpy as np
from mlforecast import MLForecast
from utilsforecast.data import generate_series
from utilsforecast.feature_engineering import fourier

freq = 'D'
h = 5
series = generate_series(2, freq=freq)
train, future = fourier(series, k=1, freq=freq, h=h, season_length=7)
x_train = train.drop(columns=['unique_id', 'ds', 'y'])
y_train = train['y']
x_test = future.drop(columns=['unique_id', 'ds'])

lgb_params = {
    'boosting_type': 'gbdt',
    'num_leaves': 2**8-1,
    'subsample': 0.5,
    'subsample_freq': 1,
    'learning_rate': 0.01,
    'n_estimators': 3000,
    'verbose': -1,
}
model1 = lgb.LGBMRegressor(**lgb_params, verbosity=-1)
model1.fit(x_train, y_train)
model2 = MLForecast(
    models={'avg': lgb.LGBMRegressor(**lgb_params)},
    freq=freq,
)
model2.fit(train, static_features=[])
np.testing.assert_allclose(
    model1.predict(x_test),
    model2.predict(h=h, X_df=future)['avg'],
)
s
yeah, I will do that. It will take me a bit to clean things up and get it ready for you to test. I'll let you know here once I'm done.