# mlforecast
s
I also found an issue using lgbmcv. Basically, I have fixed lgbm params that I fed into lgbmcv, and I got a pretty decent forecast (screenshot1), but if I fit another lgbm model from lgbmcv by
MLForecast.from_cv()
I got zig-zaggy predictions (screenshot2). I'm not sure whether this indicates anything, but I can't find a way to explain the discrepancy, as all the boosters in lgbmcv gave me the same "shape".
Following up on this. I ran a few more experiments: using LGBMRegressor without MLForecast, with the same settings (lgbm params) and features, I get a very different outcome. I even excluded all the lag/date features and still see big discrepancies. Any insight into this? All I can think of is that something differs in the defaults, but to my understanding MLForecast is just a wrapper around LGBMRegressor.
j
can you provide an example? I have no idea what you're doing
s
I'm basically doing as follows to compare -
import numpy as np
import pandas as pd
from mlforecast import MLForecast
import lightgbm as lgb

lgb_params = {
    'boosting_type': 'gbdt',
    'num_leaves': 2**8-1,
    'subsample': 0.5,
    'subsample_freq': 1,
    'learning_rate': 0.01,
    'n_estimators': 3000,
    'verbose': -1,
}
model1 = lgb.LGBMRegressor(**lgb_params, verbosity=-1)
model1.fit(x_train, y_train)

model2 = MLForecast(
    models={'avg': lgb.LGBMRegressor(**lgb_params)},
    freq='1h',  # freq is a string/offset, not a list
    lags=[],
    lag_transforms={},  # expects a dict
    date_features=[],
)
model2.fit(df, static_features=[])


model1.predict(x_test)
model2.predict(h=len(x_test), X_df=x_test)
model1 and model2 give me different results
j
if you can provide something that runs I can look into it further, but this passes:
import lightgbm as lgb
import numpy as np
from mlforecast import MLForecast
from utilsforecast.data import generate_series
from utilsforecast.feature_engineering import fourier

freq = 'D'
h = 5
series = generate_series(2, freq=freq)
train, future = fourier(series, k=1, freq=freq, h=h, season_length=7)
x_train = train.drop(columns=['unique_id', 'ds', 'y'])
y_train = train['y']
x_test = future.drop(columns=['unique_id', 'ds'])

lgb_params = {
    'boosting_type': 'gbdt',
    'num_leaves': 2**8-1,
    'subsample': 0.5,
    'subsample_freq': 1,
    'learning_rate': 0.01,
    'n_estimators': 3000,
    'verbose': -1,
}
model1 = lgb.LGBMRegressor(**lgb_params, verbosity=-1)
model1.fit(x_train, y_train)
model2 = MLForecast(
    models={'avg': lgb.LGBMRegressor(**lgb_params)},
    freq=freq,
)
model2.fit(train, static_features=[])
np.testing.assert_allclose(
    model1.predict(x_test),
    model2.predict(h=h, X_df=future)['avg'],
)
s
yeah, I will do that. It will take me a bit to clean things up and get it ready for you to test. I'll let you know here once I'm done.