#general


Andrei Tulbure

11/25/2022, 2:25 PM
Hi. Anybody had any issues with the LightGBM regressor/ETS returning the same values for each forecasting step?


Max (Nixtla)

11/25/2022, 2:26 PM
Which implementation of LightGBM are you using?


Andrei Tulbure

11/25/2022, 2:27 PM
lightgbm (?) from Microsoft


Max (Nixtla)

11/25/2022, 2:28 PM
I mean, are you using something like github.com/nixtla/mlforecast or skforecast?


Andrei Tulbure

11/25/2022, 2:28 PM
Your framework, of course

❤️ 1

```
model = lgb.LGBMRegressor
models_ml = model(**params)
ml_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    lags=[1, 2, 3, 4, 5, 6, 7, 8],
)
ml_fcst.fit(
    df_train,
    id_col="unique_id",
    time_col=column_name_for_date,
    target_col=column_name_for_qty,
)
pred_ml_df = ml_fcst.predict(weeks_to_forecast)
```

sorry for the bad formatting


Max (Nixtla)

11/25/2022, 2:29 PM

José Morales

11/25/2022, 2:38 PM
Does this happen with statsforecast ETS as well?


Andrei Tulbure

11/25/2022, 2:40 PM
I use only your frameworks/libs, so yes, ETS from statsforecast


José Morales

11/25/2022, 2:42 PM
I'd say it's probably related to your data. Are you able to share a small sample?


Andrei Tulbure

11/25/2022, 2:43 PM
the dataset is quite small, around 300 rows, and the variance isn't huge

yes I can share

```
model = ETS
models_stats = [model(**params)]
stats_fcst = StatsForecast(
    df=df_train, models=models_stats, freq="W-MON", n_jobs=-1
)
stats_fcst.fit(df=df_train)
pred_stats_df = stats_fcst.predict(weeks_to_forecast)
```

data frame:

results


fede (nixtla) (they/them)

11/25/2022, 5:39 PM
hey **@Andrei Tulbure**! Which `params` are you using? The default `season_length` is 1.
ETS finds the best model with and without trend. If the data does not exhibit a clear trend, it will opt for the trendless model, so the forecast will be a flat line. Although visually not very attractive, this is the expected behavior: in its trendless and seasonality-free version, ETS is a weighted average of past observations.
If you want to explore the seasonality of the weekly data, I recommend modeling it with MSTL. For that model you can use `season_length=52`. (ETS currently does not allow season lengths greater than 24.)
As for LightGBM, we have seen that it is quite good at capturing seasonalities, but be sure to include the relevant features. One way to model seasonality in the `MLForecast` library is by using the `differences` argument (https://nixtla.github.io/mlforecast/forecast.html).
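A minimal pure-Python sketch of the flat-line behavior described above, using simple exponential smoothing (the trendless, seasonality-free special case of ETS). The series and the smoothing parameter `alpha` here are illustrative, not fitted by any library:

```python
def ses_forecast(y, alpha, h):
    """Simple exponential smoothing: the level is a weighted average of
    past observations; with no trend or seasonal component, every
    h-step-ahead forecast equals that final level."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    # The forecast is flat: the same level repeated h times.
    return [level] * h

series = [10.0, 12.0, 11.0, 13.0, 12.0, 11.5]
print(ses_forecast(series, alpha=0.5, h=4))
# [11.75, 11.75, 11.75, 11.75] -- a flat line, as expected
```

This is why identical values for every forecasting step are not necessarily a bug: without a trend or seasonal component, there is nothing in the model to make the horizon steps differ.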

Andrei Tulbure

11/28/2022, 10:36 AM

Darius Marginean

11/28/2022, 10:56 AM
Hello all! Do you happen to know if it's a problem for ETS if 5 rows have 0 as the value for y?

My dataset contains 260 inputs, of which 4 or 5 contain 0 as y.

And when I've tried training the ETS model with `season_length=24` (as **@fede (nixtla) (they/them)** mentioned above) I get the following error: ValueError: Multiplicative seasonality is not appropriate for zero and negative values

However, if `season_length` is disabled I don't receive this error and the training runs end to end (with almost the same predictions as **@Andrei Tulbure** posted above, and by almost the same predictions I mean the same value for all the predictions)

Also **@fede (nixtla) (they/them)**, I don't quite get what this `differences` parameter is. Can you please provide more details about how I should set it? Thank you in advance 🙂


Andrei Tulbure

11/28/2022, 12:31 PM

fede (nixtla) (they/them)

11/28/2022, 5:54 PM
hey **@Darius Marginean**! ETS had a problem handling data with zeros. It is fixed in the latest version of StatsForecast. You can install it using `pip install -U statsforecast` or `pip install "statsforecast>=1.3.2"`. Please let me know if the problem persists after changing the version. 🙌

✅ 1

Yes, precisely **@Andrei Tulbure**! With the `differences` argument, you can make the time series stationary; it is similar to the integrated component of an ARIMA model. In the context of mlforecast, it allows considering trend and seasonality, **@Darius Marginean**.
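A small pure-Python illustration of the idea (with made-up toy numbers): differencing removes a trend, and the integration step adds it back, which is what the "I" in ARIMA does:

```python
def difference(y, d=1):
    """Differencing: y'[t] = y[t] - y[t - d]."""
    return [y[t] - y[t - d] for t in range(d, len(y))]

def integrate(last_value, diffs):
    """Undo differencing: cumulatively add the forecast differences
    back on top of the last observed value."""
    out = []
    level = last_value
    for delta in diffs:
        level += delta
        out.append(level)
    return out

# A trending series: after differencing it is stationary (constant here).
y = [10, 12, 14, 16, 18]
print(difference(y))                 # [2, 2, 2, 2]
# Forecast the (stationary) differences, then integrate back:
future_diffs = [2, 2]                # e.g. a naive forecast of the differences
print(integrate(y[-1], future_diffs))  # [20, 22]
```

The model is trained on the differenced series, and the library reverses the transformation at predict time, so the trend reappears in the final forecast.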

Darius Marginean

11/29/2022, 9:54 AM
Thanks **@fede (nixtla) (they/them)**! Indeed the problem with ETS was because I'd used statsforecast 1.3.1; after upgrading to 1.3.2 the problem didn't persist anymore!

❤️ 1

Could you provide best practices for setting `differences`? On the dataset **@Andrei Tulbure** sent here, for example? Thank you in advance once again, **@fede (nixtla) (they/them)**!

And also, do you happen to know what I'm missing in implementing the LGBMClassifier model for a classification task? I want to train the LGBMClassifier to predict binary classes (0 or 1), and the predictions I get are always the same (all 1s or all 0s).

```
model = lgb.LGBMClassifier
models_ml = model(**best_hp)
ml_binary_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    # differences=[1],
)
ml_binary_fcst.fit(
    train_binary_df, id_col="unique_id", time_col="ds", target_col="y"
)
pred_ml_binary_df = ml_binary_fcst.predict(weeks_to_forecast)
```

I tried setting the `differences` param like this:

`differences = [1]`

and here is the output:

And that's how I implemented the model:

```
model = lgb.LGBMRegressor
models_ml = model(**best_hp)
ml_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    differences=[1],
    lags=[i for i in range(1, weeks_to_forecast + 1)],
)
ml_fcst.fit(train_df, id_col="unique_id", time_col="ds", target_col="y")
pred_ml_df = ml_fcst.predict(weeks_to_forecast)
```

where `best_hp` are the best params Optuna found using Bayesian search (TPESampler) after a certain number of trials. Any idea why the results are this bad? Am I missing something?


fede (nixtla) (they/them)

12/01/2022, 7:42 PM
hey **@Darius Marginean**! Maybe your data don't exhibit a clear trend. In that case, omitting `differences` might be a good option. Also, since you're working with weekly data, you could pass `differences=[52]` to capture yearly seasonality.

🙏 1
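A pure-Python sketch of what a seasonal difference like `differences=[52]` does conceptually. The toy series below uses a period of 4 instead of 52 to keep it short; subtracting the value from one season earlier removes the repeating pattern:

```python
def seasonal_difference(y, season_length):
    """Seasonal differencing: y'[t] = y[t] - y[t - season_length]."""
    return [y[t] - y[t - season_length] for t in range(season_length, len(y))]

# A purely seasonal series with period 4 (a stand-in for 52 weekly periods):
y = [1, 5, 2, 8] * 3
print(seasonal_difference(y, season_length=4))
# [0, 0, 0, 0, 0, 0, 0, 0] -- the seasonal pattern is gone
```

After this transformation the model only has to learn the deviations from last year's pattern, which is typically much easier than learning the raw seasonal shape.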
