#general


Andrei Tulbure

11/25/2022, 2:25 PM
Hi. Anybody had any issues with the LightGBM regressor/ETS returning the same values for each forecasting step?


Max (Nixtla)

11/25/2022, 2:26 PM
Which implementation of LightGBM are you using?


Andrei Tulbure

11/25/2022, 2:27 PM
lightgbm (?) from Microsoft


Max (Nixtla)

11/25/2022, 2:28 PM
I mean, are you using something like github.com/nixtla/mlforecast or skforecast?


Andrei Tulbure

11/25/2022, 2:28 PM
Your framework, of course

❤️ 1

```
model = lgb.LGBMRegressor
models_ml = model(**params)
ml_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    lags=[1, 2, 3, 4, 5, 6, 7, 8],
)
ml_fcst.fit(
    df_train,
    id_col="unique_id",
    time_col=column_name_for_date,
    target_col=column_name_for_qty,
)
pred_ml_df = ml_fcst.predict(weeks_to_forecast)
```

sorry for the bad formatting


Max (Nixtla)

11/25/2022, 2:29 PM

José Morales

11/25/2022, 2:38 PM
Does this happen with statsforecast ETS as well?


Andrei Tulbure

11/25/2022, 2:40 PM
I use only your frameworks/libs, so yes, ETS from statsforecast


José Morales

11/25/2022, 2:42 PM
I'd say it's probably related to your data. Are you able to share a small sample?


Andrei Tulbure

11/25/2022, 2:43 PM
the dataset is quite small, around 300 rows, and the variance isn't huge

yes I can share

```
model = ETS
models_stats = [model(**params)]
stats_fcst = StatsForecast(
    df=df_train, models=models_stats, freq="W-MON", n_jobs=-1
)
stats_fcst.fit(df=df_train)
pred_stats_df = stats_fcst.predict(weeks_to_forecast)
```

data frame:

results


fede (nixtla) (they/them)

11/25/2022, 5:39 PM
hey **@Andrei Tulbure**! Which `params` are you using? The default `season_length` is 1.
ETS finds the best model with and without trend. If the data does not exhibit a clear trend, it will opt for the trendless model, so the forecast will be a flat line. Although visually not very attractive, this is the expected behavior: in its trendless and seasonality-free version, ETS is a weighted average of past observations.
If you want to explore the seasonality of the weekly data, I recommend modeling it with MSTL. For that model you can use `season_length=52`. (ETS currently does not allow season lengths greater than 24.)
As for LightGBM, we have seen that it is quite good at capturing seasonalities, but be sure to include the relevant features. One way to model seasonality in the `MLForecast` library is by using the `differences` argument (https://nixtla.github.io/mlforecast/forecast.html).
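A minimal pure-Python sketch of the flat-line behavior described above, using simple exponential smoothing (the trendless, seasonality-free special case of ETS). The series and the smoothing parameter `alpha` here are illustrative, not fitted by any library:

```python
def ses_forecast(y, alpha, h):
    """Simple exponential smoothing: the level is a weighted average of
    past observations; with no trend or seasonal component, every
    h-step-ahead forecast equals that final level."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    # The forecast is flat: the same level repeated h times.
    return [level] * h

series = [10.0, 12.0, 11.0, 13.0, 12.0, 11.5]
print(ses_forecast(series, alpha=0.5, h=4))
# [11.75, 11.75, 11.75, 11.75] -- a flat line, as expected
```

This is why identical values for every forecasting step are not necessarily a bug: without a trend or seasonal component, there is nothing in the model to make the horizon steps differ.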

Andrei Tulbure

11/28/2022, 10:36 AM

Darius Marginean

11/28/2022, 10:56 AM
Hello all! Do you happen to know if it's a problem for ETS if 5 rows have 0 as the value for y?

My dataset contains 260 inputs, of which 4 or 5 contain 0 as y.

And when I've tried training the ETS model with `season_length=24` (as **@fede (nixtla) (they/them)** mentioned above) I get the following error: ValueError: Multiplicative seasonality is not appropriate for zero and negative values

However, if `season_length` is disabled I don't receive this error and the training runs end to end (with almost the same predictions as **@Andrei Tulbure** posted above, and by almost the same predictions I mean the same value for all the predictions)

Also **@fede (nixtla) (they/them)**, I don't quite get what this `differences` parameter is. Can you please provide more details about how I should set it? Thank you in advance 🙂


Andrei Tulbure

11/28/2022, 12:31 PM

fede (nixtla) (they/them)

11/28/2022, 5:54 PM
hey **@Darius Marginean**! ETS had a problem handling data with zeros. It is fixed in the latest version of StatsForecast. You can install it using `pip install -U statsforecast` or `pip install "statsforecast>=1.3.2"`. Please let me know if the problem persists after changing the version. 🙌

✅ 1

Yes, precisely **@Andrei Tulbure**! With the `differences` argument, you can make the time series stationary; it is similar to the integrated component of an ARIMA model. In the context of mlforecast, it allows considering trend and seasonality, **@Darius Marginean**.
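A small pure-Python illustration of the idea (with made-up toy numbers): differencing removes a trend, and the integration step adds it back, which is what the "I" in ARIMA does:

```python
def difference(y, d=1):
    """Differencing: y'[t] = y[t] - y[t - d]."""
    return [y[t] - y[t - d] for t in range(d, len(y))]

def integrate(last_value, diffs):
    """Undo differencing: cumulatively add the forecast differences
    back on top of the last observed value."""
    out = []
    level = last_value
    for delta in diffs:
        level += delta
        out.append(level)
    return out

# A trending series: after differencing it is stationary (constant here).
y = [10, 12, 14, 16, 18]
print(difference(y))                 # [2, 2, 2, 2]
# Forecast the (stationary) differences, then integrate back:
future_diffs = [2, 2]                # e.g. a naive forecast of the differences
print(integrate(y[-1], future_diffs))  # [20, 22]
```

The model is trained on the differenced series, and the library reverses the transformation at predict time, so the trend reappears in the final forecast.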

Darius Marginean

11/29/2022, 9:54 AM
Thanks **@fede (nixtla) (they/them)**! Indeed the problem with ETS was because I'd used statsforecast 1.3.1; after upgrading to 1.3.2 the problem didn't persist anymore!

❤️ 1

Could you provide best practices for setting `differences`? On the dataset **@Andrei Tulbure** sent here, for example? Thank you in advance once again, **@fede (nixtla) (they/them)**!

And also, do you happen to know what I'm missing in implementing the LGBMClassifier model for a classification task? I want to train the LGBMClassifier to predict binary classes (0 or 1), and the predictions I get are always the same (all 1s or all 0s).

```
model = lgb.LGBMClassifier
models_ml = model(**best_hp)
ml_binary_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    # differences=[1],
)
ml_binary_fcst.fit(
    train_binary_df, id_col="unique_id", time_col="ds", target_col="y"
)
pred_ml_binary_df = ml_binary_fcst.predict(weeks_to_forecast)
```

I tried setting the `differences` param like this:

`differences = [1]`

and here is the output:

And that's how I implemented the model:

```
model = lgb.LGBMRegressor
models_ml = model(**best_hp)
ml_fcst = Forecast(
    models=models_ml,
    freq="W-MON",
    differences=[1],
    lags=[i for i in range(1, weeks_to_forecast + 1)],
)
ml_fcst.fit(train_df, id_col="unique_id", time_col="ds", target_col="y")
pred_ml_df = ml_fcst.predict(weeks_to_forecast)
```

where `best_hp` are the best params Optuna found using Bayesian search (TPESampler) after a certain number of trials. Any idea why the results are this bad? Am I missing something?


fede (nixtla) (they/them)

12/01/2022, 7:42 PM
hey **@Darius Marginean**! Maybe your data don't exhibit a clear trend. In that case, omitting `differences` might be a good option. Also, since you're working with weekly data, you could pass `differences=[52]` to capture yearly seasonality.

🙏 1
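A pure-Python sketch of what a seasonal difference like `differences=[52]` does conceptually. The toy series below uses a period of 4 instead of 52 to keep it short; subtracting the value from one season earlier removes the repeating pattern:

```python
def seasonal_difference(y, season_length):
    """Seasonal differencing: y'[t] = y[t] - y[t - season_length]."""
    return [y[t] - y[t - season_length] for t in range(season_length, len(y))]

# A purely seasonal series with period 4 (a stand-in for 52 weekly periods):
y = [1, 5, 2, 8] * 3
print(seasonal_difference(y, season_length=4))
# [0, 0, 0, 0, 0, 0, 0, 0] -- the seasonal pattern is gone
```

After this transformation the model only has to learn the deviations from last year's pattern, which is typically much easier than learning the raw seasonal shape.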
