Evan Miller
08/30/2023, 6:05 PM
José Morales
08/30/2023, 6:33 PM
1. You can delete the following attributes of MLForecast.ts before saving: ga, _ga, static_features_, uids, last_dates, restore_idxs. The fitted models are saved in the MLForecast.models_ attribute, so something like MLForecast.models_['LGBMRegressor'].booster_ would give you the trained booster object.
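For example, a minimal sketch of pulling out and saving just the trained booster after fitting (df, the daily frequency and the lags are placeholders; the 'LGBMRegressor' key matches the class name of the model you passed):

import lightgbm as lgb
from mlforecast import MLForecast

# df is assumed to be a long-format panel with unique_id, ds and y columns
fcst = MLForecast(models=[lgb.LGBMRegressor()], freq='D', lags=[1, 7])
fcst.fit(df)

# fitted models live in the models_ dict, keyed by each model's class name
booster = fcst.models_['LGBMRegressor'].booster_
booster.save_model('lgbm_booster.txt')  # persist only the trained booster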
2. The assumption is that the training series are complete, so if you have a gap the lags, etc. will be wrong. You may find the fill_gaps function useful: it will produce the full panel and you can then fill it with any method you want.
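A sketch of that workflow, assuming fill_gaps is importable from utilsforecast.preprocessing (the exact location can vary by version) and forward-filling as the chosen fill method:

import pandas as pd
from utilsforecast.preprocessing import fill_gaps  # import path assumed

# a series with a missing day between Jan 2 and Jan 4
df = pd.DataFrame({
    'unique_id': ['a', 'a', 'a'],
    'ds': pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-04']),
    'y': [1.0, 2.0, 4.0],
})

# build the full panel; missing timestamps show up with NaN targets
full = fill_gaps(df, freq='D', id_col='unique_id', time_col='ds')

# fill however you prefer, e.g. forward-fill within each series
full['y'] = full.groupby('unique_id')['y'].ffill()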
3. All forecasts start from the end of each series, because the assumption is that those are the latest values you have and you want to forecast ahead. If you've seen new values of a series, you can use the update method as described here.
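A sketch of incorporating new observations before forecasting (whether update lives on the MLForecast object itself or on its ts attribute depends on your version; new_values and its columns are illustrative):

import pandas as pd

# observations that arrived after the forecaster was fitted
new_values = pd.DataFrame({
    'unique_id': ['a'],
    'ds': [pd.Timestamp('2023-01-05')],
    'y': [5.0],
})

# append them to the stored state so forecasts start after these dates
fcst.ts.update(new_values)
preds = fcst.predict(7)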
4. Not yet; this is something we have on our roadmap, but at the moment you have to do it manually. This issue has a good way of doing it right now.
5. There isn't any hidden processing; we prefer not to give surprises, so you have to be explicit about what you want to happen. If you want to perform scaling, for example, you can use target_transforms (guide).
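For example, a sketch using mlforecast's built-in target transforms to difference and scale each series before the features are computed (the transforms are inverted automatically at predict time; the specific transforms, frequency and lags here are just for illustration):

import lightgbm as lgb
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences, LocalStandardScaler

fcst = MLForecast(
    models=[lgb.LGBMRegressor()],
    freq='D',
    lags=[1, 7],
    target_transforms=[Differences([1]), LocalStandardScaler()],
)
fcst.fit(df)
preds = fcst.predict(7)  # predictions come back on the original scale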
6. The column identifiers are mainly used to perform the feature engineering. The id and dates aren't passed to LightGBM, but you can pass the id if you specify it in the static_features argument, and if your dates are integers you can specify a date_feature that is just the identity. If you're wondering in which order the features are going to be passed, you can access the MLForecast.ts.features_order_ attribute after calling fit. You can also perform just the feature engineering with preprocess, then manually train the model in any way you like and then assign it to the MLForecast.models_ attribute or use the MLForecast.ts.predict method like here.
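As a sketch of that manual route (assuming features_order_ is populated by preprocess as well as fit, and that the key you assign matches the model's class name):

import lightgbm as lgb
from mlforecast import MLForecast

fcst = MLForecast(models=[], freq='D', lags=[1, 7])

# feature engineering only; returns unique_id, ds, y plus the feature columns
prep = fcst.preprocess(df)

# select the features in the same order the forecaster uses at predict time
X = prep[fcst.ts.features_order_]
y = prep['y']

# train however you like, then hand the fitted model back to the forecaster
model = lgb.LGBMRegressor().fit(X, y)
fcst.models_ = {'LGBMRegressor': model}
preds = fcst.predict(7)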
7. They serve the same purpose. The dynamic_dfs argument is legacy and will be removed soon; it was meant to save memory by not having repeated values in a single dataframe, but X_df is easier to reason about and makes the predict step faster.
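A sketch of passing the future values of the dynamic features through X_df (the price column, ids and dates are made up for illustration):

import pandas as pd

# one row per unique_id and date in the forecast horizon
future_exog = pd.DataFrame({
    'unique_id': ['a'] * 7,
    'ds': pd.date_range('2023-01-06', periods=7, freq='D'),
    'price': [10.0] * 7,
})
preds = fcst.predict(7, X_df=future_exog)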
Evan Miller
08/30/2023, 6:44 PM
José Morales
08/30/2023, 6:46 PM
Evan Miller
08/30/2023, 7:00 PM
name: for column name, e.g. categorical_feature=name:c1,c2,c3 means c1, c2 and c3 are categorical features
How does this look in python? Should I have something like:
lgb_params = {
    'categorical_feature': 'names: "col1", "col2"'
}
José Morales
08/30/2023, 7:05 PM
In python you'd pass a list, e.g. categorical_feature=['col1', 'col2']. If you leave it as 'auto' (the default), it will use the data types of the columns, and if they're categorical it will set them automatically.
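A minimal sketch of the 'auto' route inside mlforecast (col1 and col2 are placeholder static columns in the training dataframe):

import lightgbm as lgb
from mlforecast import MLForecast

# cast to pandas category dtype so LightGBM's default
# categorical_feature='auto' detects them from the dtypes
df['col1'] = df['col1'].astype('category')
df['col2'] = df['col2'].astype('category')

fcst = MLForecast(models=[lgb.LGBMRegressor()], freq='D', lags=[1, 7])
fcst.fit(df, static_features=['col1', 'col2'])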