#mlforecast

Jason Gofford

10/02/2023, 4:32 PM
When dealing with multiple time series, is the approach taken by mlforecast equivalent to fitting multiple independent time series, as described here? Is the one-hot encoding implicit, or does/should it be handled explicitly by the user?

José Morales

10/02/2023, 4:45 PM
Yes, it's the multiple independent time series approach, but we only compute the lag features; we don't add the one-hot encoding. You can achieve that by using the id as a static feature and then applying a OneHotEncoder in your model, e.g.
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

from mlforecast import MLForecast

# one-hot encode the series id, pass the remaining features through
ohe = ColumnTransformer(
    [('unique_id', OneHotEncoder(sparse_output=False), ['unique_id'])],
    remainder='passthrough',
)
ohe_lr = make_pipeline(ohe, LinearRegression())
fcst = MLForecast(models=ohe_lr, ...)
fcst.fit(..., static_features=['unique_id', ...])
Or, if your model supports categorical features (like LightGBM), you can provide the id as a static feature directly (with the id stored as a categorical dtype).
Keep in mind that one-hot encoding can use a lot of memory if you have many series.
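A back-of-envelope calculation of that memory cost, using made-up but plausible numbers (the series count and lengths are assumptions, not from the thread):

```python
# dense one-hot output: one column per series, one row per observation
n_series = 5_000          # hypothetical number of series
rows_per_series = 1_000   # hypothetical length of each series
bytes_per_value = 8       # float64, as produced by sparse_output=False

n_rows = n_series * rows_per_series
ohe_bytes = n_rows * n_series * bytes_per_value
print(f"{ohe_bytes / 1e9:.0f} GB")  # prints "200 GB" for the dense matrix alone
```

This is why the native categorical support in models like LightGBM is often preferable at that scale.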

Jason Gofford

10/02/2023, 4:53 PM
Yeah, of course. I have a few thousand series of varying lengths and dynamics at the moment, so it's not massive data. I'll have a play and see what sort of improvement I get (if any).