# mlforecast
When dealing with multiple time series, is the approach taken by mlforecast equivalent to fitting multiple independent time series, as described here? Is the one-hot encoding implicit, or does/should it be handled explicitly by the user?
Yes, it's the multiple independent time series approach, but we only compute the lag features; we don't add the one-hot encoding. You can achieve that by keeping the id as a static feature and applying a `OneHotEncoder` inside your model, e.g.
```python
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

from mlforecast import MLForecast

# one-hot encode the series id, pass all other features through unchanged
ohe = ColumnTransformer(
    [('ohe', OneHotEncoder(sparse_output=False), ['unique_id'])],
    remainder='passthrough',
)
ohe_lr = make_pipeline(ohe, LinearRegression())
fcst = MLForecast(models=ohe_lr, ...)
fcst.fit(..., static_features=['unique_id', ...])
```
Or, if your model supports categorical features (like LightGBM), you can provide the id as a static feature directly (with the id stored as a categorical dtype).
Keep in mind that one-hot encoding can use a lot of memory if you have many series.
Yeah, of course. I have a few thousand series of varying lengths and dynamics at the moment, so it's not massive data. I'll have a play and see what sort of improvement I get (if any).