# mlforecast
m
Can the pipeline preprocess the exogenous inputs as well as it preprocesses the 'y' input? For example, target transforms work beautifully:
fcst = MLForecast(
    models=models,
    freq='15T',
    target_transforms=[LocalStandardScaler()],
    # lags=np.arange(l, l + n).tolist() + np.arange(l + 94, l + n + 94).tolist(),
    # lag_transforms is a dictionary where the key is the lag and the value is
    # a list of transformations to apply to that lag.
    # Here both expanding_mean and rolling_mean (window of 192) are applied to lag l.
    # They share one list because repeating the key l in a dict literal would
    # silently keep only the last entry.
    lag_transforms={
        l: [expanding_mean, (rolling_mean, 192)],
    },
    num_threads=-1,
)
Is there an example or tutorial in the docs on combining this with, for example, PCA for the exogenous X inputs? Thanks :)
Perhaps I can figure it out with a class:
class GlobalSklearnTransformer(BaseTargetTransform): ...
j
That class also applies the transformation to the target only. There is an issue tracking transformations on the features, and we'll work on it soon. At the moment you'd have to do it manually, i.e. apply PCA to your features before calling MLForecast.fit or MLForecast.cross_validation.
m
Yes, that is fine. It is tricky to combine with proper cross-validation, though: fitting the PCA on the complete dataset can leak information from the validation windows into training.
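One way to take the manual route without that leakage is to fit the PCA on each training window only and reuse the fitted object on the matching validation window. The following is a minimal sketch on synthetic data, not an mlforecast feature: the helper `pca_per_fold`, the column names `exog1` and `exog2`, and the 48-row validation window are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA


def pca_per_fold(train, valid, exog_cols, n_components=1):
    """Hypothetical helper: fit PCA on the training window, transform both windows."""
    pca = PCA(n_components=n_components)
    pca.fit(train[exog_cols])  # validation rows never influence the fit
    names = [f"pc{i + 1}" for i in range(n_components)]

    def apply(df):
        comps = pd.DataFrame(
            pca.transform(df[exog_cols]), columns=names, index=df.index
        )
        # Replace the raw exogenous columns with their principal components.
        return pd.concat([df.drop(columns=exog_cols), comps], axis=1)

    return apply(train), apply(valid), pca


# Synthetic single-series data in the long format (unique_id, ds, y, features).
rng = np.random.default_rng(0)
n = 200
exog1 = rng.normal(size=n)
df = pd.DataFrame({
    "unique_id": "series_1",
    "ds": pd.date_range("2024-01-01", periods=n, freq="15min"),
    "y": rng.normal(size=n),
    "exog1": exog1,
    "exog2": 0.5 * exog1 + rng.normal(scale=0.1, size=n),
})

# Time-ordered split: the last 48 rows play the role of one validation window.
train, valid = df.iloc[:-48], df.iloc[-48:]
train_pca, valid_pca, pca = pca_per_fold(train, valid, ["exog1", "exog2"])
```

Repeating this for every validation window keeps each fold's PCA blind to future rows, at the cost of running the splits by hand.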
j
I think you can use a pipeline for this, e.g.
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

pca = ColumnTransformer([('pca', PCA(), ['exog1', 'exog2'])], remainder='passthrough')
# your_model is a placeholder for any scikit-learn compatible regressor
model = make_pipeline(pca, your_model)
fcst = MLForecast(models=[model], ...)
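Because the PCA sits inside the model, it is refit on whatever rows the model is trained on, so a model that is refit per cross-validation window gets a PCA that only saw that window's training data. The sketch below checks this with scikit-learn alone on synthetic data. `LinearRegression` stands in for `your_model`, and the `lag1` column is a hypothetical stand-in for the lag features mlforecast would build; neither comes from the thread.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Synthetic feature frame: one lag-like column plus two exogenous columns.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "lag1": rng.normal(size=n),
    "exog1": rng.normal(size=n),
    "exog2": rng.normal(size=n),
})
y = X["lag1"] + X["exog1"] - X["exog2"] + rng.normal(scale=0.1, size=n)

# PCA touches only the exogenous columns; everything else passes through.
pca = ColumnTransformer(
    [("pca", PCA(n_components=1), ["exog1", "exog2"])],
    remainder="passthrough",
)
model = make_pipeline(pca, LinearRegression())

# Time-ordered split standing in for one cross-validation window.
X_train, y_train = X.iloc[:150], y.iloc[:150]
X_valid = X.iloc[150:]

model.fit(X_train, y_train)      # fits PCA and the regressor on training rows only
preds = model.predict(X_valid)   # reuses the training-time PCA on validation rows

# The fitted PCA can be inspected to confirm which rows it was fit on.
column_step = model.named_steps["columntransformer"]
fitted_pca = column_step.named_transformers_["pca"]
```

Selecting columns by name in `ColumnTransformer` requires the features to arrive as a pandas DataFrame, which is what the snippet above relies on as well.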