Hello everyone, I’ve been working on TS for many years and i | Nixtla Community #mlforecast

Umar Mohamed

01/15/2025, 9:20 AM

Hello everyone, I’ve been working on TS for many years, and I was thinking about weighting those years. Fortunately, I found that the weight_col parameter exists, but I got an error when using it with an sklearn Pipeline. Here’s some sample code:

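import lightgbm as lgb
from mlforecast import MLForecast
from mlforecast.lag_transforms import RollingMean
from mlforecast.target_transforms import Differences
from sklearn.pipeline import Pipeline
from statsforecast import StatsForecast

pipeline = []
model = lgb.LGBMRegressor()

pipeline.append(("regressor", model))
pipe = Pipeline(pipeline)

sf = StatsForecast(models=[], freq="D", n_jobs=-1)  # not used below
model = MLForecast(models={"pipe": pipe},
                  freq='D',
                  lags=list(range(1, 21)),
                  lag_transforms={3: [RollingMean(window_size=7), RollingMean(window_size=14)],
                                  7: [RollingMean(window_size=7), RollingMean(window_size=14)]},
                  target_transforms=[Differences([0])]
                  )
# sample_set: the training DataFrame, with the weights in its "weight" column
model.fit(sample_set, static_features=[], weight_col="weight")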

I got this error:

Pipeline.fit does not accept the sample_weight parameter. You can pass parameters to specific steps of your pipeline using the stepname__parameter format, e.g. Pipeline.fit(X, y, logisticregression__sample_weight=sample_weight).

I spent a fair amount of time trying to find the problem, but it looks like a bug in the core code for a case that wasn’t planned for. Do you have any idea?
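For reference, the stepname__parameter format that the error message mentions looks like this when Pipeline.fit is called directly. This is only a minimal sketch: the data is made up, LinearRegression stands in for LGBMRegressor, and the step name "regressor" is taken from the sample above. It does not show how MLForecast passes the weights internally.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Made-up data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
weights = np.linspace(0.1, 1.0, 50)  # e.g. heavier weights for more recent rows

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("regressor", LinearRegression()),  # stand-in for lgb.LGBMRegressor()
])

# Route the weights to the step named "regressor" using stepname__parameter.
pipe.fit(X, y, regressor__sample_weight=weights)
predictions = pipe.predict(X)
```

The prefix before the double underscore has to match the name given to the step when the Pipeline was built.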

Umar Mohamed

01/15/2025, 1:44 PM

I found a solution, here it is.

from sklearn.pipeline import Pipeline


class SampleWeightPipeline(Pipeline):
    """Pipeline whose fit accepts `sample_weight` and forwards it to the final estimator."""

    def _fit_transformers(self, X, y=None):
        # Fit every step except the last one, passing the transformed data along.
        for name, transform in self.steps[:-1]:
            if transform is None or transform == "passthrough":
                continue
            if hasattr(transform, "fit_transform"):
                X = transform.fit_transform(X, y)
            else:
                X = transform.fit(X, y).transform(X)
        return X

    def fit(self, X, y=None, **fit_params):
        """
        Fit the model with `sample_weight` support.

        Parameters:
        - X: Input features
        - y: Target variable
        - fit_params: Additional parameters, including `sample_weight`,
          passed unchanged to the final estimator
        """
        X = self._fit_transformers(X, y)
        self.steps[-1][1].fit(X, y, **fit_params)
        return self

    def fit_transform(self, X, y=None, **fit_params):
        """
        Fit the model with `sample_weight` support and return the data
        as transformed by the steps before the final estimator.

        Parameters:
        - X: Input features
        - y: Target variable
        - fit_params: Additional parameters, including `sample_weight`,
          passed unchanged to the final estimator
        """
        X = self._fit_transformers(X, y)
        self.steps[-1][1].fit(X, y, **fit_params)
        return X

Basically, the problem is that sklearn’s default Pipeline.fit doesn’t accept a sample_weight argument directly, so you have to create a custom one.
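A shorter variant of the same idea might be to keep sklearn's own fitting logic and only rename the keyword before delegating to Pipeline.fit. The sketch below is an assumption rather than a tested MLForecast recipe: the class name RoutedWeightPipeline is hypothetical, the data is made up, LinearRegression stands in for LGBMRegressor, and it assumes the caller passes the weights as a plain sample_weight keyword, as the error above suggests.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class RoutedWeightPipeline(Pipeline):  # hypothetical name, not part of sklearn
    def fit(self, X, y=None, sample_weight=None, **fit_params):
        # Translate sample_weight into <last step name>__sample_weight,
        # which is the form Pipeline.fit accepts.
        if sample_weight is not None:
            final_step_name = self.steps[-1][0]
            fit_params[f"{final_step_name}__sample_weight"] = sample_weight
        return super().fit(X, y, **fit_params)


# Made-up data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
weights = np.linspace(0.1, 1.0, 50)

steps = [("scaler", StandardScaler()), ("regressor", LinearRegression())]
routed = RoutedWeightPipeline(steps).fit(X, y, sample_weight=weights)

# Reference: a plain Pipeline with the weights routed by hand.
reference_steps = [("scaler", StandardScaler()), ("regressor", LinearRegression())]
reference = Pipeline(reference_steps).fit(X, y, regressor__sample_weight=weights)
```

Because the fitting itself is left to Pipeline.fit, the intermediate steps behave as they would in a plain Pipeline, and only the keyword name changes.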

jan rathfelder

02/28/2025, 10:53 PM

perfect, exactly what i needed. thx!
