Slackbot
02/09/2024, 3:23 PM
José Morales
02/09/2024, 4:23 PM
from mlforecast import MLForecast
from sklearn.linear_model import LinearRegression
mlf = MLForecast(models=[LinearRegression()], freq='MS', lags=[1, 2, 3])
mlf.cross_validation(df, h=18, n_windows=3)
I suspect you have gaps in your series, so that error is raised when CV is performed to get the prediction intervals.
Naren Castellon
02/09/2024, 4:46 PM
Cross validation result produced less results than expected. Please verify that the frequency set on the MLForecast constructor matches your series' and that there aren't any missing periods.
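Note: the two checks that this error message asks for can be sketched with plain pandas. This is only an illustration on hypothetical toy data, assuming the usual long format with unique_id, ds and y columns; the helper name missing_periods is made up for this sketch and is not part of mlforecast or utilsforecast.

```python
import pandas as pd

# Hypothetical toy data in long format (unique_id, ds, y).
# Series "a" is missing 2020-03-01; series "b" is complete.
df = pd.DataFrame({
    "unique_id": ["a", "a", "a", "b", "b", "b"],
    "ds": pd.to_datetime([
        "2020-01-01", "2020-02-01", "2020-04-01",
        "2020-01-01", "2020-02-01", "2020-03-01",
    ]),
    "y": [1.0, 2.0, 4.0, 10.0, 11.0, 12.0],
})


def missing_periods(df, freq="MS"):
    """Per series, list the timestamps absent between its first and last date."""
    out = {}
    for uid, grp in df.groupby("unique_id"):
        # Every timestamp the series should contain at the given frequency
        expected = pd.date_range(grp["ds"].min(), grp["ds"].max(), freq=freq)
        out[uid] = expected.difference(pd.DatetimeIndex(grp["ds"]))
    return out


gaps = missing_periods(df)
for uid, missing in gaps.items():
    print(uid, [ts.strftime("%Y-%m-%d") for ts in missing])
```

If almost every expected timestamp is reported as missing, the dates are probably not month starts, which would suggest that freq='MS' does not match the data rather than that individual periods are absent.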
José Morales
02/09/2024, 4:47 PM
from utilsforecast.preprocessing import fill_gaps
filled = fill_gaps(df, freq='MS', start='per_serie', end='per_serie')
print(df.shape[0], filled.shape[0])
If those two numbers are different, then you have gaps in your series.
Naren Castellon
02/09/2024, 4:55 PM
Using fill_gaps(df, freq='MS') to see the null values and imputing them separately, the model works well. The idea is to apply make_pipeline with SimpleImputer and see if it does the job correctly.
José Morales
02/09/2024, 4:56 PM
Naren Castellon
02/09/2024, 6:09 PM
filled = fill_gaps(df, freq='MS', start='per_serie', end='per_serie')
the missing data, and when I want to see the processing with mlf.preprocess(df) it sends me the error:
ValueError: y column contains null values.
MLForecast and StatsForecast do not allow null values in the target before fit() is called, so passing such data to the model raises an error.
José Morales
02/09/2024, 6:12 PM
Naren Castellon
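02/09/2024, 6:21 PM
# StackingRegressor expects a list of (name, estimator) tuples; 'rf' is an arbitrary label
estimators = [('rf', make_pipeline(linear_preprocessor, RandomForestRegressor(random_state=42)))]
stacking_regressor = StackingRegressor(estimators=estimators, final_estimator=RidgeCV())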
By the time you get here, which is the last step in building the model, there should no longer be any null data.
mlf = MLForecast(
    models=stacking_regressor,
    freq='MS',
    lags=[1, 2, 3],
    lag_transforms={1: [expanding_mean], 7: [(rolling_mean, 7)]},
    # target_transforms=[Differences([1]), LocalStandardScaler()],
    date_features=["year", "month", "day"],
    num_threads=2,
)
Now, if the problem of null data is not resolved here, when I apply mlf.fit(df, fitted) it will send me the error:
ValueError: y column contains null values.
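Note: the error refers to the target column y. An imputer placed inside the scikit-learn pipeline only sees the feature matrix, so it would most likely not fix this. One option is to impute y in the dataframe before it is handed to MLForecast. A minimal pandas sketch on hypothetical toy data follows; linear interpolation is an arbitrary choice for illustration and was not suggested in the thread.

```python
import pandas as pd

# Hypothetical result of a gap-filling step:
# the row for 2020-03-01 was inserted with a null target.
filled = pd.DataFrame({
    "unique_id": ["a"] * 4,
    "ds": pd.date_range("2020-01-01", periods=4, freq="MS"),
    "y": [1.0, 2.0, None, 4.0],
})

# Impute the target separately for each series: interpolate interior gaps,
# then forward/backward fill any nulls left at the edges.
filled = filled.sort_values(["unique_id", "ds"])
filled["y"] = filled.groupby("unique_id")["y"].transform(
    lambda s: s.interpolate(method="linear").ffill().bfill()
)

# The target must be free of nulls before it reaches the forecaster
assert not filled["y"].isna().any()
print(filled)
```

Imputing within each series keeps values from one series from leaking into another.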
José Morales
02/09/2024, 6:26 PM
José Morales
02/09/2024, 6:26 PM