Brian Head
09/14/2023, 6:31 PM
MLForecast function using it with cross_validation. No issues with running it normally with a pandas dataframe, but when I try to run it with a Spark dataframe (much like I was able to do with statsforecast), I get the following error: "RecursionError: maximum recursion depth exceeded in comparison". Haven't been able to find anything super helpful via Google/Stack Overflow, ChatGPT, or this Slack channel.
I've tried all of the algos in the code below and also limiting it to just one (several iterations).
Here's the code:
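# imports assumed from the names used below (not shown in the original message)
import lightgbm as lgb
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences
from numba import njit
from window_ops.expanding import expanding_mean
from window_ops.rolling import rolling_mean

@njit
def rolling_mean_12(x):
    return rolling_mean(x, window_size=12)

@njit
def rolling_mean_24(x):
    return rolling_mean(x, window_size=24)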
******
ML_models = [
    lgb.LGBMRegressor(n_jobs=4, random_state=0, verbosity=-1),
    # xgb.XGBRegressor(n_jobs=4, random_state=0),
    # MLPRegressor(random_state=0, max_iter=1000, early_stopping=True, n_iter_no_change=10, tol=1e-4),
    # RandomForestRegressor(random_state=0),
    # ExtraTreesRegressor(random_state=0),
    # HistGradientBoostingRegressor(random_state=0),
    # KNeighborsRegressor()
]
mlf = MLForecast(
    models=ML_models,
    freq='M',  # pandas month-end frequency (monthly series)
    lags=[1, 12],
    # lags=range(1, 6, 1),
    lag_transforms={
        1: [expanding_mean],
        12: [rolling_mean_12],
        # 24: [rolling_mean_24],
    },
    # date_features=['year', 'month', 'quarter', 'days_in_month'],
    target_transforms=[Differences([1, 24])],
)
# cross-validation of the MLForecast models using Spark
ML_crossvalidation_SDF = mlf.cross_validation(
    data=sdf,
    window_size=3,
    n_windows=5,
).toPandas()
******
Note that the below does work with regular pandas dataframe:
ML_crossvalidation_df = mlf.cross_validation(
    data=df2,
    window_size=3,
    n_windows=5,
)
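For reference, here is a minimal sketch of how n_windows=5 and window_size=3 carve a series into train/validation splits. It is only an illustration of the usual backtesting scheme, not mlforecast's internal code: cv_windows is a hypothetical helper, the 60-observation series length is made up, and it assumes the step between windows equals window_size so the validation windows do not overlap.

```python
def cv_windows(n_obs, n_windows, window_size, step_size=None):
    """Return (train_end, valid_end) position pairs, oldest window first.

    Training rows are positions [0, train_end); validation rows are
    [train_end, valid_end). Hypothetical helper for illustration only.
    """
    if step_size is None:
        # assumption: consecutive windows are shifted by one full window
        step_size = window_size
    windows = []
    for i in range(n_windows):
        # distance from the end of the series to the start of this window
        offset = window_size + (n_windows - 1 - i) * step_size
        train_end = n_obs - offset
        if train_end <= 0:
            raise ValueError("series too short for the requested windows")
        windows.append((train_end, train_end + window_size))
    return windows


# 60 monthly observations is an arbitrary example length
splits = cv_windows(n_obs=60, n_windows=5, window_size=3)
for train_end, valid_end in splits:
    print(f"train on [0, {train_end}), validate on [{train_end}, {valid_end})")
```

Under these assumptions the five validation windows cover the last 15 observations, so each series needs noticeably more history than that once the lags and differences above are taken into account.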
Any help appreciated!
José Morales
09/14/2023, 6:36 PM
from mlforecast.distributed import DistributedMLForecast
). Please let us know if you run into any issues
Brian Head
09/14/2023, 7:07 PM