# statsforecast
y
hello all! For AutoARIMA training, would reducing `nmodels` from 5 to 4 significantly impact training time? We're training AutoARIMA on 8,000-12,000 time series using the AutoARIMA instance specified below, but it takes a very long time, and for some instances we saw this refitting (while ARIMA is almost instantaneous)
auto_arima_model = [AutoARIMA(season_length=7, nmodels=5, trace=True)]
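One way to answer the `nmodels` question empirically is to time the fit on a small sample of series for each setting and extrapolate to the full set. The sketch below is a minimal, library-agnostic illustration using only the standard library: `time_per_series`, `extrapolate_hours`, and the stand-in `fake_fit` are hypothetical names, and the linear extrapolation assumes every series costs about the same and that work splits evenly across workers, which is only a rough approximation.

```python
import time
from typing import Any, Callable, Sequence


def time_per_series(fit_one: Callable[[Any], Any], sample: Sequence[Any]) -> float:
    """Average wall-clock seconds that fit_one takes per item in sample."""
    if not sample:
        raise ValueError("sample must not be empty")
    start = time.perf_counter()
    for series in sample:
        fit_one(series)
    return (time.perf_counter() - start) / len(sample)


def extrapolate_hours(seconds_per_series: float, n_series: int, n_workers: int) -> float:
    """Rough total hours, assuming equal cost per series and perfect parallelism."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return seconds_per_series * n_series / n_workers / 3600


def fake_fit(series: Sequence[float]) -> float:
    """Stand-in for a real model fit; replace with the actual fitting call."""
    return sum(series) / len(series)


sample = [[1.0, 2.0, 3.0]] * 50  # hypothetical sample of 50 short series
per_series = time_per_series(fake_fit, sample)
print(f"estimated hours for 10,000 series on 8 workers: "
      f"{extrapolate_hours(per_series, 10_000, 8):.6f}")
```

Running this once with `nmodels=5` and once with `nmodels=4` in the real fit function would show whether the difference is worth pursuing before committing to a full run.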
j
maybe also try to set n_jobs=-1 for parallel jobs, not sure what the standard setting is here?
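from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, AutoETS, AutoCES, AutoTheta

sf = StatsForecast(
    models=[
        AutoARIMA(season_length=52),
        AutoETS(season_length=52),
        AutoCES(season_length=52),
        AutoTheta(season_length=52),
    ],
    freq="W-MON",
    n_jobs=-1,  # use all available cores
)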
y
@jan rathfelder thank you! we already set it to n_jobs = -1, here's our setup
statsforecast = StatsForecast(
    models=models,
    freq="D",
    fallback_model=SeasonalNaive(season_length=7),
    n_jobs=-1,
)
j
then i am out of ideas 🙂 the only thought i have is that training several thousand models is just costly in general, even if it is a simple model like arima. why not train one global model on all the series? could be faster and more accurate
b
Are you in an environment where spark is an option? I’m still working out how to optimize to work with multiple models, but I’ve got it working with autoarima nmodels=10 on ~10k series and it runs in about 20 minutes.
y
Hi @Brian Head Yes, I'm using pyspark already. Yours runs really fast, I'm not sure why mine runs so slow. What's your cluster configuration if you don't mind? Like how many worker nodes and the memory/CPU stats?
b
Right now I'm using 100 executors and 300 repartitions. I have read that you want 128 MB per worker and about 3-4 partitions per worker. However, this didn't work out well for me. It took some playing around to find that 100/300 worked. I'd also note that this is when demand (I share resources with a team) is low. It fails when others are working too, so I've ended up running this in the evenings/weekends to get it to work properly.
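For reference, the 100/300 setup above works out to 3 partitions per executor, which sits at the low end of the 3-4 rule of thumb mentioned. The arithmetic can be sketched in plain Python as below; `suggest_partitions` and `series_per_partition` are hypothetical helper names, not part of Spark or statsforecast, and the result is only a starting point to tune from, since the thread notes the rule of thumb did not work out of the box.

```python
import math


def suggest_partitions(n_executors: int, partitions_per_executor: int = 3) -> int:
    """Partition count as executors times partitions per executor."""
    if n_executors < 1 or partitions_per_executor < 1:
        raise ValueError("both arguments must be at least 1")
    return n_executors * partitions_per_executor


def series_per_partition(n_series: int, n_partitions: int) -> int:
    """Upper bound on series per partition if series are spread evenly."""
    if n_partitions < 1:
        raise ValueError("n_partitions must be at least 1")
    return math.ceil(n_series / n_partitions)


n_partitions = suggest_partitions(100, 3)
print(n_partitions)                                 # 300
print(series_per_partition(10_000, n_partitions))   # 34
```

With roughly 10k series, that is about 34 series per partition, so each task fits only a few dozen models before returning.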