Isaac (06/27/2023, 3:59 PM)
When I run AutoARIMA, I get a lot of warnings like `UserWarning: Having 3 or more differencing operations is not recommended. Please consider reducing the total number of differences.` and `RuntimeWarning: divide by zero encountered in long_scalars fit["aicc"] = fit["aic"] + 2 * npar * (npar + 1) / (nstar - npar - 1)`. Shouldn't these warnings be suppressed when using the automated version of ARIMA?
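A generic way to quiet these from the calling code, as a minimal sketch: the standard library warnings module can filter on the message text, so unrelated warnings still surface. This is not a statsforecast-specific switch; the patterns below are written against the messages quoted above.
import warnings

# Ignore only the two messages quoted above, not warnings in general.
warnings.filterwarnings("ignore", message=".*differencing operations is not recommended.*", category=UserWarning)
warnings.filterwarnings("ignore", message=".*divide by zero encountered.*", category=RuntimeWarning)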
Alexander (07/19/2023, 7:06 AM)
I'm hitting: Assertion failed: (isInt<33>(Addend) && "Invalid page reloc value."), function encodeAddend, (...)
This seems to be a known (and still open) issue with numba/LLVM on the M1: https://github.com/numba/numba/issues/8567
Do you have any experience with this problem? Or maybe an idea/roadmap for a possible fix? Any help & hints are highly appreciated 🙂 Thanks!
Matej (08/13/2023, 7:51 AM)
From the MSTL paper: "Next, the trend component of the time series is computed using the last iteration of STL. On the other hand, if the time series is non-seasonal, MSTL uses Friedman's Super Smoother function, supsmu, available in R (R Core Team, 2020), to directly estimate the trend."
In Nixtla it seems AutoARIMA is used, and I wonder if this is the preferred setup. Or can I somehow set up the original MSTL as presented in the paper, without the AutoARIMA?
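For what it's worth, MSTL's trend_forecaster argument accepts any non-seasonal statsforecast model, so the AutoARIMA default can be swapped out, although supsmu itself is not shipped in statsforecast. A sketch, with AutoETS as an illustrative stand-in rather than the paper's supsmu:
from statsforecast import StatsForecast
from statsforecast.models import MSTL, AutoETS

models = [MSTL(
    season_length=[24, 24 * 7],
    trend_forecaster=AutoETS(model='ZZN')  # 'ZZN': non-seasonal ETS on the deseasonalized series
)]
sf = StatsForecast(models=models, freq='H')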
Matej (08/13/2023, 2:37 PM)
from statsforecast import StatsForecast
from statsforecast.models import MSTL, AutoARIMA

models = [MSTL(
    season_length=[24, 24 * 7],   # seasonalities of the time series
    trend_forecaster=AutoARIMA()  # model used to forecast the trend
)]
sf = StatsForecast(models=models, freq='H')
sf.fit(df)  # df: training data with unique_id, ds, y columns
forecasts = sf.predict(h=24, level=[90])
forecasts.head()
I get the following exception when I include the level parameter in the predict method:
Exception: You have to instantiate either the trend forecaster class or MSTL class with `prediction_intervals` to calculate them
(Without the level=[90] parameter everything seems to work fine.)
I have also tried adding conformal intervals:
MSTL(
    season_length=[24, 24 * 7],
    trend_forecaster=AutoARIMA(),
    prediction_intervals=ConformalIntervals(
        h=3,          # forecast horizon
        n_windows=10  # number of CV windows (conformal scores)
    )
)
This also causes the predict method to crash with the following error:
ValueError: operands could not be broadcast together with shapes (1,24) (10,3)
I have statsforecast 1.5.0, btw.
Thanks in advance for your patience.
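One plausible reading of the (1,24) vs (10,3) broadcast error, offered as a guess: the ConformalIntervals horizon (h=3, n_windows=10) does not match the h=24 passed to predict. A sketch with the two aligned, assuming ConformalIntervals lives in statsforecast.utils as in 1.5:
from statsforecast.utils import ConformalIntervals

models = [MSTL(
    season_length=[24, 24 * 7],
    trend_forecaster=AutoARIMA(),
    prediction_intervals=ConformalIntervals(h=24, n_windows=10)  # h matches predict(h=24)
)]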
Matej (08/15/2023, 2:26 PM)
sf.fit(df=Y_df)
For instance, currently MSTL with AutoARIMA is not picking up any exogenous regressors. If the prediction window is only the next step, the AR/MA lags clearly dominate, but my objective is e.g. 48 hours ahead, and there the covariates might prove beneficial.
How can I do this using the statsforecast package?
Also, is there some summary of the fitted model similar to statsmodels? :) Thanks so much
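For reference, the usual statsforecast pattern for exogenous regressors is extra columns in the training frame plus their future values passed as X_df at prediction time; whether MSTL forwards them to its trend forecaster is a separate question. A sketch with plain AutoARIMA, where X_future_df is a hypothetical frame holding the regressors for the next 48 hours:
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

# Y_df: unique_id, ds, y, plus exogenous columns (e.g. temperature)
sf = StatsForecast(models=[AutoARIMA(season_length=24)], freq='H')
sf.fit(Y_df)
# X_future_df: unique_id, ds, and the same exogenous columns for the 48 future steps
preds = sf.predict(h=48, X_df=X_future_df)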
Diego Menezes (09/08/2023, 12:41 AM)
I call StatsForecast.plot(insample_forecasts, plot_random=False, plot_anomalies=True) and only y is shown: no sign of the MSTL forecast or of the red circular markers pointing to where the anomalies were found. I recently bumped from statsforecast 1.5.0 to 1.6.0. Perhaps there's a bug in the plot function?
Thanks,
Diego.
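One thing worth checking, offered only as a guess: plot marks anomalies against the prediction intervals of the in-sample forecasts, so those need interval columns, and the actuals go in as the first argument. A sketch assuming the forecasts were produced with fitted=True and a level:
insample_forecasts = sf.forecast_fitted_values()  # carries interval columns when forecast(..., level=...) was used
StatsForecast.plot(Y_df, insample_forecasts, plot_random=False, plot_anomalies=True, level=[99])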
Brian Head (09/29/2023, 6:25 PM)
forecasts_df = sf.forecast(h=3, level=[80, 90, 95, 99], fitted=True)
Warning:
PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
I've tested the same code limited to just one level and repeated it with the other levels as separate runs; I could then merge them. But, despite the warning, the code does seem to work correctly: I get the same output when I run the single levels on their own and compare.
Is this a known issue? Is it, as I'm thinking, a warning I can ignore?
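If it does turn out to be benign, it can be silenced narrowly; a sketch using plain pandas and warnings machinery, nothing statsforecast-specific:
import warnings
from pandas.errors import PerformanceWarning

# Suppress only the pandas fragmentation warning, not warnings in general.
warnings.filterwarnings("ignore", category=PerformanceWarning)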
Matej (10/05/2023, 7:36 AM)
from time import time
# Get the current time before forecasting starts, this will be used to measure the execution time
init = time()
# Call the forecast method of the StatsForecast instance to predict the next 28 days (h=28)
# Level is set to [90], which means that it will compute the 90% prediction interval
fcst_df = sf.forecast(df=Y_df, h=28, level=[90])
# Get the current time after the forecasting ends
end = time()
# Calculate and print the total time taken for the forecasting in minutes
print(f'Forecast Minutes: {(end - init) / 60}')
throws:
RemoteTraceback
Exception: You must pass `prediction_intervals` to compute them.
Q: How should I modify the models for this to work? Should I add conformal intervals somehow? (Don't many statistical models have confidence intervals out of the box?)
Thank you and have a great day.
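For context, forecast also accepts a prediction_intervals argument, which makes it compute conformal intervals for models that lack native ones. A sketch, with the ConformalIntervals horizon chosen to cover the h=28 forecast:
from statsforecast.utils import ConformalIntervals

fcst_df = sf.forecast(
    df=Y_df,
    h=28,
    level=[90],
    prediction_intervals=ConformalIntervals(h=28, n_windows=2),  # conformal scores from 2 CV windows
)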
Matej (10/07/2023, 2:38 PM)
models_stats = [
    AutoCES(season_length=96),
    AutoETS(season_length=96)
]
# Instantiate StatsForecast class with the models
sf = StatsForecast(
    models=models_stats,
    freq='15T',
    n_jobs=7  # NOTE: n_jobs instead of num_threads
)
sf.fit(df_train)
• Yet I do not see more CPUs being utilized, and the fitting takes a very long time.
• num_threads in mlforecast works amazingly well, so I wonder whether this is due to the nature of the statsforecast algorithms: can they simply not be parallelized as well as e.g. LightGBM?
• (I do run the analysis in a Jupyter notebook, but I doubt that is the culprit.)
Thanks and have a great weekend.
Matej (10/10/2023, 7:09 AM)
cv_df = sf.cross_validation(
    df=df,
    h=h,
    level=[90],
    step_size=step_size,
    test_size=test_size_adjusted,
    # input_size=90*96,
    n_windows=None,
    refit=False,
    fitted=True
)
Jack Fletcher (10/26/2023, 10:31 PM)
A quick question about AutoARIMA() in the statsforecast package: if I fit AutoARIMA to a dataframe with many unique_ids, is the best model across all unique_ids what gets returned? It seems like the answer is 'yes', but wanted to confirm. Thanks 🙂
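For what it's worth, one way to check this empirically is to inspect the stored models after fit: the fitted_ attribute is laid out as (n_series, n_models), which indicates a separate AutoARIMA is selected per unique_id rather than one global model. A sketch, with the attribute layout as an assumption about 1.6:
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA

sf = StatsForecast(models=[AutoARIMA()], freq='D')
sf.fit(df)  # df contains several unique_ids
print(sf.fitted_.shape)  # one fitted model per (series, model) pair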
Toni Borders (10/30/2023, 9:51 AM)
# define model(s)
ts_models = [
    AutoETS(model=['Z','Z','Z'], season_length=season, alias='AE')
]
# Instantiate StatsForecast class
model = StatsForecast(df=train,
                      models=ts_models,
                      freq='H',
                      n_jobs=-1)
# Forecast
forecast = model.forecast(h=1500, level=[95])
Error:
'StatsForecast' object has no attribute 'fitted_'
However, if I use the fit and predict methods with the code below, the forecast is successful.
# plot data
StatsForecast.plot(train, engine='plotly')
# define model(s)
ts_models = [
    AutoETS(model=['Z','Z','Z'], season_length=season, alias='AE')
]
# Instantiate StatsForecast class
model = StatsForecast(
    models=ts_models,
    freq='H',
    n_jobs=-1)
model.fit(train)
# Forecast
h = round((valid['ds'].nunique()) * 0.5)
forecast = model.predict(h=h, level=[95])
model.plot(df, forecast, engine='plotly')
1. Any ideas why the more efficient forecast method is not working in my case?
2. A secondary issue is that StatsForecast.plot() is also not working when I run it in a script in PyCharm.
If anyone has guidance on this second query it would be appreciated.
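Possibly relevant to the first question: forecast() trains and predicts in a single pass without storing the per-series models, so code paths that expect a fitted_ attribute fail, while fit + predict keeps them around. If in-sample values are what is needed from the forecast-only route, a sketch:
# fitted=True asks forecast() to retain in-sample predictions as it goes
forecast_df = model.forecast(df=train, h=1500, level=[95], fitted=True)
insample_df = model.forecast_fitted_values()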
Toni Borders (11/03/2023, 4:33 PM)
from statsforecast import StatsForecast
from statsforecast.models import AutoARIMA, SeasonalNaive

def fit():
    model = StatsForecast(df=panel_df,
                          models=[AutoARIMA(), SeasonalNaive(season_length=7)],
                          freq='D',
                          n_jobs=1,
                          verbose=True)
    model.fit(panel_df)
    model.save('/Users/tmb/PycharmProjects/data-science/UFE/output_files/model')
Error:
AttributeError: type object 'StatsForecast' has no attribute 'save'
I have the latest version of the package installed (i.e. 1.6.0).
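A generic fallback while the save question is sorted out: plain pickling of the fitted object, standard library only, assuming the fitted models are picklable:
import pickle

with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
# later / elsewhere
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)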
Chad Parmet (11/14/2023, 4:45 PM)
Naive() fits on statsforecast 1.6.0 (and 1.5.0). Does this make sense to anyone? Details in 🧵
Antoine SCHWARTZ -CROIX- (11/20/2023, 5:44 PM)
… AutoETS run).
prediction_intervals = ConformalIntervals(h=forecast_horizon, n_windows=n_windows_conformal)
df = spark.read.parquet(f"{base_s3_dir_path}/{cutoff}/{input_dfs_dir_name}/df.parquet")
futr_df = spark.read.parquet(f"{base_s3_dir_path}/{cutoff}/{input_dfs_dir_name}/futr_df.parquet")
sf = StatsForecast(
models=[eval(F"{algo}(season_length=season_length, alias=algo, prediction_intervals=prediction_intervals)")],
freq=freq,
fallback_model=SeasonalNaive(season_length=season_length),
#n_jobs=1,
#verbose=True
)
predictions = sf.forecast(
h=forecast_horizon,
df=df,
X_df=futr_df,
level=[50, 80, 90, 95, 99],
prediction_intervals=prediction_intervals,
).toPandas()
However, for AutoARIMA the computation time was still too high for me, so I dug a little deeper into the logs and found that only 30% of the CPUs were used on average during the run.
• I've tried tweaking the input Spark DataFrames' repartitioning and modifying a few Spark confs, but nothing changes. Do you have any ideas?
• Is the n_jobs parameter used when the detected backend is Spark?
• Would you advise me to constrain AutoARIMA's search space a little to save computing time? If so, which parameters should be edited first? (See the sketch below.)
Thanks in advance!
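Regarding the last question: AutoARIMA exposes several arguments that bound its search, which is the usual first lever for runtime. A sketch with illustrative values, not recommended defaults:
from statsforecast.models import AutoARIMA

model = AutoARIMA(
    season_length=season_length,
    max_p=3, max_q=3,    # cap non-seasonal AR/MA orders
    max_P=1, max_Q=1,    # cap seasonal orders
    max_d=1, max_D=1,    # cap differencing
    stepwise=True,       # greedy stepwise search instead of a fuller grid
    approximation=True,  # screen candidates with an approximate likelihood
    nmodels=50,          # bound the number of candidates tried in the stepwise search
)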