Nixtla Community #statsforecast

thomas delaunait

01/09/2025, 10:48 AM

@Mariana Menchero Hello Mariana and Nixtla Team. Happy new year to all of you! Thank you for your amazing work. I have a quick question: do you know when the "feat: Forward method TBATS/AutoTBATS" change is planned for release? It seems to have been on the main branch since Feb 2024. Thank you!

Aravind Karunakaran

01/13/2025, 11:19 AM

I'm using SimpleExponentialSmoothing to forecast time-series data that has no apparent trend or seasonality. The fitted values look great, but all the predicted values are the same (i.e., the forecast is just a flat line). Is there an explanation for this?
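A flat forecast is what the textbook simple exponential smoothing recursion produces: the method tracks only a level, so every future step is forecast as the last smoothed level. The sketch below is a minimal illustration of that recursion, not statsforecast's implementation; the function name `ses_forecast`, the initialisation with the first observation, and the sample data are assumptions made for the example.

```python
def ses_forecast(y, alpha, h):
    """Textbook simple exponential smoothing (illustrative sketch only)."""
    level = y[0]  # assumption: initialise the level with the first observation
    fitted = []
    for obs in y:
        fitted.append(level)                       # one-step-ahead prediction for this observation
        level = alpha * obs + (1 - alpha) * level  # update the level after seeing it
    # There is no trend or seasonal term, so all h steps repeat the last level.
    return fitted, [level] * h


y = [10.0, 12.0, 11.0, 13.0, 12.0]
fitted, forecast = ses_forecast(y, alpha=0.5, h=4)
print(fitted)    # fitted values vary because each one uses the latest observation
print(forecast)  # the forecast is constant across the horizon
```

The fitted values move because each one is recomputed after a new observation arrives, while the forecast has no new observations to react to. A model with a trend or seasonal component would be needed for a forecast that is not flat.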

Makarand Batchu

01/13/2025, 12:21 PM

Hi statsforecast team. I recently upgraded the statsforecast package to the latest version (2.0.0), and cross_validation now takes much longer. Were any of the base models updated? Thanks in advance.

Guillaume GALIE

01/15/2025, 4:21 PM

Hello, I get the following exception when running cross-validation with the MSTL model (MSTL(season_length = [12],alias='MSTL')):

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:5273, in MSTL.forward(self, y, h, X, X_future, level, fitted)
   5269         res = self.trend_forecaster._add_conformal_intervals(
   5270             fcst=res, y=x_sa, X=X, level=level
   5271         )
   5272 # reseasonalize results
-> 5273 seas_h = _predict_mstl_seas(model_, h=h, season_length=self.season_length)
   5274 seas_insample = model_.filter(regex="seasonal*").sum(axis=1).values
   5275 res = {
   5276     key: val + (seas_insample if "fitted" in key else seas_h)
   5277     for key, val in res.items()
   5278 }

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:4999, in _predict_mstl_seas(mstl_ob, h, season_length)
   4998 def _predict_mstl_seas(mstl_ob, h, season_length):
-> 4999     seascomp = _predict_mstl_components(mstl_ob, h, season_length)
   5000     return seascomp.sum(axis=1)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:4992, in _predict_mstl_components(mstl_ob, h, season_length)
   4990     mp = seasonal_periods[i]
   4991     colname = seasoncolumns[i]
-> 4992     seascomp[:, i] = np.tile(
   4993         mstl_ob[colname].values[-mp:], trunc(1 + (h - 1) / mp)
   4994     )[:h]
   4995 return seascomp

ValueError: could not broadcast input array from shape (10,) into shape (12,)

Here is an example with a single time series that reproduces the error:

import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import (Naive,MSTL) 
from utilsforecast.data import generate_series
freq = 'MS'
season_length = 12
min_length = 25
df = generate_series(n_series=1, freq=freq, min_length=min_length, max_length=min_length)
sffcst = StatsForecast(models = [MSTL(season_length = [season_length])], freq = freq, n_jobs=-1,fallback_model=Naive(),verbose=True)
sf_crossvalidation_df=sffcst.cross_validation(df = df, h=12, step_size = 1, n_windows = 3, refit=False).reset_index(drop=True)
sf_crossvalidation_df

What is strange is that it works with less history: if you keep only 15 months of data, it doesn't crash.

import pandas as pd
from statsforecast import StatsForecast
from statsforecast.models import (Naive,MSTL) 
from utilsforecast.data import generate_series

freq = 'MS'
season_length = 12
min_length = 15

df = generate_series(n_series=1, freq=freq, min_length=min_length, max_length=min_length)

sffcst = StatsForecast(models = [MSTL(season_length = [season_length])], freq = freq, n_jobs=-1,fallback_model=Naive(),verbose=True)
sf_crossvalidation_df=sffcst.cross_validation(df = df, h=12, step_size = 1, n_windows = 3, refit=False).reset_index(drop=True)
sf_crossvalidation_df
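The shape mismatch in the traceback can be reproduced with NumPy alone. The sketch below copies the `np.tile(...)[:h]` expression from `_predict_mstl_components` and shows one plausible reading of the error: when the fitted decomposition holds fewer rows than the season length (10 rows against `mp = 12` is assumed here to match the message), slicing the last `mp` values returns fewer than `h` values. The helper name `tile_seasonal` and the sample arrays are illustrative.

```python
from math import trunc

import numpy as np


def tile_seasonal(seasonal_values, h, mp):
    """Repeat the last `mp` seasonal values to cover `h` steps (expression taken from the traceback)."""
    return np.tile(seasonal_values[-mp:], trunc(1 + (h - 1) / mp))[:h]


h, mp = 12, 12
long_history = np.arange(24, dtype=float)   # at least one full season available
short_history = np.arange(10, dtype=float)  # assumption: only 10 rows available

ok = tile_seasonal(long_history, h, mp)
short = tile_seasonal(short_history, h, mp)

seascomp = np.full((h, 1), np.nan)
seascomp[:, 0] = ok  # 12 values fit the 12-row column

try:
    seascomp[:, 0] = short  # 10 values cannot fill 12 rows
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok.shape, short.shape)
print(error_message)
```

If this reading is right, the training window seen by MSTL in cross-validation is shorter than one full season, so it may be worth checking how many rows remain after the test windows are held out.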

Pradyumna Mahajan

01/16/2025, 6:54 AM

Hey team, I am using Statsforecast. I wanted to use a custom frequency of 3 hours, but there is no offset for that. What can I do? I saw some examples of freq=1 in the docs, but what does that mean? Thanks in advance 🙂
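pandas offset aliases accept an integer multiple in front of the unit, so a 3-hour step can be written as `"3h"`. The sketch below only demonstrates the alias with pandas; it assumes that StatsForecast accepts the same pandas alias for its `freq` argument, and that `freq=1` in the docs refers to series whose `ds` column is an integer index increasing by 1 rather than a timestamp.

```python
import pandas as pd

# A multiple of a base alias: "3h" means one timestamp every 3 hours.
ds = pd.date_range("2025-01-01", periods=4, freq="3h")
step = ds[1] - ds[0]
print(list(ds))
print(step)

# Assumption: the same string would be passed as StatsForecast(models=..., freq="3h").

# Integer-indexed alternative: `ds` holds 1, 2, 3, ... and the step between rows is 1.
int_ds = list(range(1, 5))
int_step = int_ds[1] - int_ds[0]
print(int_ds, int_step)
```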

Piero Danti

01/16/2025, 2:45 PM

Hello, I would like to forecast multiple time series using statsforecast. Is that possible? Furthermore, can I also forecast new time series?

Naren Castellon

01/19/2025, 9:00 PM

I have the following model. How can I save it, and how can I load it once saved so that I can train it again?

# Instantiate StatsForecast class as sf
sf = StatsForecast(
    df=train,
    models=models,
    freq='D',
    n_jobs=-1,
)

# Train the model
sf.fit()

# Forecast
Y_hat = sf.predict(horizon)

Bersu T

02/05/2025, 10:59 AM

Hi team, I have a couple of questions regarding prediction intervals. Does ARIMA also use conformal prediction in statsforecast? Can we perform a calibration test (coverage probability) to see how well the interval is calibrated? And have you already compared prediction intervals across ARIMA, ML, and NN time-series models? I am doing research for my thesis and plan to compare the sharpness and calibration of prediction intervals across different models and hierarchical reconciliation methods.
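The calibration and sharpness checks mentioned above can be computed directly from actuals and interval bounds, whichever model produced them. The sketch below is a minimal version under simple assumptions: coverage is the share of actuals inside the interval, and sharpness is summarised as the mean interval width. The function names and the sample numbers are illustrative; in practice the bounds would come from the forecast output columns for the chosen level.

```python
def empirical_coverage(actuals, lower, upper):
    """Share of actual values that fall inside [lower, upper]."""
    inside = [lo <= y <= hi for y, lo, hi in zip(actuals, lower, upper)]
    return sum(inside) / len(inside)


def mean_interval_width(lower, upper):
    """Average width of the interval; narrower means sharper."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Illustrative numbers only.
actuals = [10, 12, 9, 15, 11]
lower = [8, 9, 10, 10, 9]
upper = [12, 13, 14, 14, 13]

coverage = empirical_coverage(actuals, lower, upper)
width = mean_interval_width(lower, upper)
print(coverage, width)
```

A well-calibrated 90% interval should have empirical coverage close to 0.90 on held-out data; among models with similar coverage, the one with the smaller mean width is the sharper one.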

Filipa Encarnação Louzeiro

02/10/2025, 5:09 PM

Hi all, after generating windows for cross-validation, is there a way to calculate RMSE (or any other loss function) given that my cross-validation dataframe is a Spark dataframe? (I mean, beyond the obvious solution of coding the loss expression by hand.)

Simon

02/15/2025, 1:52 PM

Hello everyone! I am currently looking for a method to train an MSTL model, save it, and add new data without needing to retrain on the full dataset. Do you know if this functionality is supported in statsforecast?

Vaibhav Gupta

02/24/2025, 3:02 AM

Hello Nixtla team, I have noticed a small error in the docs for the statsforecast library. How can I contribute a fix?

Rodrigo Sodré

03/09/2025, 9:22 PM

Greetings everyone! I'm trying to predict the next steps of a time series using AutoARIMA. The data consists of 2 years of daily observations for 96 assets. This is how the dataframe looks after formatting it to Nixtla's input format (attached): 92928 rows × 3 columns. If I use `sf.forecast`, everything works just fine:

pred = sf.forecast(h=horizon, df=train_df)

But for every new observation I have to append it to the dataframe and call `forecast` again, which retrains everything. So I tried switching to `sf.fit` + `sf.predict`:

sf.fit(df=train_df)
# update train_df
pred = sf.predict(h=horizon, X_df=train_df)

but I'm getting the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File <timed exec>:12

File /opt/conda/lib/python3.11/site-packages/statsforecast/core.py:747, in _StatsForecast.predict(self, h, X_df, level)
    742     warnings.warn(
    743         "Prediction intervals are set but `level` was not provided. "
    744         "Predictions won't have intervals."
    745     )
    746 self._validate_exog(X_df)
--> 747 X, level = self._parse_X_level(h=h, X=X_df, level=level)
    748 if self.n_jobs == 1:
    749     fcsts, cols = self.ga.predict(fm=self.fitted_, h=h, X=X, level=level)

File /opt/conda/lib/python3.11/site-packages/statsforecast/core.py:692, in _StatsForecast._parse_X_level(self, h, X, level)
    690 expected_shape = (h * len(self.ga), self.ga.data.shape[1] + 1)
    691 if X.shape != expected_shape:
--> 692     raise ValueError(
    693         f"Expected X to have shape {expected_shape}, but got {X.shape}"
    694     )
    695 processed = ufp.process_df(X, self.id_col, self.time_col, None)
    696 return GroupedArray(processed.data, processed.indptr), level

ValueError: Expected X to have shape (96, 2), but got (92928, 3)

"Expected X to have shape (96, 2), but got (92928, 3)": I don't understand why the shape isn't valid in this case, since it's the same dataframe I trained on before and it works with `forecast`. Am I using it incorrectly? Is there a proper way to call `fit`/`predict`? Thanks in advance.
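The expected shape in the error can be worked out from the check shown in the traceback, `(h * len(self.ga), self.ga.data.shape[1] + 1)`. The sketch below recomputes it under two assumptions: the training data held a single value column (the target), and `horizon` was 1. Read this way, `X_df` looks like a slot for future exogenous values with `h` rows per series, not for the training history. The function name `expected_x_shape` is illustrative.

```python
def expected_x_shape(h, n_series, n_data_cols):
    """Mirror of the shape check quoted in the traceback."""
    return (h * n_series, n_data_cols + 1)


n_series = 96     # assets in the dataframe
n_data_cols = 1   # assumption: only the target column was stored at fit time
h = 1             # assumption: horizon implied by the (96, 2) in the error

shape = expected_x_shape(h, n_series, n_data_cols)
rows_per_series = 92928 // n_series
print(shape)            # what predict expected for X_df
print(rows_per_series)  # what train_df actually holds per series
```

Under that reading, passing the full `train_df` as `X_df` fails because it carries the whole history per series instead of `h` future rows. A model fitted without exogenous features would presumably be called as `sf.predict(h=horizon)` with no `X_df`, although that alone does not feed new observations into the fitted models.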

Simon

03/10/2025, 7:31 PM

Hi all, is it possible to find the `unique_id` to which an MSTL fitted model corresponds? For the following, I do not find any identification:

sf.fitted_[0, 0].model_

Thank you for any hints!

Ankit Hemant Lade

03/17/2025, 3:05 PM

Hey @Marco, I am trying to extract the fitted parameters of the stats models, such as alpha, beta, and gamma. Currently I am trying this with AutoTheta, but I am not able to access any of the parameters. Is there a way to do this?

Bersu T

03/18/2025, 2:56 PM

Hi! Do I need to set up conformal prediction for ARIMA during training, or can I apply it only at prediction time?

Sergio André López Pereo

03/26/2025, 3:35 AM

Hey! Good night everyone. I have a question about the implementation. Do you have any document or article on the time complexity of the AutoTBATS model as a function of the seasons array?

03/26/2025, 12:34 PM

Hi, Does Statsforecast support VAR models? I mean, I have multiple variables (columns) in the dataset needing TSA.

Alex Berry

03/31/2025, 5:30 PM

What might cause prediction intervals to look this jagged? I am getting intervals similar to this when using HoltWinters and ARIMA models. The lower and upper bounds are the 2.5 and 97.5 percentiles, respectively.

Santosh Srivatsa

04/08/2025, 2:54 AM

Hello everyone, I'm using AutoARIMA from Nixtla's StatsForecast to detect missed transmissions from multiple data sources, each identified by a `unique_id`. Here's a quick overview of my workflow:

1. Data Preparation:
   ◦ I create two DataFrames (train and test), both with the columns `unique_id`, `ds`, and `y`.
   ◦ The train and test DataFrames have the same end date but different start dates.
2. Model Setup and Fitting:
   ◦ I initialize the `StatsForecast` object with the `AutoARIMA` model, setting parameters like `season_length` and `freq`.
   ◦ I call the `.fit()` method on the training DataFrame.
3. Forecasting and In-Sample Predictions:
   ◦ I forecast using the `.forecast(h=1)` method on the test DataFrame.
   ◦ I also use `.forecast_fitted_values()` after fitting to retrieve in-sample predictions, and I flag anomalies based on the `level` prediction intervals by checking whether actuals fall outside the expected range.

I'm doing this because I'm specifically trying to detect missed transmissions, meaning there may not be a value at a fixed point in the future, so a direct forecast of future values isn't always meaningful. Instead, I'm comparing the model's in-sample expectations against actuals.

Additionally, when I try to introduce exogenous variables, I'm running into an invalid shape error when calling `.forecast()`. I suspect this might be because the test dataset doesn't have the same number of rows for each `unique_id`.

Would love any guidance on whether this overall workflow makes sense or how I might improve it, especially around incorporating exogenous variables or detecting anomalies more robustly. Thanks in advance for your insights!
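The suspicion about uneven row counts can be checked before calling `.forecast()`. The sketch below assumes that the future exogenous frame must contain exactly `h` rows per `unique_id`, which is consistent with the `h * n_series` shape check quoted in an earlier traceback in this channel; the helper name and the sample ids are illustrative.

```python
from collections import Counter


def ids_with_wrong_row_count(unique_ids, h):
    """Return {unique_id: row_count} for every series that does not have exactly h rows."""
    counts = Counter(unique_ids)
    return {uid: n for uid, n in counts.items() if n != h}


# Illustrative id column of a future-exogenous frame; "b" has too many rows, "c" too few.
future_ids = ["a", "a", "b", "b", "b", "c"]
bad = ids_with_wrong_row_count(future_ids, h=2)
print(bad)
```

With a pandas DataFrame, the same list can be taken from the `unique_id` column; any series reported here would need to be trimmed or padded to `h` rows before forecasting.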

Sergio André López Pereo

04/09/2025, 7:39 PM

Hello Nixtla team, and thanks in advance for your time. I'm running into an issue with the AutoTBATS model. In some cases the model "explodes": the prediction looks as if it had been turned into something exponential for no apparent reason. The explosion happens with both positive and negative values. Do you have any hint as to what the cause could be?

Mariana Menchero

04/09/2025, 7:57 PM

Hi @Sergio André López Pereo do you have a reproducible example you can share with us?

Iching Quares

04/10/2025, 2:49 PM

Hello Nixtla team, I'm not sure if I'm missing anything, but is there any way to jointly fit an ARIMA+GARCH model, similar to how it's done in the rugarch R package?

spec <- ugarchspec(variance.model = list(garchOrder = c(1, 1)), 
                     mean.model = list(armaOrder = c(final.order[1], final.order[3]), include.mean = TRUE), 
                     distribution.model = "std", 
                     fixed.pars = fixed_pars_df0)

Ankit Hemant Lade

04/11/2025, 2:38 AM

In statsforecast cross-validation, is there any way I can specify an explicit cutoff date?
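One way to reason about cutoffs is to compute where each window ends from `h`, `step_size`, and `n_windows`. The sketch below assumes that the windows are laid out backwards from the end of each series, with a held-out span of `h + step_size * (n_windows - 1)` observations, which is my understanding of how `cross_validation` places them. The function name `cv_training_lengths` and the numbers are illustrative.

```python
def cv_training_lengths(n_obs, h, step_size, n_windows):
    """Number of training observations in each window, i.e. the position of each cutoff."""
    test_size = h + step_size * (n_windows - 1)  # assumption: total span held out at the end
    first = n_obs - test_size
    return [first + i * step_size for i in range(n_windows)]


# 36 monthly observations, 6-step horizon, windows 6 steps apart.
lengths = cv_training_lengths(n_obs=36, h=6, step_size=6, n_windows=3)
print(lengths)  # the cutoff of each window is the last of its training observations
```

Under that assumption, a specific cutoff date can be targeted by trimming the dataframe so that it ends `h` steps after the desired date and using a single window, or by choosing `step_size` and `n_windows` so that one of the computed positions lands on that date.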

IHAS

04/11/2025, 3:36 PM

I am using StatsForecast to process over 100 time series... Is there a way to enable a verbose mode to track which series is currently being processed, or at least estimate the remaining time for the entire process?

Sai krishna Sirikonda

05/02/2025, 9:58 AM

Hi all, I am working on hierarchical forecasting, and I would like to know which four or five models tend to give comparatively better results.

Sai krishna Sirikonda

05/05/2025, 5:32 AM

Hi all, can anyone help me determine when to perform the cross-validation step in hierarchical forecasting—before or after reconciliation?

Filipa Encarnação Louzeiro

05/06/2025, 12:01 PM

Hi everyone, today I got the strangest error in code that used to run without problems. The code (in PySpark, Databricks) is:

models = [AutoETS()] 

fcst = StatsForecast(models=models,
                     freq='M',
                     n_jobs=1,
                     fallback_model = SeasonalNaive(season_length = 12))

# FORECAST
df_pred = fcst.forecast(df = df_train, 
                        h = 1,
                        fitted = True, 
                        level=[90])

But then this error showed up:

PythonException: 
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue_spark/execution_engine.py", line 228, in _udf_pandas
    output_df = map_func(cursor, input_df)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/extensions/_builtins/processors.py", line 333, in run
    return self.transformer.transform(df)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/extensions/transformer/convert.py", line 346, in transform
    return self._wrapper.run(
           ^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/dataframe/function_wrapper.py", line 103, in run
    rt = self._func(**rargs)
         ^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/distributed/fugue.py", line 166, in _forecast_noX_fitted
    model, result = self._forecast(
                    ^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/distributed/fugue.py", line 109, in _forecast
    result = model.forecast(
             ^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/core.py", line 864, in forecast
    res_fcsts = self.ga.forecast(
                ^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/core.py", line 199, in forecast
    res_i = fallback_model.forecast(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/models.py", line 3844, in forecast
    res = _add_fitted_pi(res=res, se=sigma, level=level)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/models.py", line 62, in _add_fitted_pi
    lo = res["fitted"].reshape(-1, 1) - quantiles * se.reshape(-1, 1)
                                                    ^^^^^^^^^^
AttributeError: 'int' object has no attribute 'reshape'
File <command-993558187105440>, line 14
      4 fcst = StatsForecast(models=models,
      5                      freq='M',
      6                      n_jobs=1,
      7                      fallback_model = SeasonalNaive(season_length = 12))
      9 # fcst = StatsForecast(models=models,
     10 #                      freq='M',
     11 #                      n_jobs=1)
     12 
     13 # FORECAST
---> 14 df_pred = fcst.forecast(df = df_train, 
     15                         h = 1,
     16                         fitted = True, 
     17                         level=[90])\
     18     .withColumnRenamed('unique_id','CCLI_ID')\
     19     .withColumnRenamed('ds','date')

File /databricks/spark/python/pyspark/sql/connect/client/core.py:2155, in SparkConnectClient._handle_rpc_error(self, rpc_error)
   2140                 raise Exception(
   2141                     "Python versions in the Spark Connect client and server are different. "
   2142                     "To execute user-defined functions, client and server should have the "
   (...)
   2151                     "https://docs.databricks.com/en/release-notes/serverless.html."
   2152                 )
   2153             # END-EDGE
-> 2155             raise convert_exception(
   2156                 info,
   2157                 status.message,
   2158                 self._fetch_enriched_error(info),
   2159                 self._display_server_stack_trace(),
   2160             ) from None
   2162     raise SparkConnectGrpcException(status.message) from None
   2163 else:

It's really confusing to me. The Databricks assistant suggested two different things in different runs. The first was: "The error occurs because the fallback_model in the StatsForecast is returning an integer instead of an array, which causes the reshape method to fail. To fix this, ensure that the fallback_model returns an array-like object that can be reshaped." Then, in a second run, it suggested: "The error occurs because the se variable is an integer, and the reshape method is being called on it, which is not valid. To fix this, ensure that se is a numpy array before calling reshape." Can anyone help? I'm in a bit of a panic over this 😬 Many, many thanks!!

Meanwhile, I removed the line `fitted = True` and it worked well. But what if I really need the fitted values?

Sai krishna Sirikonda

05/08/2025, 5:20 AM

Hi all, I have a question about storing hierarchical time series forecasting models. Specifically, is it possible to save these models efficiently? I attempted to save the model as a bundle rather than as an individual model object, using the following approach:

model_bundle = {
    'fcst': fcst_full,                # Trained forecasting model(s)
    'hrec': hrec,                     # Reconciliation logic
    'S_df': S_df,                     # Hierarchy matrix
    'tags': tags,                     # Hierarchy metadata
    'Y_df': Y_df,                     # Full dataset
    'Y_hat_df': Y_hat_full_df,        # Base forecasts
    'Y_fitted_df': Y_fitted_full_df,  # Fitted values
    'Y_rec_df': Y_rec_full_df,        # Reconciled forecasts
    'evaluation': evaluation          # Metrics
}
joblib.dump(model_bundle, 'hierarchical_forecast_bundle.joblib')

Additionally, I noticed that the `neuralforecast` library offers dedicated methods for saving and loading models. Does the `hierarchicalforecast` library provide similar functionality for storing hierarchical forecasting models? I would appreciate any insights on best practices for saving and restoring hierarchical forecasting models.

# save
nf.save(path='./checkpoints/test_run/', model_index=None, overwrite=True, save_dataset=True)

# load
nf2 = NeuralForecast.load(path='./checkpoints/test_run/')
Y_hat_df2 = nf2.predict()
Y_hat_df2.head()

Thank you in advance for your support!