# statsforecast
  • t

    thomas delaunait

    01/09/2025, 10:48 AM
    @Mariana Menchero Hello Mariana and Nixtla team, happy new year to all of you, and thank you for your amazing work! I have a quick question: do you know when the feature "Forward method TBATS/AutoTBATS" is planned for release? It seems to have been on the main branch since Feb 2024. Thank you!
  • a

    Aravind Karunakaran

    01/13/2025, 11:19 AM
    I'm using SimpleExponentialSmoothing to forecast time-series data that has no apparent trend or seasonality. The fitted values look great, but all the predicted values are identical (i.e., the forecast is a flat horizontal line). Is there an explanation for this?
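A flat forecast is what simple exponential smoothing produces by construction: the model tracks a single level and has no trend or seasonal term, so every future step repeats the last smoothed level. Below is a minimal pure-Python sketch of the textbook recursion, not statsforecast's implementation; `ses_forecast`, the sample data, and `alpha=0.5` are illustrative assumptions.

```python
def ses_forecast(y, alpha, h):
    """Textbook simple exponential smoothing (illustrative, not the library code).

    Returns (fitted, forecast): fitted[t] is the one-step-ahead prediction
    for y[t], and forecast holds the h future values.
    """
    level = y[0]                                   # initialise the level with the first observation
    fitted = []
    for obs in y:
        fitted.append(level)                       # prediction made before seeing obs
        level = alpha * obs + (1 - alpha) * level  # update the level after seeing obs
    # No trend or seasonal term exists, so every horizon gets the same value.
    return fitted, [level] * h


y = [10.0, 12.0, 11.0, 13.0]
fitted, forecast = ses_forecast(y, alpha=0.5, h=3)
print(fitted)    # fitted values vary because the level is updated at each step
print(forecast)  # forecasts repeat the final level
```

The fitted values move with the data because each one uses a freshly updated level, while the forecasts cannot change once the observations run out.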
  • m

    Makarand Batchu

    01/13/2025, 12:21 PM
    Hi #C05CAFFR22H team. I recently upgraded the statsforecast package to the latest version (2.0.0), and cross_validation now takes much longer. Were any of the base models updated? Thanks in advance.
  • g

    Guillaume GALIE

    01/15/2025, 4:21 PM
    Hello, I get the following exception when running cross-validation with the MSTL model (MSTL(season_length = [12],alias='MSTL')):
    File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:5273, in MSTL.forward(self, y, h, X, X_future, level, fitted)
       5269         res = self.trend_forecaster._add_conformal_intervals(
       5270             fcst=res, y=x_sa, X=X, level=level
       5271         )
       5272 # reseasonalize results
    -> 5273 seas_h = _predict_mstl_seas(model_, h=h, season_length=self.season_length)
       5274 seas_insample = model_.filter(regex="seasonal*").sum(axis=1).values
       5275 res = {
       5276     key: val + (seas_insample if "fitted" in key else seas_h)
       5277     for key, val in res.items()
       5278 }
    
    File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:4999, in _predict_mstl_seas(mstl_ob, h, season_length)
       4998 def _predict_mstl_seas(mstl_ob, h, season_length):
    -> 4999     seascomp = _predict_mstl_components(mstl_ob, h, season_length)
       5000     return seascomp.sum(axis=1)
    
    File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\statsforecast\models.py:4992, in _predict_mstl_components(mstl_ob, h, season_length)
       4990     mp = seasonal_periods[i]
       4991     colname = seasoncolumns[i]
    -> 4992     seascomp[:, i] = np.tile(
       4993         mstl_ob[colname].values[-mp:], trunc(1 + (h - 1) / mp)
       4994     )[:h]
       4995 return seascomp
    
    ValueError: could not broadcast input array from shape (10,) into shape (12,)
    Here is an example with a single time series that reproduces the failure:
    import pandas as pd
    from statsforecast import StatsForecast
    from statsforecast.models import (Naive,MSTL) 
    from utilsforecast.data import generate_series
    freq = 'MS'
    season_length = 12
    min_length = 25
    df = generate_series(n_series=1, freq=freq, min_length=min_length, max_length=min_length)
    sffcst = StatsForecast(models = [MSTL(season_length = [season_length])], freq = freq, n_jobs=-1,fallback_model=Naive(),verbose=True)
    sf_crossvalidation_df=sffcst.cross_validation(df = df, h=12, step_size = 1, n_windows = 3, refit=False).reset_index(drop=True)
    sf_crossvalidation_df
    What is strange is that it works with less history: if you keep only 15 months of data, it doesn't crash.
    import pandas as pd
    from statsforecast import StatsForecast
    from statsforecast.models import (Naive,MSTL) 
    from utilsforecast.data import generate_series
    
    freq = 'MS'
    season_length = 12
    min_length = 15
    
    df = generate_series(n_series=1, freq=freq, min_length=min_length, max_length=min_length)
    
    sffcst = StatsForecast(models = [MSTL(season_length = [season_length])], freq = freq, n_jobs=-1,fallback_model=Naive(),verbose=True)
    sf_crossvalidation_df=sffcst.cross_validation(df = df, h=12, step_size = 1, n_windows = 3, refit=False).reset_index(drop=True)
    sf_crossvalidation_df
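One plausible reading of the traceback is that the fitted seasonal component has fewer rows than one full season, so slicing its last `mp` values and tiling them cannot fill the `h`-step buffer. The sketch below mimics the slicing arithmetic quoted from `_predict_mstl_components` using plain lists. `tile_last_season` is a hypothetical helper, and the 10-row input is taken from the error message rather than from inspecting the library.

```python
from math import trunc


def tile_last_season(values, mp, h):
    """Repeat the last `mp` values enough times to cover `h` steps, then cut to `h`."""
    last = values[-mp:]             # returns fewer than mp items when values is short
    reps = trunc(1 + (h - 1) / mp)  # same repeat count as in the quoted source line
    return (last * reps)[:h]


full = tile_last_season(list(range(24)), mp=12, h=12)   # two full seasons available
short = tile_last_season(list(range(10)), mp=12, h=12)  # only 10 rows available
print(len(full), len(short))
```

With only 10 rows the tiled result has 10 entries, which matches the "shape (10,) into shape (12,)" mismatch in the error.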
  • p

    Pradyumna Mahajan

    01/16/2025, 6:54 AM
    Hey team, I am using StatsForecast and want to use a custom frequency of 3 hours, but I can't find an offset for that. What can I do? I also saw some examples with freq=1 in the docs; what does that mean? Thanks in advance 🙂
  • p

    Piero Danti

    01/16/2025, 2:45 PM
    Hello, I would like to forecast multiple time series using #C05CAFFR22H. Is that possible? Furthermore, can I also forecast new time series?
  • n

    Naren Castellon

    01/19/2025, 9:00 PM
    I have the following model. How can I save it, load it once saved, and then train it again?
    # Instantiate StatsForecast class as sf
    sf = StatsForecast(
        df = train,
        models = models,
        freq = 'D',
        n_jobs = -1)
    # Train the model
    sf.fit()
    # Forecast
    Y_hat = sf.predict(horizon)
  • b

    Bersu T

    02/05/2025, 10:59 AM
    Hi team, I have a couple of questions regarding prediction intervals. Does ARIMA also use conformal prediction in statsforecast? Can we run a calibration test (coverage probability) to see how well the interval is calibrated? And have you already compared prediction intervals across ARIMA, ML, and NN time-series models? I am doing research for my thesis and plan to compare the sharpness and calibration of prediction intervals across different models and hierarchical reconciliation methods.
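A coverage check of the kind described here can be computed directly from actuals and interval bounds: the share of observations inside the interval should be close to the nominal level, and the average interval width is a simple measure of sharpness. This is a minimal sketch with made-up numbers. `empirical_coverage` and `mean_width` are hypothetical helpers, not part of statsforecast.

```python
def empirical_coverage(y, lo, hi):
    """Fraction of actuals that fall inside their [lo, hi] interval."""
    inside = [l <= v <= u for v, l, u in zip(y, lo, hi)]
    return sum(inside) / len(inside)


def mean_width(lo, hi):
    """Average interval width; narrower means sharper at equal coverage."""
    return sum(u - l for l, u in zip(lo, hi)) / len(lo)


# Made-up actuals and interval bounds, for illustration only.
y = [10.0, 12.0, 9.0, 15.0, 11.0]
lo = [8.0, 9.0, 10.0, 10.0, 9.0]
hi = [12.0, 13.0, 14.0, 14.0, 13.0]

coverage = empirical_coverage(y, lo, hi)
width = mean_width(lo, hi)
print(coverage, width)
```

Comparing `coverage` against the nominal level (for example 0.9 for a 90% interval) on held-out data gives a first calibration check that works the same way for any model producing lower and upper bounds.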
  • f

    Filipa Encarnação Louzeiro

    02/10/2025, 5:09 PM
    Hi all, after generating windows for cross-validation, is there a way to calculate RMSE (or any other loss function) when my cross-validation dataframe is a Spark dataframe? (I mean, beyond the obvious solution of coding the loss expression by hand.)
  • s

    Simon

    02/15/2025, 1:52 PM
    Hello everyone! I am currently looking for a method to train an MSTL model, save it, and add new data without needing to retrain on the full dataset. Do you know if this functionality is supported in statsforecast?
  • v

    Vaibhav Gupta

    02/24/2025, 3:02 AM
    Hello Nixtla team, I have noticed a small error in the docs you have provided for the statsforecast library, may I know how to contribute to fixing it?
  • r

    Rodrigo Sodré

    03/09/2025, 9:22 PM
    Greetings everyone! I'm trying to predict the next steps of a time series using AutoARIMA. The data consists of 2 years of daily observations for 96 assets. This is how the dataframe looks after formatting it to Nixtla's input format (attached): 92928 rows × 3 columns. If I use `sf.forecast`, everything works just fine:
    pred = sf.predict(h=horizon, df=train_df)
    But for every new observation I have to append it to the dataframe and call forecast, which trains everything again. So I tried switching to `sf.fit + sf.predict`:
    sf.fit(df=train_df)
    # update train_df
    pred = sf.predict(h=horizon, X_df=train_df)
    but I'm getting the following error:
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    File <timed exec>:12
    
    File /opt/conda/lib/python3.11/site-packages/statsforecast/core.py:747, in _StatsForecast.predict(self, h, X_df, level)
        742     warnings.warn(
        743         "Prediction intervals are set but `level` was not provided. "
        744         "Predictions won't have intervals."
        745     )
        746 self._validate_exog(X_df)
    --> 747 X, level = self._parse_X_level(h=h, X=X_df, level=level)
        748 if self.n_jobs == 1:
        749     fcsts, cols = self.ga.predict(fm=self.fitted_, h=h, X=X, level=level)
    
    File /opt/conda/lib/python3.11/site-packages/statsforecast/core.py:692, in _StatsForecast._parse_X_level(self, h, X, level)
        690 expected_shape = (h * len(self.ga), self.ga.data.shape[1] + 1)
        691 if X.shape != expected_shape:
    --> 692     raise ValueError(
        693         f"Expected X to have shape {expected_shape}, but got {X.shape}"
        694     )
        695 processed = ufp.process_df(X, self.id_col, self.time_col, None)
        696 return GroupedArray(processed.data, processed.indptr), level
    
    ValueError: Expected X to have shape (96, 2), but got (92928, 3)
    "*Expected X to have shape (96, 2), but got (92928, 3)*". I don't understand why the shape isn't valid in this case, since it's the same dataframe I trained on before and it works with `forecast`. Am I using it incorrectly? Is there a proper way to call `fit/predict`? Thanks in advance.
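The shape check in the traceback can be worked through by hand. Per the quoted source line, `X_df` is expected to have `h * n_series` rows and one more column than the stored training array, which suggests it is meant to hold only the future rows (for exogenous features) rather than the training history. The sketch assumes `h = 1` and no exogenous columns, both inferred from the numbers in the error message. `expected_x_shape` is a hypothetical helper.

```python
def expected_x_shape(h, n_series, n_data_cols):
    """Mirror of the check quoted in the traceback:
    (h * len(self.ga), self.ga.data.shape[1] + 1)."""
    return (h * n_series, n_data_cols + 1)


n_series = 96
rows_per_series = 92928 // n_series  # daily observations per asset in train_df

# Assumption: the stored array holds only the target column (no exogenous features).
shape = expected_x_shape(h=1, n_series=n_series, n_data_cols=1)
print(rows_per_series, shape)
```

Passing the full 92928-row training frame therefore fails the check, since one row per series per forecast step is expected.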
  • s

    Simon

    03/10/2025, 7:31 PM
    Hi all, is it possible to find the `unique_id` to which a fitted MSTL model corresponds? I can't find any identifier in the following:
    sf.fitted_[0, 0].model_
    Thank you for any hints!
  • a

    Ankit Hemant Lade

    03/17/2025, 3:05 PM
    Hey @Marco, I am trying to extract fitted parameters such as alpha, beta, and gamma from the stats models. I am currently trying this with AutoTheta, but I can't access any of the parameters. Is there a way to do this?
  • b

    Bersu T

    03/18/2025, 2:56 PM
    Hi! Do I need to set up conformal prediction for ARIMA during training, or can I apply it only at prediction time?
  • s

    Sergio André López Pereo

    03/26/2025, 3:35 AM
    Hey! Good evening, everyone. I have a question about the implementation: do you have any document or article on the time complexity of the AutoTBATS model as a function of the seasons array?
  • g

    GR

    03/26/2025, 12:34 PM
    Hi, does StatsForecast support VAR models? I have multiple variables (columns) in the dataset that need time series analysis.
  • a

    Alex Berry

    03/31/2025, 5:30 PM
    What might cause prediction intervals to look this jagged? I am getting intervals similar to this when using HoltWinters and ARIMA models. The lower and upper bounds are the 2.5 and 97.5 percentiles, respectively.
  • s

    Santosh Srivatsa

    04/08/2025, 2:54 AM
    Hello everyone, I'm using AutoARIMA from Nixtla's StatsForecast to detect missed transmissions from multiple data sources, each identified by a unique `unique_id`. Here's a quick overview of my workflow:
    1. Data Preparation:
       ◦ I create two DataFrames (train and test), both with columns `unique_id`, `ds`, and `y`.
       ◦ The train and test DataFrames have the same end date but different start dates.
    2. Model Setup and Fitting:
       ◦ I initialize the `StatsForecast` object with the `AutoARIMA` model, setting parameters like `season_length` and `freq`.
       ◦ I call the `.fit()` method on the training DataFrame.
    3. Forecasting and In-Sample Predictions:
       ◦ I forecast using the `.forecast(h=1)` method on the test DataFrame.
       ◦ I also use `.forecast_fitted_values()` after fitting to retrieve in-sample predictions, and I flag anomalies based on the `level` prediction intervals by checking whether actuals fall outside the expected range.
    I'm doing this because I'm specifically trying to detect missed transmissions, meaning there may not be a value at a fixed point in the future, so a direct forecast of future values isn't always meaningful. Instead, I'm comparing the model's in-sample expectations against actuals. Additionally, when I try to introduce exogenous variables, I'm running into an invalid shape error when calling `.forecast()`. I suspect this might be because the test dataset doesn't have the same number of rows for each `unique_id`. Would love any guidance on whether this overall workflow makes sense or how I might improve it, especially around incorporating exogenous variables or detecting anomalies more robustly. Thanks in advance for your insights!
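The flagging step in this workflow (marking actuals that fall outside the in-sample interval) can be sketched with plain dictionaries. The column names follow the `{model}-lo-{level}` / `{model}-hi-{level}` pattern, which is assumed here rather than confirmed in this thread. `flag_outside_interval` is a hypothetical helper and the numbers are made up.

```python
def flag_outside_interval(rows, model="AutoARIMA", level=95):
    """Return the rows whose actual `y` lies outside [lo, hi] for the given level."""
    lo_col = f"{model}-lo-{level}"  # assumed column naming, adjust to your output
    hi_col = f"{model}-hi-{level}"
    return [r for r in rows if not (r[lo_col] <= r["y"] <= r[hi_col])]


# Made-up in-sample rows: one normal point, one dropout, one spike.
rows = [
    {"unique_id": "src_a", "ds": 1, "y": 100.0, "AutoARIMA-lo-95": 90.0, "AutoARIMA-hi-95": 110.0},
    {"unique_id": "src_a", "ds": 2, "y": 0.0, "AutoARIMA-lo-95": 88.0, "AutoARIMA-hi-95": 112.0},
    {"unique_id": "src_b", "ds": 1, "y": 130.0, "AutoARIMA-lo-95": 95.0, "AutoARIMA-hi-95": 120.0},
]

anomalies = flag_outside_interval(rows)
flagged_keys = [(r["unique_id"], r["ds"]) for r in anomalies]
print(flagged_keys)
```

For missed transmissions specifically, only the rows below the lower bound may matter, in which case the condition can be narrowed to `r["y"] < r[lo_col]`.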
  • s

    Sergio André López Pereo

    04/09/2025, 7:39 PM
    Hello Nixtla team, and thanks in advance for your time. I'm running into an issue with the AutoTBATS model: in some cases the forecasts "explode". The predictions seem to turn into something exponential for some reason, blowing up toward both large positive and large negative values. Do you have any hint about what might cause this?
  • m

    Mariana Menchero

    04/09/2025, 7:57 PM
    Hi @Sergio André López Pereo do you have a reproducible example you can share with us?
  • i

    Iching Quares

    04/10/2025, 2:49 PM
    Hello Nixtla team, I'm not sure if I'm missing anything, but is there a way to jointly fit an ARIMA+GARCH model, similar to how it's done in the rugarch R package?
    spec <- ugarchspec(variance.model = list(garchOrder = c(1, 1)), 
                         mean.model = list(armaOrder = c(final.order[1], final.order[3]), include.mean = TRUE), 
                         distribution.model = "std", 
                         fixed.pars = fixed_pars_df0)
  • a

    Ankit Hemant Lade

    04/11/2025, 2:38 AM
    In statsforecast cross-validation, is there a way to specify an explicit cutoff date?
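One workaround that does not depend on library support is to split the data at the chosen date by hand, fit on the rows up to the cutoff, and score the forecast against the held-out rows. This is a minimal sketch with made-up monthly data. `split_at_cutoff` is a hypothetical helper, and the column names follow the `unique_id`/`ds`/`y` layout used elsewhere in this channel.

```python
from datetime import date


def split_at_cutoff(rows, cutoff):
    """Split rows into train (ds <= cutoff) and test (ds > cutoff)."""
    train = [r for r in rows if r["ds"] <= cutoff]
    test = [r for r in rows if r["ds"] > cutoff]
    return train, test


# Made-up monthly series, January to June 2025.
rows = [{"unique_id": "a", "ds": date(2025, m, 1), "y": float(m)} for m in range(1, 7)]

train, test = split_at_cutoff(rows, cutoff=date(2025, 4, 1))
print(len(train), len(test))
```

The length of `test` per series then plays the role of the forecast horizon for that manual window.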
  • i

    IHAS

    04/11/2025, 3:36 PM
    I am using StatsForecast to process over 100 time series... Is there a way to enable a verbose mode to track which series is currently being processed, or at least estimate the remaining time for the entire process?
  • s

    Sai krishna Sirikonda

    05/02/2025, 9:58 AM
    Hi all, I am working on hierarchical forecasting and would like to know which four or five models tend to give comparatively better results.
  • s

    Sai krishna Sirikonda

    05/05/2025, 5:32 AM
    Hi all, can anyone help me determine when to perform the cross-validation step in hierarchical forecasting—before or after reconciliation?
  • f

    Filipa Encarnação Louzeiro

    05/06/2025, 12:01 PM
    Hi everyone, today I got the strangest error in code that used to run without problems. The code (in PySpark, Databricks) is:
    models = [AutoETS()] 
    
    fcst = StatsForecast(models=models,
                         freq='M',
                         n_jobs=1,
                         fallback_model = SeasonalNaive(season_length = 12))
    
    # FORECAST
    df_pred = fcst.forecast(df = df_train, 
                            h = 1,
                            fitted = True, 
                            level=[90])
    But then this error showed up:
    PythonException: 
      An exception was thrown from the Python worker. Please see the stack trace below.
    Traceback (most recent call last):
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue_spark/execution_engine.py", line 228, in _udf_pandas
        output_df = map_func(cursor, input_df)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/extensions/_builtins/processors.py", line 333, in run
        return self.transformer.transform(df)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/extensions/transformer/convert.py", line 346, in transform
        return self._wrapper.run(
               ^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/fugue/dataframe/function_wrapper.py", line 103, in run
        rt = self._func(**rargs)
             ^^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/distributed/fugue.py", line 166, in _forecast_noX_fitted
        model, result = self._forecast(
                        ^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/distributed/fugue.py", line 109, in _forecast
        result = model.forecast(
                 ^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/core.py", line 864, in forecast
        res_fcsts = self.ga.forecast(
                    ^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/core.py", line 199, in forecast
        res_i = fallback_model.forecast(
                ^^^^^^^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/models.py", line 3844, in forecast
        res = _add_fitted_pi(res=res, se=sigma, level=level)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-153d963c-fd2f-4eea-8c6e-a0d339c0421b/lib/python3.11/site-packages/statsforecast/models.py", line 62, in _add_fitted_pi
        lo = res["fitted"].reshape(-1, 1) - quantiles * se.reshape(-1, 1)
                                                        ^^^^^^^^^^
    AttributeError: 'int' object has no attribute 'reshape'
    File <command-993558187105440>, line 14
          4 fcst = StatsForecast(models=models,
          5                      freq='M',
          6                      n_jobs=1,
          7                      fallback_model = SeasonalNaive(season_length = 12))
          9 # fcst = StatsForecast(models=models,
         10 #                      freq='M',
         11 #                      n_jobs=1)
         12 
         13 # FORECAST
    ---> 14 df_pred = fcst.forecast(df = df_train, 
         15                         h = 1,
         16                         fitted = True, 
         17                         level=[90])\
         18     .withColumnRenamed('unique_id','CCLI_ID')\
         19     .withColumnRenamed('ds','date')
    
    File /databricks/spark/python/pyspark/sql/connect/client/core.py:2155, in SparkConnectClient._handle_rpc_error(self, rpc_error)
       2140                 raise Exception(
       2141                     "Python versions in the Spark Connect client and server are different. "
       2142                     "To execute user-defined functions, client and server should have the "
       (...)
       2151                     "https://docs.databricks.com/en/release-notes/serverless.html."
       2152                 )
       2153             # END-EDGE
    -> 2155             raise convert_exception(
       2156                 info,
       2157                 status.message,
       2158                 self._fetch_enriched_error(info),
       2159                 self._display_server_stack_trace(),
       2160             ) from None
       2162     raise SparkConnectGrpcException(status.message) from None
       2163 else:
    It's really confusing to me. The Databricks assistant suggested two different things in different runs. The first was: "The error occurs because the fallback_model in the StatsForecast is returning an integer instead of an array, which causes the reshape method to fail. To fix this, ensure that the fallback_model returns an array-like object that can be reshaped." Then, in a second run, it suggested: "The error occurs because the se variable is an integer, and the reshape method is being called on it, which is not valid. To fix this, ensure that se is a numpy array before calling reshape." Can anyone help? I'm in a bit of a panic over this 😬 Many, many thanks!! Meanwhile, I removed the line `fitted = True` and it worked well. But what if I really need the fitted values?
  • s

    Sai krishna Sirikonda

    05/08/2025, 5:20 AM
    Hi all, I have a query regarding the storage of hierarchical time series forecasting models. Specifically, is it possible to save these models efficiently? I attempted to save the model as a bundle rather than as an individual model object using the following approach:
    model_bundle = {
        'fcst': fcst_full,                # Trained forecasting model(s)
        'hrec': hrec,                     # Reconciliation logic
        'S_df': S_df,                     # Hierarchy matrix
        'tags': tags,                     # Hierarchy metadata
        'Y_df': Y_df,                     # Full dataset
        'Y_hat_df': Y_hat_full_df,        # Base forecasts
        'Y_fitted_df': Y_fitted_full_df,  # Fitted values
        'Y_rec_df': Y_rec_full_df,        # Reconciled forecasts
        'evaluation': evaluation          # Metrics
    }
    joblib.dump(model_bundle, 'hierarchical_forecast_bundle.joblib')
    Additionally, I noticed that the `neuralforecast` library offers dedicated methods for saving and loading models. Does the `hierarchicalforecast` library provide similar functionality for storing hierarchical forecasting models? I would appreciate any insights on best practices for saving and restoring hierarchical forecasting models.
    # save
    nf.save(path='./checkpoints/test_run/', model_index=None, overwrite=True, save_dataset=True)
    # load
    nf2 = NeuralForecast.load(path='./checkpoints/test_run/')
    Y_hat_df2 = nf2.predict()
    Y_hat_df2.head()
    Thank you in advance for your support!