# general
k
In the MSTL doc, all the percentile confidence intervals are the same as the expected value. I used it with my own data as well; it's all the same. Is there a bug?
👍 1
m
Hi @Ken Lee, thanks for bringing this issue to our attention. Can you help us by opening an issue on statsforecast? I've used MSTL recently and got proper prediction intervals, so I'm curious about your use case that is showing the same problem as the tutorial.
👍 1
k
But you can see that in this document all the predictions are the same values, right?
Copy code
MSTL	MSTL-lo-95	MSTL-lo-80	MSTL-hi-80	MSTL-hi-95
m
yes, no question about that. I'll check that example ASAP. But it's interesting that you are experiencing the same issue
k
the graphs as well.
so... from your experience, it should work and the doc is wrong, is what I'm gathering.
m
With the AirPassengers dataset, for example, I'm getting the correct intervals. Here's a reproducible example:
Copy code
import os

from statsforecast import StatsForecast
from statsforecast.models import MSTL
from statsforecast.utils import AirPassengersDF as ap

os.environ['NIXTLA_ID_AS_COL'] = '1'

# monthly data with a yearly seasonality
sf = StatsForecast(
    models=[MSTL(season_length=[12])],
    freq='MS',
)

# 12-month forecast with 80% and 95% prediction intervals
fc = sf.forecast(df=ap, h=12, level=[80, 95])

StatsForecast.plot(ap, fc, level=[80, 95])
yeah, as I said, I'm not sure what is going on with that doc, but the intervals shouldn't look like that
k
I confirmed your example does work with the right intervals...
m
there's indeed an error in the tutorial: the season_length is set to an absurd number (8766). With the correct value (24 * 7), the prediction intervals look OK. It should be in main by Monday 🙂
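As an aside on those numbers (my own reading, not stated in the thread): 8766 is the number of hours in an average Julian year (365.25 × 24), so the tutorial likely attempted a yearly seasonality on hourly data, while the corrected 24 × 7 is the weekly one:

```python
# 8766 looks like the hours in an average (Julian) year -- plausibly an
# attempted yearly seasonality on hourly data
hours_per_year = 365.25 * 24   # 8766.0

# the corrected value from the thread: one week of hourly data
weekly = 24 * 7                # 168
```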
k
I concluded the same: somehow, the longer your seasonal period (relative to the training data), the smaller the confidence intervals. This is counterintuitive, though, given what we understand about conformal predictions; maybe it's indicative of a deeper bug.
👍 1
Copy code
import os

import numpy as np
import pandas as pd  # needed for the plot at the end

from statsforecast import StatsForecast
from statsforecast.models import MSTL
from statsforecast.utils import AirPassengersDF as ap

os.environ['NIXTLA_ID_AS_COL'] = '1'

evaluation = []
grid = np.arange(10, ap.shape[0], 10)

# refit with increasingly long season lengths and record the
# width of the upper 95% interval above the point forecast
for g in grid:
    sf = StatsForecast(
        models=[MSTL(season_length=[g])],
        freq='MS',
    )
    fc = sf.forecast(df=ap, h=12, level=[80, 95], id_col="unique_id")
    evaluation.append((g, np.mean(fc["MSTL-hi-95"] - fc["MSTL"])))

pd.DataFrame(evaluation, columns=["season_length", "distance"]).plot.scatter(
    x="season_length", y="distance",
    title="season length vs. confidence interval distance",
)
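To illustrate the mechanism being hypothesized here, a toy sketch of my own (using a naive per-phase seasonal mean, not MSTL's actual STL fit): when the season length approaches the sample size, most seasonal "phases" contain a single observation, so the seasonal component absorbs almost everything and the residuals, which drive the interval width, collapse.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120
y = rng.normal(size=n)  # pure noise: no real seasonality at all

def residual_std(y, period):
    """Residual spread after subtracting a naive per-phase seasonal mean."""
    phases = np.arange(len(y)) % period
    seasonal = np.array([y[phases == p].mean() for p in phases])
    return (y - seasonal).std()

# a season length near n leaves ~1 observation per phase, so the
# "seasonal" component overfits and the residuals nearly vanish
short = residual_std(y, 12)    # plausible period
long_ = residual_std(y, 110)   # period close to the sample size
```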
m
Tbh, I had never considered this issue. I would argue that the seasonal length shouldn't vary that much, since at the end of the day it's tied to whatever phenomenon the series is representing. And it's always worth checking out the code of a model, although maybe some more examples are needed to point out where the error could be, if there is any.
k
I think it's not hard to stumble into this problem. Assume you know there is a 24-hour seasonality, then a weekly seasonality, then an end-of-month seasonality (assume we're talking about some ride-share business; it could be anything, really). Then you would have the seasonality represented in hours = [24, 24 * 7, 24 * 7 * 30]. This is not too wild, right? Now... you train a model, but for some time series you only have 30 days' worth of data... then you hit this diminishing-CI bug, where your 24 * 7 * 30 season completely makes your CI go away. The user has no way to know why this is not working; the model worked on time series with a year's worth of data, but all of a sudden the CI shrinks for a shorter time frame... 🧩
👍 1
v
I think a plain-vanilla implementation of conformal prediction might not be optimal for seasonality; perhaps Nixtla can consider a more suitable implementation using some ideas from here, for example: https://arxiv.org/abs/2406.16766
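For reference, a minimal sketch of the "plain vanilla" split-conformal interval being discussed (my own illustration, not statsforecast's code): the width is just a quantile of held-out absolute residuals, so anything that artificially shrinks those residuals, such as an overfit seasonal component, directly shrinks the interval.

```python
import numpy as np

def conformal_interval(cal_residuals, point_forecast, alpha=0.05):
    """Symmetric split-conformal interval: width is the (1 - alpha)
    quantile of absolute residuals on a held-out calibration set."""
    q = np.quantile(np.abs(cal_residuals), 1 - alpha)
    return point_forecast - q, point_forecast + q

lo, hi = conformal_interval(np.array([-1.0, 0.5, 2.0, -0.5]), 10.0, alpha=0.25)
```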