# general
f
When using AutoARIMA or other methods in StatsForecast, we get confidence intervals (CIs) for each individual forecast. Does anyone know how to combine these CIs into a single CI for the average of all forecasted series? I am sure we cannot simply average the CIs, because computing a confidence interval is not a linear operation, but I have never seen how a bunch of CIs can be combined into one CI for the average series. This is more of a math question than a StatsForecast question, but I thought someone here might have thought about it before.
f
Hey @Farzad E! The underlying assumption behind some statistical models (such as ARIMA) is that the errors are normally distributed, so the forecast at each timestamp in the horizon is a normal random variable with its own mean and variance, and the CI is computed from those two parameters. You can recover them yourself: the mean is the point forecast, and the standard deviation is half the width of a CI computed with a level of 68.27. After that, you can compute the parameters of the average series, using the fact that a sum of independent normally distributed random variables is also normally distributed.
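In case it helps, here is a minimal sketch of that first step: recovering the standard deviation from the bounds of a symmetric normal interval. The function name `sigma_from_interval` and the example numbers are made up for illustration, and it assumes the interval is of the form mean ± z * sigma, which holds for a normal forecast distribution.

```python
from statistics import NormalDist


def sigma_from_interval(lo, hi, level=68.27):
    """Recover sigma from a symmetric normal interval (hypothetical helper).

    Assumes the interval is mean +/- z * sigma, where z is the standard
    normal quantile that matches the requested coverage level (in percent).
    """
    # Coverage `level` leaves (100 - level) / 2 percent in each tail, so the
    # upper bound sits at the 0.5 + level / 200 quantile.
    z = NormalDist().inv_cdf(0.5 + level / 200.0)
    return (hi - lo) / (2.0 * z)


# Illustrative numbers only: a point forecast of 100 with bounds 90 and 110.
sigma = sigma_from_interval(90.0, 110.0, level=68.27)
print(sigma)  # very close to 10, since 68.27% is about one sigma each side
```

With a level of 68.27, z is almost exactly 1, so the standard deviation is simply half the interval width. The same helper works for any other level, such as 95.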
f
@fede (nixtla) (they/them) 68.27 because that's one standard deviation away from the mean? In that case the lower and upper bounds of my CI are just the point forecast minus and plus one STDev, so with, say, 100 time series I end up with 100 means and 100 STDevs at each point in the horizon. Assuming the forecasts are normally distributed, I can take the average of the means at each point in time, but can I then take the average of the STDevs to build the global CI? I think I'm getting close but I'm not totally there yet. I need to better understand how to calculate the STDev for the final global CI.
f
hey @Farzad E! Yes, that's the intuition behind 68.27. Regarding the distribution of the average: if you have 100 series, each with mean m_i and standard deviation sigma_i, then (treating the series as independent) the average is normally distributed with mean (m_1 + m_2 + ... + m_100)/100 and standard deviation (sigma_1^2 + sigma_2^2 + ... + sigma_100^2)^(1/2) / 100 (the root of the sum of the variances, divided by 100).
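As a rough sketch of the formula above for a single point in the horizon: the function name `average_forecast_interval` and the sample values are hypothetical, and the calculation assumes the individual forecasts are independent and normally distributed. If the series are correlated, the covariance terms would also have to be included and this would not give the right width.

```python
import math
from statistics import NormalDist


def average_forecast_interval(means, sigmas, level=95.0):
    """Interval for the average of independent normal forecasts.

    `means` and `sigmas` hold one value per series at one horizon step.
    Assumption: the forecasts are independent, so their variances add.
    """
    n = len(means)
    mean_avg = sum(means) / n
    # Root of the sum of the variances, divided by the number of series.
    sigma_avg = math.sqrt(sum(s * s for s in sigmas)) / n
    # Standard normal quantile for the requested coverage (in percent).
    z = NormalDist().inv_cdf(0.5 + level / 200.0)
    return mean_avg, sigma_avg, mean_avg - z * sigma_avg, mean_avg + z * sigma_avg


# Illustrative numbers only: two series at one horizon step.
mean_avg, sigma_avg, lo, hi = average_forecast_interval(
    [100.0, 200.0], [3.0, 4.0], level=95.0
)
print(mean_avg, sigma_avg)  # 150.0 and sqrt(9 + 16) / 2 = 2.5
print(lo, hi)
```

You would repeat this for every timestamp in the horizon. Note that the standard deviation of the average is not the average of the standard deviations, which would be 3.5 in this example.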
🙌 1
f
@fede (nixtla) (they/them) Thanks a lot!
🙌 1