# neural-forecast
Hi all, I am relatively new to time series forecasting and am wondering what the proper way is to treat multiple independent time series. Consider k data series x_{t_0}^{t_i}[k] from time t_0 to t_i (e.g. stock prices for different stocks, or data series for different agents), for which we want to predict the next T steps x_{t_i+1}^{t_i+T}[k]. We cannot simply append these series as we would with text data, since we want to preserve the time information across series. I think a naive way is to use unique_id as the identifier for the different series. However, unique_id is a dataset feature, and I am not sure how the different models treat it. Is it always the suggested way to handle multiple univariate (independent or dependent) series?
Welcome @Yang Guo, hope you find NeuralForecast useful. NeuralForecast uses the cross-learning technique, where the model outputs independent predictions for each time series while the model's weights are shared and trained over the panel data. The 'unique_id' feature is not used by any model directly, but by the dataloaders that feed the data 'independently' to the NeuralForecast models. Currently most NeuralForecast models operate as univariate models in the sense that they receive univariate inputs (lags/autoregressive features) and output univariate forecasts, but the model is shared across the entire panel dataset. The training objective is: P(Y_{i,t+1:t+h} | Y_{i,t-L:t}). Some more complex models like the ESRNN use the 'unique_id' to query series-specific embeddings.
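To make the role of 'unique_id' concrete, here is a minimal sketch of how long-format panel rows could be cut into training windows that never cross a series boundary. It is an illustration only: `make_windows`, `input_size`, and `horizon` are hypothetical names chosen for this example, and this is not NeuralForecast's actual dataloader code.

```python
from collections import defaultdict


def make_windows(rows, input_size, horizon):
    """Cut long-format rows (unique_id, ds, y) into (id, inputs, targets) windows.

    Hypothetical helper for illustration; not the library's dataloader.
    """
    # Group observations by series, ordered by timestamp within each series.
    series = defaultdict(list)
    for unique_id, ds, y in sorted(rows, key=lambda r: (r[0], r[1])):
        series[unique_id].append(y)

    windows = []
    for unique_id, values in series.items():
        # Each window is taken from a single series, so histories never mix.
        last_start = len(values) - input_size - horizon
        for start in range(last_start + 1):
            inputs = values[start:start + input_size]
            targets = values[start + input_size:start + input_size + horizon]
            windows.append((unique_id, inputs, targets))
    return windows


# Two series of different lengths in one long-format table.
rows = (
    [("A", t, float(t)) for t in range(6)]
    + [("B", t, 100.0 + t) for t in range(4)]
)
windows = make_windows(rows, input_size=2, horizon=1)
for unique_id, inputs, targets in windows:
    print(unique_id, inputs, "->", targets)
```

In this sketch the identifier only decides which rows belong together; the windows themselves contain nothing but the series' own values, and series of different lengths simply contribute different numbers of windows to the shared training set.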
Thank you so much for the prompt response! Just to make sure I understand properly: the training objective is essentially to learn the sequential behavior conditioned on i, instead of viewing each series as an independent sample (e.g. the sequences do not need to have the same length). Thus, it will only use information from the series itself, rather than from the other series at previous time stamps, for predictions.
That is correct
Models operate as univariate autoregressive models, but the model weights are shared across the series.
Thank you so much for the clarification!
After reading the cross-learning paper, I am even more confused. My current understanding is that, given multiple series {x}, {y}, {z}, a traditional TS model will build 3 independent models, whereas CL will build a single model for the sequence {(x,y,z)}. However, based on your description, NeuralForecast is building a single model for ({x},{y},{z}).join(). Could you please clarify? Also, could you please point me to the specific place where 'unique_id' is used? I am currently not sure how the code is able to utilize the correlation of multiple series at the same time stamp.
Thank you!
Hi @Yang Guo. It is more similar to the second option (if I understood it correctly). I will try to explain it again:
• There is only one model (and therefore one set of learned parameters) for all the time series in your dataset.
• The models are UNIVARIATE. They only learn to forecast each time series from its own historic values, for example {x_{t-L:t}} -> {x_{t+1:t+H}}, {y_{t-L:t}} -> {y_{t+1:t+H}}, and so on. As a consequence, forecasts are independent between the time series of your dataset.
• Think of models as learning functions from historic windows of a single time series (and exogenous covariates) to future values, {x_{t-L:t}} -> {x_{t+1:t+H}}.
You can find examples in most of the tutorials in our library. This is our particular tutorial on inputs: https://nixtla.github.io/neuralforecast/examples/data_format.html.
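The points above (one shared set of parameters, univariate inputs, independent forecasts) can be sketched with a toy stand-in. The example below swaps the neural network for a single AR(1) coefficient fitted by least squares on pairs pooled from every series; `fit_shared_ar1` and `forecast` are hypothetical names for this illustration and are not part of NeuralForecast.

```python
def fit_shared_ar1(panel):
    """Fit one AR(1) coefficient by least squares on pairs pooled from all series."""
    num = 0.0
    den = 0.0
    for values in panel.values():
        # Pairs (x_{t-1}, x_t) are formed within a series, never across series.
        for prev, curr in zip(values[:-1], values[1:]):
            num += prev * curr
            den += prev * prev
    return num / den


def forecast(values, phi, horizon):
    """Roll the shared coefficient forward from one series' own last value."""
    out = []
    last = values[-1]
    for _ in range(horizon):
        last = phi * last
        out.append(last)
    return out


# Toy panel: three series of different lengths and scales.
panel = {
    "x": [1.0, 2.0, 4.0, 8.0],
    "y": [3.0, 6.0, 12.0],
    "z": [5.0, 10.0, 20.0, 40.0, 80.0],
}
phi = fit_shared_ar1(panel)  # one parameter learned from all series
forecasts = {uid: forecast(values, phi, horizon=2) for uid, values in panel.items()}
print(phi, forecasts)
```

Here all three series contribute to the single learned parameter `phi`, but once it is fixed, the forecast for "x" depends only on the history of "x". Other series influence a forecast only indirectly, through the shared parameters learned during training.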
Thank you! This tutorial is super helpful. I think I was partially confused by the original cross-learning paper; I found the notation in this survey, https://arxiv.org/abs/2004.10240, to be much clearer.