# neural-forecast
Hey guys, I joined the community recently and look forward to learning from the discussions. I am starting a time series project with NeuralForecast where I am trying to train a model that will be used for one-shot learning on other time series (transfer learning). I read the article (https://nixtla.github.io/neuralforecast/examples/transfer_learning.html) and now I have two questions:

1. I also read this article (https://nixtla.github.io/neuralforecast/examples/getting_started_complete.html), which says I need to provide data in long format, with a "unique_id" associated with each time series. The long format is fine, and there would be unique ids to distinguish between the different series, but when predicting on new data there would be new unique ids. Would that affect the model? In other words, does the model learn the unique_ids as well?
2. If there is a big dataset with thousands of time series, there would be too many samples for the model. Is there a way to make the model use only some fixed number 'x' of samples from each series? I came across the "max_samples_per_ts" hyperparameter, which does exactly this in the Darts library (https://unit8co.github.io/darts/examples/14-transfer-learning.html).
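For readers unfamiliar with the long format mentioned above, here is a minimal sketch of what such a DataFrame looks like, assuming the default NeuralForecast column names (`unique_id`, `ds`, `y`) described in the getting-started guide; the series names and values are made up:

```python
import pandas as pd

# Long format: one row per (series, timestamp) pair. Multiple series
# live in the same frame and are distinguished only by unique_id.
df = pd.DataFrame({
    "unique_id": ["series_A"] * 3 + ["series_B"] * 3,
    "ds": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"] * 2),
    "y": [10.0, 12.0, 11.0, 200.0, 210.0, 205.0],
})

print(df["unique_id"].nunique())  # 2 distinct series in one frame
```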
Hi @Raghuvansh! Thanks for using our library! Regarding your questions:

1. The models do not learn unique_ids, nor use them to forecast. Models can learn time-series-specific dynamics only if you add static variables (such as a one-hot encoding of the unique_id).
2. During training, each batch is created by randomly sampling a subset of time series, and then windows from that subset. We don't have a parameter to limit the total number of windows from each time series. However, we don't usually train models with all possible windows from each time series; we recommend training models with a fixed number of training steps (`max_steps`).
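The batch construction described above can be sketched roughly as follows. This is an illustrative toy version of the idea (first sample a subset of series, then sample windows from that subset), not NeuralForecast's actual implementation; all names and sizes (`n_series`, `series_per_batch`, `windows_per_batch`) are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

n_series = 1000          # total series in the dataset
series_per_batch = 32    # subset of series sampled for one batch
windows_per_batch = 256  # windows drawn from that subset

# Step 1: sample a subset of series (without replacement).
series_ids = rng.choice(n_series, size=series_per_batch, replace=False)

# Step 2: sample windows from the chosen subset. Each window is
# identified by (owning series, start index within that series).
owners = rng.choice(series_ids, size=windows_per_batch)
starts = rng.integers(0, 500, size=windows_per_batch)
batch = list(zip(owners, starts))

print(len(batch))  # batch size is fixed, regardless of dataset size
```

The point is that the batch size stays constant no matter how many windows each series could produce, which is why a fixed number of training steps bounds total compute.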
Regarding 2: My full dataset comprises multiple small datasets, and each small dataset has multiple time series. So rather than combining the small datasets, should I train the model separately on each small dataset, one by one, using max_steps so that the model does not learn too much from any one dataset? Does this approach make sense to you?
This is a very interesting question. I believe it would be better to pool all the data together, so that the model does not end up specializing in forecasting the latest datasets. It is usually better to have variety in each batch during training than to group potentially similar time series in each batch. In competitions such as the M4, high-performing models such as the ESRNN and NBEATS were simultaneously trained on many time series from different domains and sizes.
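One practical detail when pooling several datasets into a single long-format frame is keeping the unique_ids globally unique. A simple sketch, assuming the default column names and with made-up dataset names, is to prefix each id with its source dataset before concatenating:

```python
import pandas as pd

def pool(datasets: dict) -> pd.DataFrame:
    """Concatenate long-format frames, prefixing unique_ids with the
    dataset name so ids from different sources cannot collide."""
    frames = []
    for name, df in datasets.items():
        df = df.copy()
        df["unique_id"] = name + "/" + df["unique_id"].astype(str)
        frames.append(df)
    return pd.concat(frames, ignore_index=True)

# Two toy datasets that happen to reuse the same series id "a".
d1 = pd.DataFrame({"unique_id": ["a"], "ds": ["2023-01-01"], "y": [1.0]})
d2 = pd.DataFrame({"unique_id": ["a"], "ds": ["2023-01-01"], "y": [2.0]})
pooled = pool({"retail": d1, "energy": d2})

print(sorted(pooled["unique_id"].unique()))  # ['energy/a', 'retail/a']
```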
In the near future we will also add the possibility of assigning weights to each unique_id and timestamp for gradient computation.
I was hoping to do that as well, but some datasets have significantly longer time series, and many more of them, than others. That means far more samples from those datasets, which would make the overall pooled dataset imbalanced. I could potentially randomly sample a fixed number of time series from the larger datasets and then combine them with the smaller datasets to form one large dataset. Would this be a reasonable approach, or should I think of something else? Your thoughts are always appreciated.
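The balancing idea described above can be sketched with a small helper that caps each source dataset at a fixed number of series by sampling unique_ids. This is an illustrative sketch, not a library feature; the function name `cap_series` and the cap value are made up:

```python
import pandas as pd

def cap_series(df: pd.DataFrame, max_series: int, seed: int = 0) -> pd.DataFrame:
    """Keep at most max_series randomly chosen series from a
    long-format frame; smaller datasets pass through unchanged."""
    ids = df["unique_id"].drop_duplicates()
    if len(ids) > max_series:
        ids = ids.sample(max_series, random_state=seed)
    return df[df["unique_id"].isin(ids)]

# Toy "large" dataset: 100 series with 2 observations each.
big = pd.DataFrame({
    "unique_id": [f"s{i}" for i in range(100) for _ in range(2)],
    "y": range(200),
})
capped = cap_series(big, max_series=10)

print(capped["unique_id"].nunique())  # 10
```

Sampling whole series (rather than individual rows) keeps each retained series intact, which matters for window-based models.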