Fabian Müller
04/11/2022, 9:24 PM
TimeSeriesDataset and WindowsDataset. While the first one stores an entire time series, the latter stores sliding windows created from the time series, correct?
The second approach feels natural when working with nbeats-like models (as well as other neural forecasting models). However, the implementation you opted for confuses me a bit:
Previously, when creating my own generators, I have chosen the following recipe for feeding data during training to the model:
1. Sample n time series from the dataset
2. For each series, sample a split time point
3. Create one window for each series from the sampled splits (resulting in n windows) and feed them to the model as a mini-batch (n = batch_size)
This is also the approach described in Oreshkina et al. 2020 (where it is combined with stratified sampling).
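A minimal sketch of the three steps above in plain Python, assuming each series is a list of numbers and a window is input_size past values followed by horizon future values. The names sample_batch, input_size and horizon are hypothetical and not part of the package, series are drawn with replacement as a simplifying assumption, and the stratified variant from the paper is not shown.

```python
import random

def sample_batch(series_list, batch_size, input_size, horizon, rng=random):
    """Draw one (inputs, targets) window from each of batch_size sampled series."""
    window_len = input_size + horizon
    # Only series long enough to hold a full window are eligible.
    eligible = [s for s in series_list if len(s) >= window_len]
    # Step 1: sample n = batch_size series (with replacement, an assumption).
    chosen = rng.choices(eligible, k=batch_size)
    batch = []
    for s in chosen:
        # Step 2: sample a split point such that the whole window fits.
        split = rng.randint(input_size, len(s) - horizon)
        # Step 3: one window per series, inputs before the split, targets after.
        batch.append((s[split - input_size:split], s[split:split + horizon]))
    return batch

# Toy data: three series of different lengths.
series = [list(range(50)), list(range(200)), list(range(30))]
batch = sample_batch(series, batch_size=4, input_size=10, horizon=5)
print(len(batch))  # number of windows in the mini-batch
```

With this scheme the mini-batch always holds exactly batch_size windows, and each draw of a series contributes exactly one window regardless of the series length.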
The implementation you have chosen for the WindowsDataset (at least as far as I understand it) is different in that all possible windows are generated for a series and returned (resulting in n != batch_size).
I am wondering:
1. Why did you choose this implementation style, what are the advantages?
2. Is it possible to implement the Oreshkina et al. 2020 style sampling (especially the stratified version) using your package?
Best,
Fabian
Oreshkina et al. 2020: https://arxiv.org/pdf/2009.11961.pdf
Cristian (Nixtla)
04/11/2022, 10:00 PM
n_windows equal to the batch size you want. The WindowsDataset will first sample batch_size series, and then select n_windows windows from all the constructed windows of the batch_size series.
Kin Gtz. Olivares
04/11/2022, 10:07 PM
Cristian (Nixtla)
04/11/2022, 10:07 PM
n_windows=batch_size.
Fabian Müller
04/12/2022, 6:54 AM
n_windows=batch_size tip. Will check it out. I also came across the eq_batch_size argument in the FastTimeSeriesLoader that I am currently using.
But just to be sure, while both methods will result in the number of windows being equal to the batch_size, it is not guaranteed that it will be exactly one window per series. Correct?
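To make the question concrete, here is a rough sketch of the two-stage sampling as it is described in this thread (sample series first, then pick n_windows from all their windows). It is written in plain Python and is not the package's actual code: two_stage_sample and window_len are hypothetical names, and uniform sampling without replacement is an assumption.

```python
import random
from collections import Counter

def two_stage_sample(series_list, batch_size, n_windows, window_len, rng=random):
    """Sample series first, then draw n_windows from the pool of all their windows."""
    # Stage 1: pick batch_size distinct series indices.
    series_ids = rng.sample(range(len(series_list)), k=batch_size)
    # Build every sliding window (step size 1) of each sampled series.
    pool = []
    for i in series_ids:
        s = series_list[i]
        for start in range(len(s) - window_len + 1):
            pool.append((i, s[start:start + window_len]))
    # Stage 2: draw n_windows uniformly from the pooled windows (an assumption).
    return rng.sample(pool, k=n_windows)

# One short and one long series: the long one fills most of the pool.
series = [list(range(20)), list(range(200))]
picked = two_stage_sample(series, batch_size=2, n_windows=2, window_len=15)
print(Counter(i for i, _ in picked))  # windows drawn per series index
```

Under these assumptions the batch holds n_windows windows, but nothing forces them to come from different series: here the short series has 6 windows in the pool and the long one has 186, so both picks will usually come from the long series.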
And what do you think about the stratified sampling as mentioned in the paper? From what I understand, it might be especially relevant for your approach, since you are sampling from all windows and long series will produce more windows and will therefore be overrepresented in training?
Kin Gtz. Olivares
04/12/2022, 1:05 PM