# neural-forecast

Cyril de Catheu

08/09/2023, 10:44 PM
Hey, I’m trying to understand how `BaseWindows`, used by N-Beats and NHITS, generates training examples. In particular, I’m trying to understand what happens in `training_step`, `_create_windows`, and `_parse_windows`. I’m a bit confused by the code. Is there a paper or some doc that describes the idea behind it?

Cristian (Nixtla)

08/09/2023, 10:53 PM
Hi @Cyril de Catheu! It's based on the idea of the NBEATS paper (https://arxiv.org/abs/1905.10437). These models operate on windows of size `input_size` + `horizon`. The model takes the values in the input region (of size `input_size`) to predict the next `horizon` values. The training loss measures forecasting accuracy on those predicted values. During each training step, we first sample `batch_size` different time series from the dataset. Next, we sample `windows_batch_size` different windows (starting at different random timestamps) from this subset of time series.
The `_create_windows` function is in charge of sampling the windows from the batch (sampled by the loader).
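The two-stage sampling described above can be sketched in plain Python. This is only an illustration under assumptions: `sample_windows` is a hypothetical helper, the toy series are made up, and it uses lists rather than tensors. It is not the library's `_create_windows`.

```python
import random

def sample_windows(series_batch, input_size, horizon, windows_batch_size, rng):
    """Sample (input, target) windows from a batch of series.

    Hypothetical helper for illustration; not neuralforecast's `_create_windows`.
    """
    window_size = input_size + horizon
    # Enumerate every valid (series index, start timestamp) pair in the batch.
    candidates = [
        (i, start)
        for i, series in enumerate(series_batch)
        for start in range(len(series) - window_size + 1)
    ]
    # Draw windows at random start timestamps (without replacement in this sketch).
    chosen = rng.sample(candidates, min(windows_batch_size, len(candidates)))
    windows = []
    for i, start in chosen:
        window = series_batch[i][start:start + window_size]
        # The first `input_size` values are the model input, the rest the target.
        windows.append((window[:input_size], window[input_size:]))
    return windows

rng = random.Random(0)
dataset = [list(range(10)), list(range(100, 112)), list(range(200, 208))]
batch = rng.sample(dataset, 2)  # stage 1: batch_size = 2 series
windows = sample_windows(  # stage 2: windows_batch_size = 3 windows
    batch, input_size=4, horizon=2, windows_batch_size=3, rng=rng
)
```

Each element of `windows` pairs an input of length `input_size` with the `horizon` values that follow it, and the loss would be computed on the second part only.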

Phil

08/09/2023, 11:17 PM
@Cristian (Nixtla) I have a question on top of this... Do you also have the feature where, if an end date is chosen at random in a batch of time series and, given the `input_size`, the start date would fall outside the span of the time series, the window is padded with zeros on the left? I think this strategy was used in the DeepAR paper to simulate cold-start problems.
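A minimal sketch of the left-padding idea in this question, assuming an exclusive `end` index and a zero pad value. `window_ending_at` is a hypothetical helper for illustration, not the implementation from the pull request mentioned in this thread.

```python
def window_ending_at(series, end, input_size, pad_value=0.0):
    """Return the `input_size` values before index `end`, left-padded if needed.

    Hypothetical helper; assumes 0 <= end <= len(series).
    """
    start = end - input_size
    # Keep only the part of the window that lies inside the series.
    observed = series[max(start, 0):end]
    # Fill the positions that fall before the first observation.
    n_missing = input_size - len(observed)
    return [pad_value] * n_missing + observed

series = [1.0, 2.0, 3.0, 4.0, 5.0]
cold_start = window_ending_at(series, end=2, input_size=4)   # start would be -2
full_window = window_ending_at(series, end=5, input_size=4)  # fits entirely
```

Here `cold_start` is `[0.0, 0.0, 1.0, 2.0]` and `full_window` is `[2.0, 3.0, 4.0, 5.0]`, so the model sees a short history the way it would for a newly started series.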

Cristian (Nixtla)

08/09/2023, 11:18 PM
We posted a pull request to add padding yesterday!

Phil

08/09/2023, 11:19 PM
You are all on top of things! Awesome
🔥 1

Cristian (Nixtla)

08/09/2023, 11:19 PM
it will be merged to main soon

Cyril de Catheu

08/10/2023, 7:54 AM
Hey @Cristian (Nixtla) thanks a lot! I’m confused by the shape of temporal in this:
def _create_windows(self, batch, step, w_idxs=None):
    # Parse common data
    window_size = self.input_size + self.h
    temporal_cols = batch["temporal_cols"]
    temporal = batch["temporal"]

    if step == "train":
        if self.val_size + self.test_size > 0:
            cutoff = -self.val_size - self.test_size
            temporal = temporal[:, :, :cutoff]
I thought `temporal` would be of order 2, but it seems it’s of order 3 in `temporal = temporal[:, :, :cutoff]`. I had a look at `TimeSeriesDataset` but it did not help me understand. Edit: is it of shape `batchSize, numTemporalColumns, maxSeriesSize`?

Antoine SCHWARTZ -CROIX-

08/10/2023, 9:21 AM
Hello Cyril, yes, if I remember correctly that's the shape. If you have any doubt, you can easily find out how it works with the help of this notebook (last cell).
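Assuming the shape really is (batch size, number of temporal columns, maximum series length), as tentatively confirmed above, the train-time cutoff only shortens the last (time) axis. The sketch below mimics the tensor slice `temporal[:, :, :cutoff]` with nested lists so it runs without dependencies; all sizes are made up for illustration.

```python
# Hypothetical sizes, for illustration only.
batch_size, n_temporal_cols, max_series_len = 2, 3, 10
val_size, test_size = 2, 1

# temporal[b][c][t]: series b, temporal column c, time step t.
temporal = [
    [[float(t) for t in range(max_series_len)] for _ in range(n_temporal_cols)]
    for _ in range(batch_size)
]

def shape(nested):
    """Shape of a rectangular, non-empty nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

full_shape = shape(temporal)

if val_size + test_size > 0:
    cutoff = -val_size - test_size
    # List equivalent of the tensor slice temporal[:, :, :cutoff]:
    # drop the last val_size + test_size time steps of every column.
    temporal = [[col[:cutoff] for col in series] for series in temporal]

train_shape = shape(temporal)
```

With these sizes, `full_shape` is `(2, 3, 10)` and `train_shape` is `(2, 3, 7)`: the batch and column axes are untouched, and the validation and test steps are held out from the end of each series.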

Cyril de Catheu

08/10/2023, 10:04 AM
Ok thanks @Antoine SCHWARTZ -CROIX-! will do

Cristian (Nixtla)

08/15/2023, 3:13 PM
@Phil we just pushed the zero padding to main.

Phil

08/15/2023, 3:14 PM
Awesome! Thank you!
This is the latest push, not the latest release, correct?

Cristian (Nixtla)

08/15/2023, 3:14 PM
Yes, the push. It is not on pip yet.
👍 1