uses one set of normalization statistics per sequence (essentially InstanceNorm for sequence data, without learnable parameters)? I would have expected the offset and scale to be calculated over the entire training dataset and kept constant for every sequence
@Cristian (Nixtla)
09/19/2023, 10:35 PM
Hi @Alex Wang, this type of normalization is increasingly common in time series. We have seen consistently good results with this approach, in particular in settings with large scale variation within and across time series.
However, we know that it has some limitations (for instance, loss of information). We are working on adding a separate pre-processing step that performs normalization over the complete time series.
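A minimal sketch of what per-sequence normalization means here, assuming simple z-scoring of each sequence with its own mean and standard deviation (function name and values are illustrative, not Nixtla's actual implementation):

```python
import numpy as np

def instance_normalize(y, eps=1e-8):
    """Normalize one sequence using only its own statistics.

    Like InstanceNorm, but with no learnable affine parameters:
    every sequence is shifted/scaled independently at inference time.
    """
    return (y - y.mean()) / (y.std() + eps)

seq = np.array([10.0, 12.0, 11.0, 13.0])
normed = instance_normalize(seq)
# normed has (approximately) zero mean and unit standard deviation,
# regardless of the original level and scale of the sequence.
```

The key contrast with dataset-level normalization is that the offset and scale are recomputed per sequence, so sequences at very different absolute levels all end up on the same scale.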
09/19/2023, 10:36 PM
Gotcha, this is incredibly helpful! Thank you! Is there a reference that evaluates this type of temporal normalization?
09/19/2023, 10:38 PM
For example, we just saw that our normalization can drop vital information in healthcare settings (your area?). Normalizing within a short window removes the medication-dosage level, since it transforms the series into essentially a dummy (0-1) signal
09/19/2023, 10:39 PM
Exactly, that's what I was worried about, since sometimes the baseline level is very important information
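The information loss discussed above can be shown concretely. In this hypothetical sketch (the dosage values and function are illustrative), two dosage series at very different absolute levels become indistinguishable after per-sequence z-normalization:

```python
import numpy as np

def instance_normalize(y, eps=1e-8):
    """Z-score one sequence using only its own statistics."""
    return (y - y.mean()) / (y.std() + eps)

# Hypothetical medication dosages: one patient at 10 -> 20 mg,
# another at 100 -> 200 mg. The absolute level is clinically meaningful.
low_dose = np.array([10.0, 10.0, 20.0, 20.0])
high_dose = np.array([100.0, 100.0, 200.0, 200.0])

# After per-sequence normalization, both collapse to the same 0/1-style
# step pattern; the baseline dosage level is gone.
print(np.allclose(instance_normalize(low_dose),
                  instance_normalize(high_dose)))  # True
```

A dataset-level (global) scaler computed once over the training data would keep the two patients on different scales and preserve that baseline information.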