Alex Wang
09/19/2023, 9:45 PM
Does TemporalNorm use one set of normalization statistics per sequence (basically InstanceNorm for sequence data, without learnable parameters)? I would have expected the offset and scale to be calculated over the entire training dataset and kept constant for every sequence @Cristian (Nixtla)
Cristian (Nixtla)
09/19/2023, 10:35 PM
Alex Wang
09/19/2023, 10:36 PM
Cristian (Nixtla)
09/19/2023, 10:38 PM
Alex Wang
09/19/2023, 10:39 PM
Cristian (Nixtla)
09/19/2023, 10:39 PM
scaler_type=None
Alex Wang
09/19/2023, 10:43 PM
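
The distinction raised in the opening question, per-sequence statistics versus statistics computed once over the whole training set, can be sketched in plain NumPy. This is only an illustration of the two ideas under simple assumptions (standard scaling, a 2-D array of shape [n_series, time]); it is not the TemporalNorm implementation, and the function names scale_per_sequence and scale_global are hypothetical.

```python
import numpy as np


def scale_per_sequence(batch, eps=1e-8):
    """Standardize each row with its own mean/std (InstanceNorm-like, no learnable parameters)."""
    mean = batch.mean(axis=1, keepdims=True)
    std = batch.std(axis=1, keepdims=True)
    return (batch - mean) / (std + eps)


def scale_global(batch, mean, std, eps=1e-8):
    """Standardize every row with one fixed mean/std, e.g. computed once on the training set."""
    return (batch - mean) / (std + eps)


# Two toy series with the same shape but very different magnitudes.
train = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [100.0, 200.0, 300.0, 400.0],
])

per_seq = scale_per_sequence(train)
glob = scale_global(train, train.mean(), train.std())

print(per_seq)  # both rows end up on the same scale
print(glob)     # rows keep their relative magnitudes
```

With per-sequence statistics the two rows become (almost exactly) identical, so the model no longer sees level differences between series; with global statistics the difference in level is preserved.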