# neural-forecast
I have a model understanding question concerning the NHITS model. I am hoping to get some clarification about the connection between two hyper-parameters: `n_freq_downsample` and `n_pool_kernel_size`. My question may also stem from how these parameters interact with one another. From my understanding, the higher the value in `n_freq_downsample`, the fewer datapoints we will have after downsampling, i.e. if we have daily data and we downsample with `n_freq_downsample = 7`, we will have 1/7 the amount of data. `n_pool_kernel_size` also has a "downsampling effect": if we have an input size of `N` and set `n_pool_kernel_size = 2`, then we will now have `N / 2` datapoints. First of all, is this understanding correct? And if so, it seems like we would want these two hyperparameters to be inversely proportional. I would not want to downsample by a large number and then apply a large-kernel pooling layer; that would dramatically decrease the amount of information being fed into my model. The reason I am asking this question is that I was looking at the AutoNHITS parameter space:
```python
"n_pool_kernel_size": tune.choice(
    [[2, 2, 1], 3 * [1], 3 * [2], 3 * [4], [8, 4, 1], [16, 8, 1]]
),
"n_freq_downsample": tune.choice(
    [
        [168, 24, 1],
        [24, 12, 1],
        [180, 60, 1],
        [60, 8, 1],
        [40, 20, 1],
        [1, 1, 1],
    ]
),
```
Intuitively, I would have assumed that the last two choices for `n_pool_kernel_size` would have been reversed, i.e. `[1, 4, 8]` and `[1, 8, 16]`.
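For concreteness, the back-of-the-envelope arithmetic I have in mind can be sketched like this (the helper functions are mine for illustration, not part of neuralforecast, and they assume pooling with stride equal to the kernel size):

```python
import math

def pooled_input_len(input_size: int, kernel_size: int) -> int:
    """Length of the input window after max pooling with
    stride == kernel_size (the n_pool_kernel_size effect)."""
    return input_size // kernel_size

def theta_out_len(horizon: int, freq_downsample: int) -> int:
    """Number of points a stack predicts in the forecast window
    (the n_freq_downsample effect), at least one point."""
    return max(math.ceil(horizon / freq_downsample), 1)

# Hourly example: one week of history, one day ahead.
input_size, horizon = 168, 24
for k, f in zip([8, 4, 1], [168, 24, 1]):
    print(k, f, pooled_input_len(input_size, k), theta_out_len(horizon, f))
```

With `n_pool_kernel_size = [8, 4, 1]` and `n_freq_downsample = [168, 24, 1]`, each stack sees 21, 42, and 168 input points and emits 1, 1, and 24 forecast points respectively.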
Hi @Phil! Your understanding of both parameters is correct. Regarding the inverse, not necessarily. The `n_pool_kernel_size` downsamples only the inputs, and the `n_freq_downsample` only the outputs. Having both at the same time does not compound, because they affect different parts of the architecture. Here is the diagram from the paper: the kernel controls the MaxPool on the inputs of the MLP stack, and `n_freq_downsample` controls the output dimension of theta (the points in the forecasting window).
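That output side can be sketched as well: each stack predicts only a few theta points, which are then stretched back to the full horizon. This is a simplified linear-interpolation sketch of the idea, not the library's exact implementation:

```python
def interpolate_forecast(theta, horizon):
    """Stretch a low-dimensional stack output (theta) back to the full
    forecast horizon via linear interpolation -- a simplified sketch of
    NHITS-style hierarchical interpolation."""
    n = len(theta)
    if n == 1:
        # A single theta point becomes a flat forecast.
        return [theta[0]] * horizon
    out = []
    for i in range(horizon):
        # Map step i in [0, horizon - 1] onto a position in [0, n - 1].
        pos = i * (n - 1) / (horizon - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(theta[lo] * (1 - frac) + theta[hi] * frac)
    return out

# Three theta points stretched over a 5-step horizon.
print(interpolate_forecast([0.0, 1.0, 0.0], 5))
```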
The intuition for having both larger at the same time is that to produce a lower-dimensional output (higher `n_freq_downsample`) you need less information from the inputs, so the kernel is larger.
With that said, we have observed that larger kernel sizes only help on very high-frequency data; usually keeping a value of 1 or 2 is best. That is why in our default config we kept the option of no downsampling (`[1, 1, 1]`).
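Following that advice, a narrowed search space that keeps kernels small could look like this (a sketch assuming AutoNHITS accepts a custom `config` dict and Ray Tune; the choice lists are taken from the defaults quoted above):

```python
from ray import tune

# Hypothetical narrowed search space: small kernels only, with a subset
# of the default frequency-downsampling choices.
nhits_config = {
    "n_pool_kernel_size": tune.choice([3 * [1], 3 * [2]]),
    "n_freq_downsample": tune.choice([[168, 24, 1], [24, 12, 1], [1, 1, 1]]),
}
```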
I see, that makes more sense! Thank you! Have you observed any effects of varying the number of blocks across frequencies, or is keeping them constant roughly equivalent in performance?
We recommend increasing the number of blocks with larger datasets. For example, NBEATS uses 30 blocks in total for each frequency of the M4 dataset, which has around 30k series.
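For reference, blocks are set per stack; a hedged sketch assuming neuralforecast's `NHITS` constructor exposes `n_blocks` (check your installed version's signature):

```python
from neuralforecast.models import NHITS

# One block per stack for a small dataset...
small = NHITS(h=24, input_size=168, n_blocks=[1, 1, 1])
# ...more blocks per stack for a large one (e.g. M4-scale, ~30k series).
large = NHITS(h=24, input_size=168, n_blocks=[10, 10, 10])
```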