#neural-forecast


Martin Bel

02/21/2023, 9:19 PM
Hi all, congrats on the amazing library!
I've been experimenting a bit with NHiTS on a dataset with weekly frequency, strong seasonality and trend.
Question:
I'm still trying to understand how to define the `n_pool_kernel_size` and `n_freq_downsample` parameters.
Are there any guidelines for this, in relation to the frequency of the series?
It would be great if someone could provide some intuition on how these affect the model.
Would these be reasonable values?
```
"n_pool_kernel_size": [2, 2, 2],  # MaxPooling kernel size
"n_freq_downsample": [52, 24, 1], # Interpolation expressivity ratios
```
I'm getting a reasonable result but wanted to see if it's possible to improve. This is an example of the data and predictions.

Cristian (Nixtla)

02/21/2023, 10:01 PM
Hi **@Martin Bel**!
In our `AutoNHITS` class we have a predefined list of values for those hyperparameters to explore.
For the `n_pool_kernel_size` we recommend exploring constant values across stacks (`[1,1,1]`, `[2,2,2]`, etc.) or exponentially decreasing values (`[8,4,1]`, `[16,8,1]`). With `[1,1,1]` the model is not downsampling the input.
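To build intuition for the pooling parameter, here is a minimal pure-Python sketch (not NeuralForecast internals) of what max pooling with kernel size `k` does to an input window: it shrinks a series of length `L` to roughly `L // k` points, and `k = 1` leaves it unchanged.

```python
def max_pool(series, kernel_size):
    """Non-overlapping 1-D max pooling: keep the max of each window."""
    return [max(series[i:i + kernel_size])
            for i in range(0, len(series) - kernel_size + 1, kernel_size)]

window = [3, 1, 4, 1, 5, 9, 2, 6]
print(max_pool(window, 2))            # [3, 4, 9, 6]
print(max_pool(window, 1) == window)  # True: kernel size 1 is a no-op
```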
The `n_freq_downsample` controls how much the output dimension is decreased in the blocks of each stack. For a particular stack, the output dimension of the MLP follows `output_dim = h / n_freq_downsample`, where `h` is the forecasting horizon. For example, if you have hourly data and you are forecasting a week, h=168. By setting `n_freq_downsample=24`, each MLP of the stack will output 7 points (168/24), one for each day of the week. In your case, you are forecasting a year of weekly data. You can use `n_freq_downsample=13` to aggregate forecasts by quarter (52/13=4), and `n_freq_downsample=4` to have approximately one output for each month. In this case the final parameter will be `n_freq_downsample=[13, 4, 1]`.
In both cases I recommend using `AutoNHITS` to define a grid for these hyperparameters.
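The `output_dim = h / n_freq_downsample` rule above can be sketched in a few lines. This is an illustration of the arithmetic, not library code; it assumes ceiling division so a non-divisible horizon still emits at least one point (the library may round differently).

```python
import math

def stack_output_dims(h, n_freq_downsample):
    """Per-stack MLP output sizes implied by output_dim = h / ratio."""
    return [math.ceil(h / ratio) for ratio in n_freq_downsample]

# Hourly data, one-week horizon: one daily point per stack, then hourly.
print(stack_output_dims(168, [24, 1]))    # [7, 168]
# Weekly data, one-year horizon: quarterly, ~monthly, weekly resolution.
print(stack_output_dims(52, [13, 4, 1]))  # [4, 13, 52]
```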

Martin Bel

02/22/2023, 12:20 PM
Perfect! Thanks for the explanation! I saw the AutoNHITS example but the logic of how the parameters were being chosen wasn't clear.
Just to clarify, in this model I'm passing `n_freq_downsample = [13, 4, 1]` as an argument but I don't see this changing the output dimension.
These are all the parameters I used:

```
h = 52
nbr_blocks = 3
linear_dim = 512
config_nhits = {
    "h": h,
    "input_size": h * 8,                                  # Length of input window
    "n_blocks": nbr_blocks * [1],                         # Blocks per stack
    "mlp_units": nbr_blocks * [[linear_dim, linear_dim]], # MLP hidden-layer sizes per stack
    "n_pool_kernel_size": [2, 2, 2],                      # MaxPooling kernel size
    "n_freq_downsample": [13, 4, 1],                      # Interpolation expressivity ratios
    "learning_rate": 1e-3,                                # Initial learning rate
    "scaler_type": "invariant",                           # Scaler type
    "activation": "ReLU",
    "max_steps": 500,                                     # Max number of training iterations
    "batch_size": 128,                                    # Number of series in batch
    "windows_batch_size": None,                           # Number of windows in batch
    "random_seed": 123,                                   # Random seed
}
model = NHITS(**config_nhits)
```


Cristian (Nixtla)

02/22/2023, 4:31 PM
Yes, the final output does not change; the model still produces a forecast of size h. What changes is the "expressivity"/frequency of each stack. By reducing the output dimension, a stack learns lower-frequency patterns. The intermediate values are interpolated to produce the complete forecast (hence the name of our model). Here is a diagram:

Thanks for the feedback, we will improve the documentation to make it more clear!
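The interpolation step described above can be sketched with a simple linear interpolation; this is assumed behavior for illustration, not NeuralForecast internals (the library supports several interpolation modes). A stack that emits 4 coarse points for h=52 is stretched back to all 52 steps before the stacks' forecasts are combined.

```python
import numpy as np

h = 52
coarse = np.array([10.0, 12.0, 9.0, 11.0])   # one coarse point per quarter
x_coarse = np.linspace(0, 1, len(coarse))    # positions of coarse points
x_full = np.linspace(0, 1, h)                # positions of all 52 weekly steps
full = np.interp(x_full, x_coarse, coarse)   # upsample to the full horizon
print(full.shape)  # (52,)
```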


Martin Bel

02/22/2023, 5:17 PM
No problem! Thanks for the clarification.