#neural-forecast


Martin Bel

02/21/2023, 9:19 PM
Hi all, congrats on the amazing library!
I've been experimenting a bit with NHiTS on a dataset with weekly frequency, strong seasonality and trend.
Question:
I'm still trying to understand how to define the `n_pool_kernel_size` and `n_freq_downsample` parameters.
Are there any guidelines for this, in relation to the frequency of the series?
It would be great if someone could provide some intuition on how these affect the model.
Would these be reasonable values?
```
"n_pool_kernel_size": [2, 2, 2],  # MaxPooling kernel size
"n_freq_downsample": [52, 24, 1], # Interpolation expressivity ratios
```
I'm getting a reasonable result but wanted to see if it's possible to improve. This is an example of the data and predictions.

Cristian (Nixtla)

02/21/2023, 10:01 PM
Hi **@Martin Bel**!
In our `AutoNHITS` class we have a predefined list of values for those hyperparameters to explore.
For the `n_pool_kernel_size` we recommend exploring constant values across stacks (`[1,1,1]`, `[2,2,2]`, etc.) or exponentially decreasing values (`[8,4,1]`, `[16,8,1]`). With `[1,1,1]` the model is not downsampling the input.
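To build intuition for the pooling parameter, here is a minimal pure-Python sketch (not NeuralForecast internals) of what max pooling with kernel size `k` does to an input window: it shrinks a series of length `L` to roughly `L // k` points, and `k = 1` leaves it unchanged.

```python
def max_pool(series, kernel_size):
    """Non-overlapping 1-D max pooling: keep the max of each window."""
    return [max(series[i:i + kernel_size])
            for i in range(0, len(series) - kernel_size + 1, kernel_size)]

window = [3, 1, 4, 1, 5, 9, 2, 6]
print(max_pool(window, 2))            # [3, 4, 9, 6]
print(max_pool(window, 1) == window)  # True: kernel size 1 is a no-op
```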
The `n_freq_downsample` controls how much the output dimension is decreased in the blocks of each stack. For a particular stack, the output dimension of the MLP follows `output_dim = h / n_freq_downsample`, where `h` is the forecasting horizon. For example, if you have hourly data and you are forecasting a week, h=168. By setting `n_freq_downsample=24`, each MLP of the stack will output 7 points (168/24), one for each day of the week. In your case, you are forecasting a year of weekly data. You can use `n_freq_downsample=13` to aggregate forecasts by quarter (52/13=4), and `n_freq_downsample=4` to have approximately one output for each month. In this case the final parameter will be `n_freq_downsample=[13, 4, 1]`.
In both cases I recommend using `AutoNHITS` to define a grid for these hyperparameters.
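The `output_dim = h / n_freq_downsample` rule above can be sketched in a few lines. This is an illustration of the arithmetic, not library code; it assumes ceiling division so a non-divisible horizon still emits at least one point (the library may round differently).

```python
import math

def stack_output_dims(h, n_freq_downsample):
    """Per-stack MLP output sizes implied by output_dim = h / ratio."""
    return [math.ceil(h / ratio) for ratio in n_freq_downsample]

# Hourly data, one-week horizon: one daily point per stack, then hourly.
print(stack_output_dims(168, [24, 1]))    # [7, 168]
# Weekly data, one-year horizon: quarterly, ~monthly, weekly resolution.
print(stack_output_dims(52, [13, 4, 1]))  # [4, 13, 52]
```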

Martin Bel

02/22/2023, 12:20 PM
Perfect! Thanks for the explanation! I saw the AutoNHITS example but the logic of how the parameters were being chosen wasn't clear.
Just to clarify, in this model I'm passing `n_freq_downsample = [13, 4, 1]` as an argument but I don't see this changing the output dimension.
These are all the parameters I used:

```
h = 52
nbr_blocks = 3
linear_dim = 512
config_nhits = {
    "h": h,
    "input_size": h * 8,                                  # Length of input window
    "n_blocks": nbr_blocks * [1],                         # Blocks per stack
    "mlp_units": nbr_blocks * [[linear_dim, linear_dim]], # MLP hidden-layer sizes per stack
    "n_pool_kernel_size": [2, 2, 2],                      # MaxPooling kernel size
    "n_freq_downsample": [13, 4, 1],                      # Interpolation expressivity ratios
    "learning_rate": 1e-3,                                # Initial learning rate
    "scaler_type": "invariant",                           # Scaler type
    "activation": "ReLU",
    "max_steps": 500,                                     # Max number of training iterations
    "batch_size": 128,                                    # Number of series in batch
    "windows_batch_size": None,                           # Number of windows in batch
    "random_seed": 123,                                   # Random seed
}
model = NHITS(**config_nhits)
```


Cristian (Nixtla)

02/22/2023, 4:31 PM
Yes, the final output does not change; the model still produces a forecast of size h. What changes is the "expressivity"/frequency of each stack. By reducing the output dimension, a stack learns lower-frequency patterns. The intermediate values are interpolated to produce the complete forecast (hence the name of our model). Here is a diagram:

Thanks for the feedback, we will improve the documentation to make it more clear!
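The interpolation step described above can be sketched with a simple linear interpolation; this is assumed behavior for illustration, not NeuralForecast internals (the library supports several interpolation modes). A stack that emits 4 coarse points for h=52 is stretched back to all 52 steps before the stacks' forecasts are combined.

```python
import numpy as np

h = 52
coarse = np.array([10.0, 12.0, 9.0, 11.0])   # one coarse point per quarter
x_coarse = np.linspace(0, 1, len(coarse))    # positions of coarse points
x_full = np.linspace(0, 1, h)                # positions of all 52 weekly steps
full = np.interp(x_full, x_coarse, coarse)   # upsample to the full horizon
print(full.shape)  # (52,)
```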


Martin Bel

02/22/2023, 5:17 PM
No problem! Thanks for the clarification.