# neural-forecast
m
Hi all, congrats on the amazing library! I've been experimenting a bit with NHITS on a dataset with weekly frequency, strong seasonality, and trend. Question: I'm still trying to understand how to define the `n_pool_kernel_size` and `n_freq_downsample` parameters. Are there any guidelines for these, in relation to the frequency of the series? It would be great if someone could provide some intuition on how they affect the model. Would these be reasonable values? `"n_pool_kernel_size": [2, 2, 2]` (MaxPooling kernel size) and `"n_freq_downsample": [52, 24, 1]` (interpolation expressivity ratios). I'm getting a reasonable result but wanted to see if it's possible to improve. This is an example of the data and predictions.
c
Hi @Martin Bel! In our `AutoNHITS` class we have a predefined list of values to explore for these hyperparameters. For `n_pool_kernel_size` we recommend exploring either constant values across stacks (`[1, 1, 1]`, `[2, 2, 2]`, etc.) or exponentially decreasing values (`[8, 4, 1]`, `[16, 8, 1]`). With `[1, 1, 1]` the model does not downsample the input.

`n_freq_downsample` controls how much the output dimension is reduced in the blocks of each stack. For a particular stack, the output dimension of the MLP follows `output_dim = h / n_freq_downsample`, where `h` is the forecasting horizon. For example, if you have hourly data and you are forecasting one week, `h = 168`. By setting `n_freq_downsample = 24`, each MLP of that stack will output 7 points (168/24), one for each day of the week. In your case, you are forecasting a year of weekly data, so `h = 52`. You can use `n_freq_downsample = 13` to aggregate forecasts by quarter (52/13 = 4), and `n_freq_downsample = 4` to get approximately one output per month (52/4 = 13). The final parameter would then be `n_freq_downsample = [13, 4, 1]`.
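To make the arithmetic concrete, here is a quick plain-Python sketch (illustrative only, assuming exact division) of the per-stack output sizes for your weekly example:

```python
# Per-stack MLP output sizes for output_dim = h / n_freq_downsample
# (illustrative; assumes h is exactly divisible by each rate).
h = 52
n_freq_downsample = [13, 4, 1]

for stack, rate in enumerate(n_freq_downsample):
    print(f"stack {stack}: MLP outputs {h // rate} points")
# stack 0: MLP outputs 4 points    (quarterly resolution)
# stack 1: MLP outputs 13 points   (~monthly resolution)
# stack 2: MLP outputs 52 points   (weekly, full resolution)
```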
In both cases I recommend using `AutoNHITS` to define a grid over these hyperparameters.
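For instance, a minimal sketch of such a grid (assuming neuralforecast's ray.tune-based config format for the Auto models; the exact keys and defaults may differ):

```python
# Hedged sketch: a small AutoNHITS search space, assuming the
# ray.tune-based config interface of neuralforecast's Auto models.
from ray import tune
from neuralforecast.auto import AutoNHITS

nhits_config = {
    "input_size": tune.choice([52 * 4, 52 * 8]),       # candidate history lengths
    "n_pool_kernel_size": tune.choice([[1, 1, 1], [2, 2, 2], [8, 4, 1]]),
    "n_freq_downsample": tune.choice([[13, 4, 1], [52, 13, 1], [1, 1, 1]]),
    "learning_rate": tune.loguniform(1e-4, 1e-2),
}

model = AutoNHITS(h=52, config=nhits_config, num_samples=10)
```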
m
Perfect! Thanks for the explanation! I saw the AutoNHITS example but the logic of how the parameters were being chosen wasn't clear. Just to clarify: in this model I'm passing `n_freq_downsample = [13, 4, 1]` as an argument, but I don't see it changing the output dimension. These are all the parameters I used:
```python
from neuralforecast.models import NHITS

h = 52
nbr_blocks = 3
linear_dim = 512

config_nhits = {
    "h": h,
    "input_size": h * 8,                                   # Length of input window
    "n_blocks": nbr_blocks * [1],                          # Number of blocks per stack
    "mlp_units": nbr_blocks * [[linear_dim, linear_dim]],  # Hidden sizes of each block's MLP
    "n_pool_kernel_size": [2, 2, 2],                       # MaxPooling kernel size
    "n_freq_downsample": [13, 4, 1],                       # Interpolation expressivity ratios
    "learning_rate": 1e-3,                                 # Initial learning rate
    "scaler_type": "invariant",                            # Scaler type
    "activation": "ReLU",
    "max_steps": 500,                                      # Max number of training iterations
    "batch_size": 128,                                     # Number of series in batch
    "windows_batch_size": None,                            # Number of windows in batch
    "random_seed": 123,                                    # Random seed
}

model = NHITS(**config_nhits)
```
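For completeness, a hedged sketch of how a config like this is typically fit and used for prediction via the library's `NeuralForecast` wrapper; here `df` is assumed to be a long-format dataframe with `unique_id`, `ds`, and `y` columns:

```python
# Sketch of fitting/predicting with the config above (assumes `df` is a
# long-format pandas DataFrame with 'unique_id', 'ds' and 'y' columns).
from neuralforecast import NeuralForecast

nf = NeuralForecast(models=[model], freq="W")  # weekly frequency
nf.fit(df=df)
preds = nf.predict()  # 52 weekly forecasts per series
```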
c
Yes, the final output does not change; the model still produces a forecast of size `h`. What changes is the "expressivity"/frequency of each stack: by reducing the output dimension, a stack learns lower-frequency patterns. The intermediate values are then interpolated to produce the complete forecast (hence the name of our model). Here is a diagram:
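In code, the interpolation step looks roughly like this torch sketch (illustrative only, not the library's internal implementation):

```python
# Illustrative sketch of hierarchical interpolation: a stack that outputs
# 4 "quarterly" knots for h = 52 is upsampled back to the full weekly horizon.
import torch
import torch.nn.functional as F

h = 52
coarse = torch.randn(1, 1, h // 13)  # stack output at quarterly resolution: 4 points
full = F.interpolate(coarse, size=h, mode="linear", align_corners=True)
print(full.shape)  # torch.Size([1, 1, 52])
```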
Thanks for the feedback, we will improve the documentation to make it clearer!
m
No problem! Thanks for the clarification