# neural-forecast
**m:**
Hi all, here is a takeaway from some experiments I ran with NHITS. Of all the parameters I experimented with, `scaler_type` was the one that had the largest effect on the results. With the default (`scaler_type="identity"`) I get this kind of result:
If I use any of the other `scaler_type` options, such as "robust" or "invariant", I get very different results. This is with `scaler_type="robust"`:

To me the first one looks off, BUT the MAE and MAPE are actually similar. Perhaps a different `scaler_type` should be set as the default. Any thoughts on this?
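As a rough illustration of why the choice matters, here is what a robust scaling does to an input window containing an outlier, in plain NumPy. This is only a sketch of the general technique (median/MAD scaling), not neuralforecast's internal implementation, and `robust_scale` is a made-up name:

```python
import numpy as np

def identity_scale(x):
    # "identity": no-op, the model sees the raw values
    return x

def robust_scale(x, eps=1e-8):
    # center by the median and scale by the median absolute deviation,
    # so a single outlier barely affects the statistics
    # (illustrative; the library's exact formula may differ)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return (x - med) / (mad + eps)

window = np.array([10.0, 12.0, 11.0, 13.0, 500.0])  # one large outlier
scaled = robust_scale(window)
```

With mean/std scaling the outlier would dominate both statistics and squash the other four points together; with median/MAD they keep a usable spread.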
**f:**
@Martin Bel Thanks for sharing your findings. I have not played with the manual fit that much, but I know AutoNHITS tries different `scaler_type` options and gives you the best one. My guess is that there is no one-size-fits-all option, otherwise they would not have tried all of them in the Auto mode. So I think you need to try all of them (or let Auto try them for you), as different options might fit better to different series.
**c:**
@Martin Bel thanks for sharing your results! As @Farzad E mentioned, there is no single configuration that works better in all cases. In our original paper for the NHITS model we did not scale the inputs (`scaler_type` 'identity' or None), following NBEATS. We have also found that scaling usually works better, so we are considering making `robust` the default.
**m:**
I was reading the Neural Prophet paper and found this interesting paragraph, relevant to this discussion. It seems their default normalization does this: "scales the minimum value to 0.0 and the 95th quantile to 1.0" which seems super hacky but might work ok if the data has outliers. Using the 95th percentile is quite high but perhaps 99th is ok. I guess you could do this preprocessing manually anyway.
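That normalization is easy to reproduce as a preprocessing step. Here is a sketch of the scheme as described (minimum mapped to 0.0, 95th quantile mapped to 1.0) in NumPy; `soft_minmax` is a made-up name for illustration, not Neural Prophet's API:

```python
import numpy as np

def soft_minmax(x, q=0.95):
    # map the minimum to 0.0 and the q-th quantile to 1.0;
    # values above that quantile simply land above 1.0 instead of
    # compressing the bulk of the data, which is the point with outliers
    lo = x.min()
    hi = np.quantile(x, q)
    return (x - lo) / (hi - lo)

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one outlier
y = soft_minmax(x)
```

Switching `q` to 0.99 makes the mapping closer to ordinary min-max scaling while still capping the influence of the most extreme values.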
**f:**
@Martin Bel Thanks for sharing. I still have to check Neural Prophet. Does it work well in practice? Prophet didn't end up being reliable for most industry applications. https://medium.com/geekculture/is-facebooks-prophet-the-time-series-messiah-or-just-a-very-naughty-boy-8b71b136bc8c
**c:**
Thanks for sharing that @Martin Bel. It's definitely a valid approach. Our `robust` scaler already deals with outliers, for example by using the median instead of the mean as the location statistic. One key difference with our normalization is that for window-based models (MLP, NBEATS, NHITS, TFT) we normalize each input window separately, instead of the whole time series.
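A rough sketch of that per-window idea in NumPy; this uses median/IQR as the robust statistics and is illustrative only, not neuralforecast's actual code:

```python
import numpy as np

def robust_scale_window(w, eps=1e-8):
    # center by the window's median, scale by its interquartile range
    med = np.median(w)
    iqr = np.quantile(w, 0.75) - np.quantile(w, 0.25)
    return (w - med) / (iqr + eps)

def per_window_scaled(series, input_size):
    # normalize each sliding input window on its own, so the
    # statistics adapt to the local level and spread of the series
    windows = np.lib.stride_tricks.sliding_window_view(series, input_size)
    return np.array([robust_scale_window(w) for w in windows])

# on a trending series, every window looks the same after scaling,
# which is exactly what whole-series normalization would not give you
scaled = per_window_scaled(np.arange(10, dtype=float), input_size=4)
```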
**m:**
I see, the robust one makes sense to me. I think this soft method can work better when you have sparse data; I'm just surprised it's their default. I guess normalizing the entire series is just wrong, right? I'm not sure if NP is doing this, but it's good to know this is how Nixtla handles it! I haven't used NP much @Farzad E, but Prophet is generally not amazing, it's just easy to use.