@José Morales I've upgraded to a paid API acct to facilitate more testing/optimization.
I came across an article where someone tested how much historical data the model actually uses. He found that beyond a certain number of rows (anything > [number]), adding more history didn't appear to change the predictions. He had a pessimistic take on it, but I find it much more impressive if that's true.
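Rough sketch of how I'd reproduce that test, assuming the nixtla Python SDK (NixtlaClient), a hypothetical single-series CSV with ds/y columns, and the default "TimeGPT" output column (correct me if any of those names differ on my plan):

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # paid-account key

# hypothetical single-series frame: ds (timestamp), y (target)
df = pd.read_csv("series_5min.csv", parse_dates=["ds"])

h = 24  # 2 hrs at 5-min frequency
full = client.forecast(df=df, h=h, freq="5min")

# Re-run with progressively truncated history; if predictions stop changing
# beyond some row count, the extra history isn't being used.
for n_rows in (500, 1000, 2000, 4000):
    part = client.forecast(df=df.tail(n_rows), h=h, freq="5min")
    max_diff = (part["TimeGPT"] - full["TimeGPT"]).abs().max()
    print(f"last {n_rows} rows -> max abs diff vs full history: {max_diff:.6f}")
```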
A few related Qs:
1. Can you share any insight into how much historical data would be leveraged for various horizons?
a. We'd want to pass in the minimum amount to reduce latency (rough latency probe at the bottom of this message)
b. No sense increasing costs for you with unnecessary data while we're on the API plan
c. Assuming we end up with a self-hosted or Azure deployment, this has a significant impact on cost estimation
2. For the training data set you used, what dates did it run through? Before forward testing, we'd like to confirm that the data we're testing on wasn't included in your training set.
3. For enterprise self-hosted, would we have any additional controls over saving / continuously fine-tuning models?
4. Do you find the long-horizon model can work better for horizons < 24 hrs? I've mainly tested forecasts < 2 hrs (24 steps at freq=5min), but would like to see how it does with longer ones. Not sure where the line is for the long-horizon model; any insight here is appreciated. (Comparison sketch at the bottom.)
5. Somewhat related: the docs have an interesting clue: "Currently, the historical forecasts are not affected by `h`, and have a fixed horizon depending on the frequency of the data." Is that horizon predefined? We'll be forward testing, but if we knew what the automatic `h` was for our freq=5min, we'd have a good idea of what our benchmark is. It would also be very informative for benchmarking performance over history, across different `n` training rows. (Probe sketch at the bottom.)
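For (1a)/(1b), the latency probe I have in mind, same assumed NixtlaClient setup and hypothetical CSV as above:

```python
import time

import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")
df = pd.read_csv("series_5min.csv", parse_dates=["ds"])  # cols: ds, y

# Time the same forecast at different history sizes to find the smallest
# payload that keeps predictions stable (pairs with the diff test above).
for n_rows in (500, 1000, 2000, 4000, 8000):
    start = time.perf_counter()
    client.forecast(df=df.tail(n_rows), h=24, freq="5min")
    print(f"{n_rows} rows: {time.perf_counter() - start:.2f}s round trip")
```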
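For (4), the side-by-side I'd run on a sub-24h horizon; the `model` parameter and the "timegpt-1-long-horizon" name are my assumptions from the docs, so please correct if they differ:

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")
df = pd.read_csv("series_5min.csv", parse_dates=["ds"])

h = 24  # 2 hrs at 5-min frequency
default = client.forecast(df=df, h=h, freq="5min", model="timegpt-1")
long_h = client.forecast(df=df, h=h, freq="5min", model="timegpt-1-long-horizon")

# Both outputs cover the same future timestamps, so align by position.
side_by_side = default.rename(columns={"TimeGPT": "default"}).assign(
    long_horizon=long_h["TimeGPT"].to_numpy()
)
print(side_by_side.head())
```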
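For (5), a crude probe of the fixed historical horizon: request in-sample forecasts and inspect how many come back. This assumes `add_history=True` behaves as the quoted docs describe; the inference about the step size is my guess, not anything documented:

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")
df = pd.read_csv("series_5min.csv", parse_dates=["ds"])

# add_history=True returns in-sample (historical) forecasts alongside the
# future ones; per the docs they use a fixed horizon tied to the frequency.
fcst = client.forecast(df=df, h=1, freq="5min", add_history=True)
fcst["ds"] = pd.to_datetime(fcst["ds"])

hist = fcst[fcst["ds"] <= df["ds"].max()]
print(f"{len(hist)} historical forecast rows, "
      f"starting {hist['ds'].min()} (data starts {df['ds'].min()})")
# If the rolling step equals the fixed horizon, len(hist) should be a
# multiple of it, which would help narrow down the auto-h for 5-min data.
```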