# general
s
Hey, thanks to the Nixtla team... we are using all of their time series in production. However, we've been having a debate on the best way to structure our models from a "unique id" point of view. We are predicting hourly time series, so our "unique id" is the store combined with a given hour, but there are multiple ways to think about this problem. How does Nixtla deal with multiple time series? Does it create one model for all unique_ids, or does it train a separate model for each time series? The basic issue is whether to build the unique_id from certain categorical variables or to keep everything as a feature. In our example, would the unique_id just be the store, with everything else as a feature? Further, does this unique_id determine whether you are predicting at an hourly level, daily level, etc.? Would love to hear thoughts from the community on this topic! Thank you.
m
Hi Mark, I'll do my best to answer without seeing your data! The way to think of the unique_id is as just a label for the series. In your case, it seems that a store identifier is enough. As for the features, this is something you have to experiment with. For example, suppose you have stores in different cities. Your unique_id could be a store id plus the name of the city, or you could leave the city as a feature. For that, you have to test and see if the city feature actually helps. As for how our libraries handle multiple series, they basically train a global model for all series. However, you can (and should) try multiple models on your dataset, and then select the best model for each particular series. We show how to do this in section 6 of this guide. Finally, your unique_id doesn't determine the frequency of your data. The frequency is a parameter that you provide when fitting the models. So make sure to set the right frequency, or resample your data to the frequency that you want. I hope this helps!
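To make the layout concrete, here is a minimal sketch of the idea above, using only the Python standard library. The store names, cities, and values are made up for illustration, and `resample_daily` is a hypothetical helper, not part of any Nixtla library. In practice the Nixtla libraries take a pandas DataFrame in this same long format (`unique_id`, `ds`, `y` columns) and the frequency is passed separately through the `freq` argument.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical hourly sales in long format: one row per (unique_id, ds).
# The unique_id is only the store label. The hour lives in the timestamp `ds`,
# and `city` is carried as an ordinary feature column.
start = datetime(2024, 1, 1)
stores = {"store_1": "Austin", "store_2": "Boston"}  # made-up stores
rows = [
    {"unique_id": store, "ds": start + timedelta(hours=h), "y": float(h % 24), "city": city}
    for store, city in stores.items()
    for h in range(48)
]


def resample_daily(rows):
    """Sum hourly `y` values into one row per series per calendar day."""
    totals = defaultdict(float)
    for r in rows:
        day = r["ds"].replace(hour=0, minute=0, second=0, microsecond=0)
        totals[(r["unique_id"], day)] += r["y"]
    return [
        {"unique_id": uid, "ds": day, "y": y}
        for (uid, day), y in sorted(totals.items())
    ]


# Same two series, now at daily frequency. The unique_id did not change;
# only the spacing of `ds` did, which is what the frequency setting describes.
daily = resample_daily(rows)
```

The point of the sketch is that moving from hourly to daily forecasts changes the timestamps and the frequency setting, not the series labels. Whether `city` belongs in the label or stays a feature is still something to test on real data.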
s
@Marco incredibly useful. Thank you!
v
@Marco Isn’t statsforecast training individual models for each time series?