Jonathan Farland
09/13/2022, 9:47 PM
`HierarchicalForecast` for a few applications - are there any utility or helper functions to generate the `S` matrix for a given hierarchy? Thanks in advance.
fede (nixtla) (they/them)
09/14/2022, 9:01 PM
To generate the `S` matrix and the dataset with all hierarchies, you can use the `aggregate` function (`from hierarchicalforecast.utils import aggregate`). The function takes the time series of the lowest level and the hierarchical structure.
Please let me know if that example works for your use case. :)
Jonathan Farland
09/14/2022, 9:03 PM
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/var/folders/01/4mysj8cx1bjg_w8rbw097jwr0000gp/T/ipykernel_22347/4254128446.py in <module>
7 hrec = HierarchicalReconciliation(reconcilers=reconcilers)
8 #Y_rec_df = hrec.reconcile(Y_hat_df, Y_df_train, S, tags)
----> 9 Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_df_train, S=S, tags=tags)
~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/hierarchicalforecast/core.py in reconcile(self, Y_hat_df, S, tags, Y_df, level, bootstrap)
148 kwargs = {key: common_vals[key] for key in kwargs}
149 fcsts_model = reconcile_fn(y_hat=y_hat_model, **kwargs)
--> 150 fcsts[f'{model_name}/{reconcile_fn_name}'] = fcsts_model['mean'].flatten()
151 if (pi and has_level and level is not None) or (bootstrap and level is not None):
152 for lv in level:
~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
3653 else:
3654 # set column
-> 3655 self._set_item(key, value)
3656
3657 def _setitem_slice(self, key: slice, value):
~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in _set_item(self, key, value)
3830 ensure homogeneity.
3831 """
-> 3832 value = self._sanitize_column(value)
3833
3834 if (
~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in _sanitize_column(self, value)
4533
4534 if is_list_like(value):
-> 4535 com.require_length_match(value, self.index)
4536 return sanitize_array(value, self.index, copy=True, allow_2d=True)
4537
~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/common.py in require_length_match(data, index)
555 """
556 if len(data) != len(index):
--> 557 raise ValueError(
558 "Length of values "
559 f"({len(data)}) "
ValueError: Length of values (4704) does not match length of index (2688)
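The final `ValueError` in the traceback above is raised by pandas itself: a list-like value assigned as a column must have the same length as the frame's index. The following is a minimal sketch that reproduces the same class of error with toy sizes. The frame, the array, and the `try_assign` helper are hypothetical stand-ins, not objects from `hierarchicalforecast`.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the forecast frame: 4 rows in its index.
frame = pd.DataFrame({"y_hat": np.zeros(4)})

# Toy array standing in for the reconciled values: 7 entries,
# so it cannot be aligned with a 4-row index.
values = np.ones(7)


def try_assign(df, column, array):
    """Assign `array` as `column`; return the error text if the lengths differ."""
    try:
        df[column] = array
    except ValueError as err:
        return str(err)
    return None


message = try_assign(frame, "reconciled", values)
print(message)
```

Comparing `len(values)` with `len(frame.index)` before the assignment is a quick way to see which side has the unexpected number of rows.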
fede (nixtla) (they/them)
10/05/2022, 9:15 PM
`Y_df_train` data. I can see that it contains 672 base time series, but judging by the shape of the summing matrix `S`, it seems to have been constructed for 673 base time series. Did you use the `aggregate` function to create `S`? Perhaps we are missing something.
Jonathan Farland
10/05/2022, 9:20 PM
`df` looks like:
fede (nixtla) (they/them)
10/05/2022, 9:38 PM
`y`
Jonathan Farland
10/05/2022, 9:42 PM
`Y_df_train`
fede (nixtla) (they/them)
10/05/2022, 11:08 PM
`store36/dept6` has only two observations. Also, there are other time series with missing values. I solved the problem by imputing the missing values with zero; since we are dealing with demand data, that makes sense. I'm imputing the missing values for each time series from the first date of that series until the last date of the whole dataset. For example, the first observation of `store36/dept6` is `2012-06-08`, and the last observation of the training set is `2012-10-26`; thus, that series will range from `2012-06-08` until `2012-10-26`. The second image shows this.
Jonathan Farland
10/05/2022, 11:12 PM
fede (nixtla) (they/them)
10/05/2022, 11:20 PM
"you start from whenever the series begins, but to the global end"
Yes, exactly.
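The imputation described above (each series padded with zeros from its own first date up to the last date of the whole dataset) could be sketched in pandas roughly as follows. This is not the code used in the thread: the column names `unique_id`, `ds`, `y`, the weekly Friday frequency `W-FRI`, the helper name `fill_to_global_end`, and the toy values are all assumptions made for illustration.

```python
import pandas as pd


def fill_to_global_end(df, id_col="unique_id", time_col="ds",
                       value_col="y", freq="W-FRI"):
    """Pad each series with zeros from its own first date to the global last date."""
    global_end = df[time_col].max()
    pieces = []
    for uid, group in df.groupby(id_col):
        # Full calendar for this series: its own start, the dataset-wide end.
        full_index = pd.date_range(group[time_col].min(), global_end, freq=freq)
        filled = (
            group.set_index(time_col)[value_col]
            .reindex(full_index, fill_value=0.0)  # dates with no row become 0
            .fillna(0.0)                          # rows that exist but hold NaN become 0
            .rename_axis(time_col)
            .reset_index()
        )
        filled.insert(0, id_col, uid)
        pieces.append(filled)
    return pd.concat(pieces, ignore_index=True)


# Toy data (values made up): one complete series and one with two observations.
toy = pd.DataFrame({
    "unique_id": ["store1/dept1"] * 4 + ["store36/dept6"] * 2,
    "ds": pd.to_datetime([
        "2012-10-05", "2012-10-12", "2012-10-19", "2012-10-26",
        "2012-06-08", "2012-06-15",
    ]),
    "y": [10.0, 12.0, 11.0, 9.0, 3.0, 4.0],
})

padded = fill_to_global_end(toy)
short = padded[padded["unique_id"] == "store36/dept6"]
print(short["ds"].min(), short["ds"].max(), len(short))
```

Starting each series at its own first date, rather than at the global start, avoids inventing zero demand for periods before a store/department pair existed.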
Jonathan Farland
10/05/2022, 11:22 PM
fede (nixtla) (they/them)
10/05/2022, 11:30 PM
Jonathan Farland
10/06/2022, 6:44 PM
fede (nixtla) (they/them)
10/06/2022, 9:18 PM