# hierarchicalforecast
o
📣 We just released v1.0.0 of HierarchicalForecast, which adds the following features:
• 🐻‍❄️ Polars support (#305). This PR adds Polars support for HierarchicalForecast and adds an example to the docs. You can now enjoy full compatibility with Polars in HF!
• Evaluation unified with utilsforecast (#311). We restructured the evaluation functionality in HierarchicalForecast to follow the same API as utilsforecast, further unifying the API across our libraries. The old evaluation functions will be deprecated in future releases, and the documentation has been adapted to show the new behavior.
Note that as of v1.0.0, hierarchicalforecast no longer supports the use of unique_id as an index column. Simply put, your input data should always be a flat table without an index.
Thanks to MarcoGorelli for helping with the Narwhals implementation, which has allowed us to implement Polars support swiftly!
Questions or suggestions for new features? Let us know in a comment or file an issue on GitHub. Our priority for the next release is adding temporal hierarchical reconciliation methods, but we are open to other suggestions!
Happy forecasting!
🙌 1
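For anyone migrating data that still carries unique_id as the index, here is a minimal sketch of turning it into the flat table described above. The toy series names and values are made up for illustration; only plain pandas is used, nothing specific to HierarchicalForecast.

```python
import pandas as pd

# Hypothetical toy data in the old layout: unique_id stored as the index.
old_style = pd.DataFrame(
    {
        "ds": pd.to_datetime(
            ["2020-01-01", "2020-02-01", "2020-01-01", "2020-02-01"]
        ),
        "y": [10.0, 12.0, 4.0, 5.0],
    },
    index=pd.Index(["total", "total", "total/a", "total/a"], name="unique_id"),
)

# reset_index() moves unique_id out of the index into a regular column
# and leaves a default RangeIndex behind, i.e. a flat table.
flat = old_style.reset_index()

print(flat.columns.tolist())
```

The same call works on any frame whose index is named unique_id; frames that are already flat do not need it.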
b
With the drop of unique_id as an index column, I'm now running into issues when using reconciliation methods that depend on Y_df. For example:
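from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, MinTrace

# reconciliation methods to apply
reconcilers = [
    BottomUp(),
    MinTrace(method='wls_var'),
]
# reconcile forecasts
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df[['unique_id','ds','y_pred']],
                          Y_df=Y_train_df[['unique_id','ds','y_pred','y']],
                          S=S_df, tags=tags)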
Gives me the error:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[95], line 23
     21 # reconcile forecasts
     22 hrec = HierarchicalReconciliation(reconcilers=reconcilers)
---> 23 Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df[['unique_id','ds','q50']], 
     24                             Y_df=Y_fitted_df[['unique_id','ds','q50','y']],
     25                             S=S_df, tags=tags)

File ~/anaconda3/lib/python3.10/site-packages/hierarchicalforecast/core.py:374, in HierarchicalReconciliation.reconcile(self, Y_hat_df, S, tags, Y_df, level, intervals_method, num_samples, seed, is_balanced, id_col, time_col, target_col)
    370     if any_sparse and not nw.dependencies.is_pandas_dataframe(Y_df):
    371         raise ValueError(
    372             "You have one or more sparse reconciliation methods. Please convert `Y_df` to a pandas DataFrame."
    373         )
--> 374     y_insample = self._prepare_Y(
    375         Y_nw=Y_nw,
    376         S_nw=S_nw,
    377         is_balanced=is_balanced,
    378         id_col=id_col,
    379         time_col=time_col,
    380         target_col=target_col,
    381     )
    382     reconciler_args["y_insample"] = y_insample
    384 Y_tilde_nw = nw.maybe_reset_index(Y_hat_nw.clone())
...
-> 6130     raise KeyError(f"None of [{key}] are in the [{axis_name}]")
   6132 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
   6133 raise KeyError(f"{not_found} not in index")

KeyError: "None of [MultiIndex([('y', 1577836800000000000),\n            ('y', 1580515200000000000),\n            ('y', 1583020800000000000),\n            ('y', 1585699200000000000),\n            ('y', 1588291200000000000),\n            ('y', 1590969600000000000),\n            ('y', 1593561600000000000),\n            ('y', 1596240000000000000),\n            ('y', 1598918400000000000),\n            ('y', 1601510400000000000),\n            ('y', 1604188800000000000),\n            ('y', 1606780800000000000),\n            ('y', 1609459200000000000),\n            ('y', 1612137600000000000),\n            ('y', 1614556800000000000),\n            ('y', 1617235200000000000),\n            ('y', 1619827200000000000),\n            ('y', 1622505600000000000),\n            ('y', 1625097600000000000),\n            ('y', 1627776000000000000),\n            ('y', 1630454400000000000),\n            ('y', 1633046400000000000),\n            ('y', 1635724800000000000),\n            ('y', 1638316800000000000),\n            ('y', 1640995200000000000),\n            ('y', 1643673600000000000),\n            ('y', 1646092800000000000),\n            ('y', 1648771200000000000),\n            ('y', 1651363200000000000),\n            ('y', 1654041600000000000),\n            ('y', 1656633600000000000),\n            ('y', 1659312000000000000),\n            ('y', 1661990400000000000),\n            ('y', 1664582400000000000)],\n           names=[None, 'ds'])] are in the [columns]"
Everything works fine when I can avoid using Y_df.
o
Hey, that's unfortunate - can you post an example that I can run to help you figure out the issue?
And can you post the entire error message? The ... lines contain the critical piece of information that I need to find out where the error occurs.
a
Hi, do you have any updates on when the temporal hierarchical reconciliation feature will roll out? Will it also support Polars?
o
I have a draft PR online, but there are a number of issues that I haven't been able to fix yet, which is why it's taking more time. It will be fully compatible with Polars (as is everything in HF since the last version).
I hope to fix those issues this month.
a
Great! Thanks
p
same error here
KeyError: "None of [MultiIndex([('y', 1577836800000000000),\n            ('y', 1580515200000000000),\n            ('y', 1583020800000000000),\n            ('y', 1585699200000000000),\n            ('y', 1588291200000000000),\n            ('y', 1590969600000000000),\n            ('y', 1593561600000000000),\n            ('y', 1596240000000000000),\n            ('y', 1598918400000000000),\n            ('y', 1601510400000000000),\n            ('y', 1604188800000000000),\n            ('y', 1606780800000000000),\n            ('y', 1609459200000000000),\n            ('y', 1612137600000000000),\n            ('y', 1614556800000000000),\n            ('y', 1617235200000000000),\n            ('y', 1619827200000000000),\n            ('y', 1622505600000000000),\n            ('y', 1625097600000000000),\n            ('y', 1627776000000000000),\n            ('y', 1630454400000000000),\n            ('y', 1633046400000000000),\n            ('y', 1635724800000000000),\n            ('y', 1638316800000000000),\n            ('y', 1640995200000000000),\n            ('y', 1643673600000000000),\n            ('y', 1646092800000000000),\n            ('y', 1648771200000000000),\n            ('y', 1651363200000000000),\n            ('y', 1654041600000000000),\n            ('y', 1656633600000000000),\n            ('y', 1659312000000000000),\n            ('y', 1661990400000000000),\n            ('y', 1664582400000000000),\n            ('y', 1667260800000000000),\n            ('y', 1669852800000000000),\n            ('y', 1672531200000000000),\n            ('y', 1675209600000000000),\n            ('y', 1677628800000000000),\n            ('y', 1680307200000000000),\n            ('y', 1682899200000000000),\n            ('y', 1685577600000000000),\n            ('y', 1688169600000000000),\n            ('y', 1690848000000000000),\n            ('y', 1693526400000000000),\n            ('y', 1696118400000000000),\n            ('y', 1698796800000000000),\n            ('y', 1701388800000000000),\n            
('y', 1704067200000000000),\n            ('y', 1706745600000000000),\n            ('y', 1709251200000000000),\n            ('y', 1711929600000000000),\n            ('y', 1714521600000000000),\n            ('y', 1717200000000000000),\n            ('y', 1719792000000000000),\n            ('y', 1722470400000000000),\n            ('y', 1725148800000000000),\n            ('y', 1727740800000000000),\n            ('y', 1730419200000000000),\n            ('y', 1733011200000000000)],\n           names=[None, 'ds'])] are in the [columns]"
File <command-2311550170284745>, line 1
----> 1 Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df,Y_df=Y_fitted_df,S=S_df,tags=tags)
I'm wondering, did you manage to solve it in the end, @Brandon Barber?
o
The error message looks like a column is missing. Make sure your data is in the proper format: no pandas index, and columns with the proper names. If you deviate from the standard column names, make sure to pass those names to the functions. I can help further if you provide a piece of code that reproduces the error.
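The column check suggested above can be sketched as a small helper. The function name missing_columns is hypothetical and not part of HierarchicalForecast. The parameters id_col, time_col and target_col mirror the ones visible in the reconcile signature in the traceback earlier in this thread; their defaults are assumed here to be unique_id, ds and y.

```python
def missing_columns(columns, id_col="unique_id", time_col="ds", target_col="y"):
    """Return the required column names that are absent from `columns`.

    Hypothetical helper for illustration; the default names are an assumption.
    """
    present = set(columns)
    return [name for name in (id_col, time_col, target_col) if name not in present]


# Standard names plus one forecast column: nothing is missing.
print(missing_columns(["unique_id", "ds", "y", "q50"]))

# Non-standard names are reported as missing unless they are passed explicitly.
custom = ["series", "month", "sales"]
print(missing_columns(custom))
print(missing_columns(custom, id_col="series", time_col="month", target_col="sales"))
```

With a real frame you would pass its column names, for example missing_columns(Y_df.columns), and hand the same three names to reconcile if they differ from the defaults.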
p
Thank you, Olivier. After I updated the pandas version, the problem was solved.
šŸ‘ 1
f
Hi @Olivier, nice to meet you. I have the same error that @Brandon Barber and @pu xu had. It seems that updating pandas is the way to solve it, but I have to use the .reconcile() method in an environment where I cannot update anything. Is there any workaround? My Y_df dataframe has the right columns and unique_id is not in the index, but the method still raises the error. Thanks in advance.
P.S. This is the error:
KeyError: "None of [MultiIndex([('y', 1577836800000000000),\n ('y', 1580515200000000000),\n ('y', 1583020800000000000),\n ('y', 1585699200000000000),\n ('y', 1588291200000000000),\n ('y', 1590969600000000000),\n ('y', 1593561600000000000),\n ('y', 1596240000000000000),\n ('y', 1598918400000000000),\n ('y', 1601510400000000000),\n ('y', 1604188800000000000),\n ('y', 1606780800000000000),\n ('y', 1609459200000000000),\n ('y', 1612137600000000000),\n ('y', 1614556800000000000),\n ('y', 1617235200000000000),\n ('y', 1619827200000000000),\n ('y', 1622505600000000000),\n ('y', 1625097600000000000),\n ('y', 1627776000000000000),\n ('y', 1630454400000000000),\n ('y', 1633046400000000000),\n ('y', 1635724800000000000),\n ('y', 1638316800000000000),\n ('y', 1640995200000000000),\n ('y', 1643673600000000000),\n ('y', 1646092800000000000),\n ('y', 1648771200000000000),\n ('y', 1651363200000000000),\n ('y', 1654041600000000000),\n ('y', 1656633600000000000),\n ('y', 1659312000000000000),\n ('y', 1661990400000000000),\n ('y', 1664582400000000000),\n ('y', 1667260800000000000),\n ('y', 1669852800000000000),\n ('y', 1672531200000000000),\n ('y', 1675209600000000000),\n ('y', 1677628800000000000),\n ('y', 1680307200000000000),\n ('y', 1682899200000000000),\n ('y', 1685577600000000000),\n ('y', 1688169600000000000),\n ('y', 1690848000000000000),\n ('y', 1693526400000000000),\n ('y', 1696118400000000000),\n ('y', 1698796800000000000),\n ('y', 1701388800000000000),\n ('y', 1704067200000000000),\n ('y', 1706745600000000000),\n ('y', 1709251200000000000),\n ('y', 1711929600000000000),\n ('y', 1714521600000000000),\n ('y', 1717200000000000000),\n ('y', 1719792000000000000),\n ('y', 1722470400000000000),\n ('y', 1725148800000000000),\n ('y', 1727740800000000000),\n ('y', 1730419200000000000),\n ('y', 1733011200000000000)],\n names=[None, 'ds'])] are in the [columns]"
File command-2311550170284745, line 1
----> 1 Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df,Y_df=Y_fitted_df,S=S_df,tags=tags)
o
@Francesco Which version of pandas are you using? And if you post a piece of code that I can run, I might be able to diagnose it better.
f
print(pd.__version__)
1.5.2
# ======================================
from hierarchicalforecast.utils import aggregate

spec = [
    ['Sales Organization'],
    ['Sales Organization', 'Pr. LOB'],
    ['Sales Organization', 'Pr. LOB', 'Pr. Ownership'],
    ['Sales Organization', 'Pr. LOB', 'Pr. Ownership', 'Pr. GRP 4.1'],
    ['Sales Organization', 'Pr. LOB', 'Pr. Ownership', 'Pr. GRP 4.1', 'EAN/UPC (NR)']
]
target_col = 'Promo Forecast'
df_y, S, tags = aggregate(
    df[granular + ['ds', target_col]].rename(columns = {target_col: 'y'}),
    spec
)
# ======================================
df_train_test = df_y[['unique_id', 'ds', 'y']].copy()
# ======================================
# . . . Some Operations to split the data in train and test …
# =====================================================
# ============== Hierarchical Forecasting ===================
# =====================================================
from statsforecast.core import StatsForecast
from statsforecast.models import Naive

fcst = StatsForecast(
    models = [Naive()],
    freq = 'MS',
    n_jobs = 1
)
df_pred = fcst.forecast(df = df_train, h = test_loops)

from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.methods import BottomUp, TopDown, MiddleOut

reconcilers = [
    BottomUp(),
    TopDown(method = 'forecast_proportions')
]
hrec = HierarchicalReconciliation(reconcilers = reconcilers)
df_pred_reconciled = hrec.reconcile(
    Y_hat_df = df_pred,
    Y_df = df_train,
    S = S,
    tags = tags
)
# =====================================================
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)
File "c:\users\francesco.peria\a) python_cartella\a) sfa project\a) working directory v1\sfa_proj_nixtla\inferencenixtla_hierarchical_long_term _v1.py", line 327, in <module>
    df_pred_reconciled = hrec.reconcile(
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\hierarchicalforecast\core.py", line 374, in reconcile
    y_insample = self._prepare_Y(
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\hierarchicalforecast\core.py", line 248, in _prepare_Y
    Y_pivot = Y_nw.pivot(
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\narwhals\dataframe.py", line 1981, in pivot
    self._compliant_frame.pivot(
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\narwhals\_pandas_like\dataframe.py", line 1031, in pivot
    result = result.loc[:, ordered_cols]
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1067, in __getitem__
    return self._getitem_tuple(key)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1247, in _getitem_tuple
    return self._getitem_lowerdim(tup)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 941, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1047, in _getitem_nested_tuple
    obj = getattr(obj, self.name)._getitem_axis(key, axis=axis)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1301, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1239, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexing.py", line 1432, in _get_listlike_indexer
    keyarr, indexer = ax._get_indexer_strict(key, axis_name)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexes\multi.py", line 2626, in _get_indexer_strict
    return super()._get_indexer_strict(key, axis_name)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexes\base.py", line 6113, in _get_indexer_strict
    self._raise_if_missing(keyarr, indexer, axis_name)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexes\multi.py", line 2646, in _raise_if_missing
    return super()._raise_if_missing(key, indexer, axis_name)
File "C:\Users\francesco.peria\.conda\envs\spyder-env\lib\site-packages\pandas\core\indexes\base.py", line 6173, in _raise_if_missing
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [MultiIndex([('y', 1514764800000000000),\n ('y', 1517443200000000000),\n ('y', 1519862400000000000),\n ('y', 1522540800000000000),\n ('y', 1525132800000000000),\n ('y', 1527811200000000000),\n ('y', 1530403200000000000),\n ('y', 1533081600000000000),\n ('y', 1535760000000000000),\n ('y', 1538352000000000000),\n ('y', 1541030400000000000),\n ('y', 1543622400000000000),\n ('y', 1546300800000000000),\n ('y', 1548979200000000000),\n ('y', 1551398400000000000),\n ('y', 1554076800000000000),\n ('y', 1556668800000000000),\n ('y', 1559347200000000000),\n ('y', 1561939200000000000),\n ('y', 1564617600000000000),\n ('y', 1567296000000000000),\n ('y', 1569888000000000000),\n ('y', 1572566400000000000),\n ('y', 1575158400000000000),\n ('y', 1577836800000000000),\n ('y', 1580515200000000000),\n ('y', 1583020800000000000),\n ('y', 1585699200000000000),\n ('y', 1588291200000000000),\n ('y', 1590969600000000000),\n ('y', 1593561600000000000),\n ('y', 1596240000000000000),\n ('y', 1598918400000000000),\n ('y', 1601510400000000000),\n ('y', 1604188800000000000),\n ('y', 1606780800000000000),\n ('y', 1609459200000000000),\n ('y', 1612137600000000000),\n ('y', 1614556800000000000),\n ('y', 1617235200000000000),\n ('y', 1619827200000000000),\n ('y', 1622505600000000000),\n ('y', 1625097600000000000),\n ('y', 1627776000000000000),\n ('y', 1630454400000000000),\n ('y', 1633046400000000000),\n ('y', 1635724800000000000),\n ('y', 1638316800000000000),\n ('y', 1640995200000000000),\n ('y', 1643673600000000000),\n ('y', 1646092800000000000),\n ('y', 1648771200000000000),\n ('y', 1651363200000000000),\n ('y', 1654041600000000000),\n ('y', 1656633600000000000),\n ('y', 1659312000000000000),\n ('y', 1661990400000000000),\n ('y', 1664582400000000000),\n ('y', 1667260800000000000),\n ('y', 1669852800000000000),\n ('y', 1672531200000000000),\n ('y', 1675209600000000000),\n ('y', 1677628800000000000),\n ('y', 1680307200000000000),\n ('y', 1682899200000000000),\n ('y', 1685577600000000000),\n ('y', 1688169600000000000),\n ('y', 1690848000000000000),\n ('y', 1693526400000000000),\n ('y', 1696118400000000000),\n ('y', 1698796800000000000),\n ('y', 1701388800000000000),\n ('y', 1704067200000000000),\n ('y', 1706745600000000000),\n ('y', 1709251200000000000),\n ('y', 1711929600000000000),\n ('y', 1714521600000000000),\n ('y', 1717200000000000000),\n ('y', 1719792000000000000),\n ('y', 1722470400000000000),\n ('y', 1725148800000000000),\n ('y', 1727740800000000000),\n ('y', 1730419200000000000),\n ('y', 1733011200000000000),\n ('y', 1735689600000000000),\n ('y', 1738368000000000000)],\n names=[None, 'ds'])] are in the [columns]"
@Olivier I inserted the code in the message above. It's a bit messy here in the chat; try pasting it into a text editor and it'll make sense. The pandas version is 1.5.2 and, as I said, I cannot change it. Thanks a lot for your help ✌️
o
Hi, I can't reproduce it; the easiest fix is to update your dependencies. We'll make sure HF enforces pandas>2.0.0 in upcoming releases so that the error doesn't happen.
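Until such a constraint ships, a quick way to tell whether an environment is on the affected side is to compare its pandas version with 2.0.0. Below is a minimal sketch using only the standard library. The helper names version_tuple and is_newer_than are hypothetical, and the parsing is deliberately simple: it ignores pre-release suffixes, which packaging.version.Version handles more rigorously.

```python
import re


def version_tuple(version):
    """Turn '1.5.2' or '2.1.0rc0' into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def is_newer_than(installed, minimum="2.0.0"):
    """True when `installed` is strictly newer than `minimum` (pandas>2.0.0)."""
    return version_tuple(installed) > version_tuple(minimum)


# The version reported earlier in this thread versus a 2.x release.
for candidate in ("1.5.2", "2.2.3"):
    print(candidate, is_newer_than(candidate))
```

In a live session you would pass pd.__version__ instead of a hard-coded string.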
f
Hi @Olivier, thanks for the attempt. In case it's useful to you: if I don't specify the Y_df argument and leave it at its default value, the error doesn't occur. But that is only possible because the reconcilers I'm using don't require Y_df (neither training data nor fitted values).
o
Mmm, thanks. I'm happy to give it another go, but I just ran our own examples with lower pandas versions and couldn't reproduce it. If you have a full example that I can run and that produces the error (including package dependencies), then I could do more, but otherwise the fix is probably just updating your dependencies, I guess.
f
OK, thanks. I'll try running the same code in another environment with pandas > 2.0 to see if it works. Thanks for the support 👌
šŸ‘ 1