# mlforecast
v
Hi guys, when trying to use `fill_gaps` here:
from utilsforecast.preprocessing import fill_gaps
stocks_basic_pd = fill_gaps(stocks_basic_pd, freq='B', start='per_serie', end='per_serie', id_col='Ticker', time_col='Date')
I am getting the error below. Any ideas?
{
	"name": "ValueError",
	"message": "cannot handle a non-unique multi-index!",
	"stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[54], line 2
      1 from utilsforecast.preprocessing import fill_gaps
----> 2 stocks_basic_pd = fill_gaps(stocks_basic_pd, freq='B', start='per_serie', end='per_serie', id_col='Ticker', time_col='Date')

File c:\\Python\\miniconda3\\envs\\openbb\\Lib\\site-packages\\utilsforecast\\preprocessing.py:166, in fill_gaps(df, freq, start, end, id_col, time_col)
    164         times += offset.base
    165 idx = pd.MultiIndex.from_arrays([uids, times], names=[id_col, time_col])
--> 166 res = df.set_index([id_col, time_col]).reindex(idx).reset_index()
    167 extra_cols = df.columns.drop([id_col, time_col]).tolist()
    168 if extra_cols:

File c:\\Python\\miniconda3\\envs\\openbb\\Lib\\site-packages\\pandas\\core\\frame.py:5365, in DataFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5346 @doc(
   5347     NDFrame.reindex,
   5348     klass=_shared_doc_kwargs[\"klass\"],
   (...)
   5363     tolerance=None,
   5364 ) -> DataFrame:
-> 5365     return super().reindex(
   5366         labels=labels,
   5367         index=index,
   5368         columns=columns,
   5369         axis=axis,
   5370         method=method,
   5371         copy=copy,
   5372         level=level,
   5373         fill_value=fill_value,
   5374         limit=limit,
   5375         tolerance=tolerance,
   5376     )

File c:\\Python\\miniconda3\\envs\\openbb\\Lib\\site-packages\\pandas\\core\\generic.py:5607, in NDFrame.reindex(self, labels, index, columns, axis, method, copy, level, fill_value, limit, tolerance)
   5604     return self._reindex_multi(axes, copy, fill_value)
   5606 # perform the reindex on the axes
-> 5607 return self._reindex_axes(
   5608     axes, level, limit, tolerance, method, fill_value, copy
   5609 ).__finalize__(self, method=\"reindex\")

File c:\\Python\\miniconda3\\envs\\openbb\\Lib\\site-packages\\pandas\\core\\generic.py:5630, in NDFrame._reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
   5627     continue
   5629 ax = self._get_axis(a)
-> 5630 new_index, indexer = ax.reindex(
   5631     labels, level=level, limit=limit, tolerance=tolerance, method=method
   5632 )
   5634 axis = self._get_axis_number(a)
   5635 obj = obj._reindex_with_indexers(
   5636     {axis: [new_index, indexer]},
   5637     fill_value=fill_value,
   5638     copy=copy,
   5639     allow_dups=False,
   5640 )

File c:\\Python\\miniconda3\\envs\\openbb\\Lib\\site-packages\\pandas\\core\\indexes\\base.py:4426, in Index.reindex(self, target, method, level, limit, tolerance)
   4422     indexer = self.get_indexer(
   4423         target, method=method, limit=limit, tolerance=tolerance
   4424     )
   4425 elif self._is_multi:
-> 4426     raise ValueError(\"cannot handle a non-unique multi-index!\")
   4427 elif not self.is_unique:
   4428     # GH#42568
   4429     raise ValueError(\"cannot reindex on an axis with duplicate labels\")

ValueError: cannot handle a non-unique multi-index!"
}
j
Does your dataframe have an index?
v
Yes, it has a RangeIndex
j
Are there duplicate combinations for id and date?
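One way to check this is a minimal pandas sketch like the one below. The toy frame is a hypothetical stand-in for `stocks_basic_pd`; the `Ticker` and `Date` column names come from the call above, while the `Close` column and all values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for stocks_basic_pd.
# 'Ticker' and 'Date' match the id_col/time_col in the question; 'Close' is invented.
df = pd.DataFrame(
    {
        "Ticker": ["AAA", "AAA", "AAA", "BBB"],
        "Date": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-02", "2024-01-01"]
        ),
        "Close": [1.0, 2.0, 2.5, 3.0],
    }
)

# keep=False flags every row that belongs to a repeated (Ticker, Date) pair,
# not only the second and later occurrences.
dup_mask = df.duplicated(subset=["Ticker", "Date"], keep=False)
n_duplicated_rows = int(dup_mask.sum())

# Inspect the offending rows side by side.
print(df[dup_mask].sort_values(["Ticker", "Date"]))
```

If `n_duplicated_rows` is greater than zero on the real dataframe, the (id, date) pairs are not unique.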
v
There were, but I just removed them and the problem still persists...
j
The same error? That one seemed to indicate there were duplicates. If you get a different error, can you please paste it here?
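For reference, the failing step in the traceback (`set_index(...).reindex(idx)`) can be imitated with pandas alone. This is a rough sketch on invented data, not the actual `fill_gaps` implementation: with a repeated (`Ticker`, `Date`) pair the reindex should raise a `ValueError`, and after `drop_duplicates` it should go through.

```python
import pandas as pd

# Invented data with one repeated (Ticker, Date) pair on 2024-01-01.
df = pd.DataFrame(
    {
        "Ticker": ["AAA", "AAA", "AAA"],
        "Date": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-03"]),
        "Close": [1.0, 1.5, 3.0],
    }
)

# Target index: every business day in the range for the single series.
target = pd.MultiIndex.from_product(
    [["AAA"], pd.bdate_range("2024-01-01", "2024-01-03")],
    names=["Ticker", "Date"],
)

# Reindexing a non-unique MultiIndex is expected to fail.
try:
    df.set_index(["Ticker", "Date"]).reindex(target)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# After keeping one row per (Ticker, Date), the same reindex works and the
# missing business day shows up as a row of NaN values.
deduped = df.drop_duplicates(subset=["Ticker", "Date"], keep="last")
filled = deduped.set_index(["Ticker", "Date"]).reindex(target).reset_index()
print(filled)
```

Which duplicate to keep (`keep="last"` here) is an arbitrary choice for the sketch; the right rule depends on the data.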
v
Ok, I will take another look and get back to you
It is still the same error
j
Can you provide a reproducible example? The following works as expected:
from utilsforecast.data import generate_series
from utilsforecast.preprocessing import fill_gaps

series = generate_series(2, freq='B').sample(frac=0.5)
fill_gaps(series, freq='B', start='per_serie', end='per_serie')
v
I think I did something wrong while merging two datasets. I tried with the pre-merge one now and it worked. I will check, and if it still persists I will add it here.
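A merge can introduce duplicate (id, date) rows when the join keys are not unique on one side. The sketch below uses two invented frames (`prices` and `extra` are hypothetical names, not the datasets from this thread) to show the effect, and how the `validate` argument of `DataFrame.merge` can surface it early.

```python
import pandas as pd
from pandas.errors import MergeError

# Hypothetical left table: one row per (Ticker, Date).
prices = pd.DataFrame(
    {
        "Ticker": ["AAA", "AAA"],
        "Date": pd.to_datetime(["2024-01-01", "2024-01-02"]),
        "Close": [1.0, 2.0],
    }
)

# Hypothetical right table: the key (AAA, 2024-01-01) appears twice.
extra = pd.DataFrame(
    {
        "Ticker": ["AAA", "AAA", "AAA"],
        "Date": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-02"]),
        "Volume": [10, 11, 20],
    }
)

# The left row for 2024-01-01 is matched twice, so the result has a
# duplicated (Ticker, Date) pair even though `prices` had none.
merged = prices.merge(extra, on=["Ticker", "Date"], how="left")
n_dups_after_merge = int(merged.duplicated(subset=["Ticker", "Date"]).sum())

# validate="one_to_one" asks pandas to check key uniqueness on both sides
# and raise a MergeError instead of silently multiplying rows.
try:
    prices.merge(extra, on=["Ticker", "Date"], how="left", validate="one_to_one")
    validation_failed = False
except MergeError:
    validation_failed = True

print(len(merged), n_dups_after_merge, validation_failed)
```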