def fugueReoncile(Y_hat_df: pd.DataFrame) -> pd...
# general
m
def fugueReoncile(Y_hat_df: pd.DataFrame) -> pd.DataFrame:
    return hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_train_df, S=S_df, tags=tags)

Y_rec_df = transform(Y_hat_df.reset_index(), fugueReoncile, params={}, schema=fugueSchema, partition={"by": "unique_id"}, engine=spark)

should be fine, yeah?
k
what are you trying to do here? use hierarchical forecasting on Spark?
does that work? I don’t think it will 😅
m
@Kevin Kho yes 😄
k
Oh really? wow. Are the forecasts reconciled fine?
m
i doubt it
im guessing it'll only reconcile with the data within that partition
we'll see 😬
any recommendations? i get OOM errors if i try on just the driver node
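A rough local illustration of the worry above, using a plain pandas groupby as a stand-in for what `partition={"by": "unique_id"}` hands to each worker. The toy frame, its ids, and the `model` column are made up for the sketch. Each logical partition holds exactly one series, so a reconciler called on it never sees a parent together with its children.

```python
import pandas as pd

# Toy base forecasts (hypothetical values): one total and its two children.
Y_hat_df = pd.DataFrame({
    "unique_id": ["total", "total", "total/A", "total/A", "total/B", "total/B"],
    "ds": [1, 2, 1, 2, 1, 2],
    "model": [10.0, 12.0, 4.0, 5.0, 5.0, 6.0],
})

# Local stand-in for partitioning by unique_id: count the distinct series
# that each partition (group) would contain.
series_per_partition = Y_hat_df.groupby("unique_id")["unique_id"].nunique()
print(series_per_partition)

# The whole hierarchy has three series, but no partition holds more than one.
total_series = Y_hat_df["unique_id"].nunique()
max_in_partition = int(series_per_partition.max())
print(total_series, max_in_partition)
```

With only one series per call there is nothing to make coherent against, which matches the guess that reconciliation would be limited to whatever lands in the partition.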
k
that is my expectation too. so with hierarchical forecasting, you can partition one level down
and run distributedly like that, it won’t be perfect but it will speed things up
so if you can group your data somehow and run multiple hierarchical forecasting jobs
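A minimal sketch of the "partition one level down" idea, in plain NumPy rather than the HierarchicalForecast API. It assumes a toy hierarchy (total, then A and B, then A1, A2, B1, B2), made-up base forecasts, and an OLS projection standing in for whatever reconciler `hrec` is actually configured with. Each subtree is reconciled as its own job, and the result is coherent inside the subtree, but the total's base forecast is never used, which is why the result is not the same as reconciling the full hierarchy.

```python
import numpy as np

def ols_reconcile(S: np.ndarray, y_hat: np.ndarray) -> np.ndarray:
    """OLS reconciliation: project base forecasts onto the column space of S."""
    P = np.linalg.solve(S.T @ S, S.T)  # maps all base forecasts to bottom-level ones
    return S @ (P @ y_hat)

# Full hierarchy. Rows: [total, A, B, A1, A2, B1, B2]; columns: bottom series.
S_full = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
], dtype=float)
# Hypothetical, deliberately incoherent base forecasts for one timestamp.
y_hat_full = np.array([105.0, 48.0, 52.0, 20.0, 30.0, 25.0, 30.0])

# One subtree. Rows: [parent, child1, child2].
S_sub = np.array([[1, 1], [1, 0], [0, 1]], dtype=float)

rec_full = ols_reconcile(S_full, y_hat_full)        # single job, whole hierarchy
rec_A = ols_reconcile(S_sub, y_hat_full[[1, 3, 4]])  # job for subtree A
rec_B = ols_reconcile(S_sub, y_hat_full[[2, 5, 6]])  # job for subtree B

print("full hierarchy:", rec_full)
print("subtree A:", rec_A)
print("subtree B:", rec_B)
print("total implied by subtrees:", rec_A[0] + rec_B[0])
```

The subtree jobs are independent, so they can run on separate workers. Reconciling the level above them afterwards would need all of their outputs in one place again.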
m
and then a reconciliation for every 2 levels?
k
it might be hard. more of….you just get good forecasts for each hierarchical forecasting job 😂. not sure if you can reconcile those distributedly. i think it boils down to the same problem you have now
m
lol ok, ill keep experimenting
I don't even think my dataset is that large to be honest. Other people just use one giant compute?
k
yeah it’s certainly limited
btw are you free to chat? we can speed this up
m
yeah lets do that, u mind if i bring in my statistician?
k
no for sure. i’ll dm a google meet link
m
thx
m
Always bring your statistician! lol
m
@Max (Nixtla) we had a good call, we have some hypotheses about whether this is going to work or not xD
curious what you think, based on the little info you have. Using fugue.transform, will the reconciliation work when it's partitioned across multiple workers?
@Kevin Kho go big or go home
k
😂
m
still running
not sure if that's good news or bad news
k
oh man i think that’s too long
m
This is probably not going to work yet; currently HierarchicalForecast does not have a native distribution strategy