Mark

07/18/2023, 5:14 PM
def fugueReoncile(Y_hat_df: pd.DataFrame) -> pd.DataFrame:
    return hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_train_df, S=S_df, tags=tags)

Y_rec_df = transform(
    Y_hat_df.reset_index(),
    fugueReoncile,
    params={},
    schema=fugueSchema,
    partition={"by": "unique_id"},
    engine=spark,
)

should be fine, yeah?

Kevin Kho

07/18/2023, 5:21 PM
what are you trying to do here? use hierarchical forecasting on Spark?
does that work? I don’t think it will 😅

Mark

07/18/2023, 5:38 PM
@Kevin Kho yes 😄

Kevin Kho

07/18/2023, 5:38 PM
Oh really? wow. Are the forecasts reconciled fine?

Mark

07/18/2023, 5:38 PM
i doubt it
im guessing it'll only reconcile with the data within that partition
we'll see 😬
any recommendations? i get OOM errors if i try on just the driver node
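To illustrate the guess above that a partition only sees its own data, here is a minimal sketch using a toy bottom-up reconciler. It is not HierarchicalForecast's code; the series names, numbers, and the `bottom_up` helper are all made up for illustration.

```python
# Toy base forecasts for a single horizon step; hierarchy is total -> {A, B}.
# All names and numbers are hypothetical.
base_forecasts = {"total": 100.0, "total/A": 30.0, "total/B": 50.0}

# Which bottom-level series each series aggregates (a stand-in for a summing matrix).
summing = {
    "total": ["total/A", "total/B"],
    "total/A": ["total/A"],
    "total/B": ["total/B"],
}

def bottom_up(forecasts, summing):
    """Rebuild every series from the bottom-level forecasts visible in `forecasts`."""
    out = {}
    for uid in forecasts:
        children = summing[uid]
        missing = [c for c in children if c not in forecasts]
        if missing:
            raise KeyError(f"{uid} needs {missing}, which are not in this partition")
        out[uid] = sum(forecasts[c] for c in children)
    return out

# All series together: the total becomes coherent with its children.
reconciled = bottom_up(base_forecasts, summing)

# One partition per unique_id: the aggregate series cannot see its children.
failed = []
for uid, value in base_forecasts.items():
    try:
        bottom_up({uid: value}, summing)
    except KeyError:
        failed.append(uid)
```

With every series in one place, `reconciled["total"]` is rebuilt from A and B. With one series per partition, only the bottom-level series get through, and the aggregate ends up in `failed`.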

Kevin Kho

07/18/2023, 5:40 PM
that is my expectation too. so with hierarchical forecasting, you can partition one level down
and run distributedly like that, it won’t be perfect but it will speed things up
so if you can group your data somehow and run multiple hierarchical forecasting jobs
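A rough sketch of the "partition one level down" idea, assuming unique_ids are encoded as slash-separated paths. The ids, numbers, and both helper functions are hypothetical, and the reconciler is a toy bottom-up rule rather than anything from HierarchicalForecast.

```python
from collections import defaultdict

# Hypothetical unique_ids encoded as "country/region/store" paths.
base_forecasts = {
    "US": 205.0,
    "US/East": 100.0, "US/East/s1": 40.0, "US/East/s2": 50.0,
    "US/West": 120.0, "US/West/s3": 70.0, "US/West/s4": 40.0,
}

def subtree_key(uid, depth=2):
    """Return the ancestor at `depth` levels, or None for series above that level."""
    parts = uid.split("/")
    return "/".join(parts[:depth]) if len(parts) >= depth else None

# Partition one level down: each region and its stores land in the same group.
groups = defaultdict(dict)
for uid, value in base_forecasts.items():
    key = subtree_key(uid)
    if key is not None:
        groups[key][uid] = value

def bottom_up_subtree(forecasts, root):
    """Make the subtree root equal the sum of its leaves (toy bottom-up rule)."""
    leaves = {u: v for u, v in forecasts.items() if u != root}
    return {root: sum(leaves.values()), **leaves}

# Each group is independent, so this loop is the part an engine could fan out.
reconciled = {}
for root, forecasts in groups.items():
    reconciled.update(bottom_up_subtree(forecasts, root))

# The top-level series was in no group, so it is still unreconciled.
top_gap = base_forecasts["US"] - (reconciled["US/East"] + reconciled["US/West"])
```

Each region is now coherent with its own stores, but the `US` series never entered any group, so `top_gap` stays nonzero. That is the "won't be perfect" part of this approach.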

Mark

07/18/2023, 5:42 PM
and then a reconciliation for every 2 levels?

Kevin Kho

07/18/2023, 5:43 PM
it might be hard. more of….you just get good forecasts for each hierarchical forecasting job 😂. not sure if you can reconcile those distributedly. i think it boils down to the same problem you have now

Mark

07/18/2023, 5:44 PM
lol ok, ill keep experimenting
I don't even think my dataset is that large to be honest. Other people just use one giant compute?

Kevin Kho

07/18/2023, 5:45 PM
yeah it’s certainly limited
btw are you free to chat? we can speed this up

Mark

07/18/2023, 5:46 PM
yeah lets do that, u mind if i bring in my statistician?

Kevin Kho

07/18/2023, 5:46 PM
no for sure. i’ll dm a google meet link

Mark

07/18/2023, 5:47 PM
thx

Max (Nixtla)

07/18/2023, 6:33 PM
> yeah lets do that, u mind if i bring in my statistician?
Always bring your statistician! lol

Mark

07/18/2023, 6:48 PM
@Max (Nixtla) we had a good call, we have some hypotheses about whether this is going to work or not xD
curious what you think, based on the little info you have. Using fugue.transform, will the reconciliation work when it's partitioned across multiple workers?
@Kevin Kho go big or go home

Kevin Kho

07/18/2023, 6:59 PM
😂

Mark

07/19/2023, 11:58 AM
still running
not sure if that's good news or bad news

Kevin Kho

07/19/2023, 5:11 PM
oh man i think that’s too long

Max (Nixtla)

07/19/2023, 9:29 PM
This is probably not going to work yet. Currently HierarchicalForecast does not have a native distribution strategy.