Abel Kempynck
04/26/2024, 12:14 PM
`futr_df`.
Marco
04/26/2024, 12:48 PM
Does `futr_df` have the timestamps of the full day of 02-01-2023, and does the `unique_id` column match the one used in training?
Abel Kempynck
04/26/2024, 1:42 PM
Abel Kempynck
04/26/2024, 1:44 PM
Marco
04/26/2024, 2:05 PM
Abel Kempynck
04/26/2024, 2:07 PM
Abel Kempynck
04/26/2024, 2:07 PM
Marco
04/26/2024, 3:23 PM
Abel Kempynck
04/27/2024, 6:46 AM
Abel Kempynck
04/27/2024, 8:42 AM
ValueError: There are missing combinations of ids and times in `futr_df`.
You can run the `make_future_dataframe()` method to get the expected combinations or the `get_missing_future(futr_df)` method to get the missing combinations.
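The check behind this error can be illustrated without the library. The sketch below is a plain-pandas approximation, not NeuralForecast's implementation. It assumes hourly data and the default `unique_id`/`ds` column names, and `expected_future` and `missing_future` are hypothetical helpers: the first lists the `h` timestamps that follow each series' last training timestamp, the second reports which of those a given `futr_df` lacks.

```python
import pandas as pd

def expected_future(train: pd.DataFrame, h: int, freq: str = "h") -> pd.DataFrame:
    """Hypothetical helper: the h timestamps after each series' last training time."""
    last = train.groupby("unique_id")["ds"].max()
    frames = []
    for uid, last_ts in last.items():
        # date_range includes last_ts itself, so request h + 1 points and drop the first
        times = pd.date_range(last_ts, periods=h + 1, freq=freq)[1:]
        frames.append(pd.DataFrame({"unique_id": uid, "ds": times}))
    return pd.concat(frames, ignore_index=True)

def missing_future(expected: pd.DataFrame, futr_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: rows of the expected grid that futr_df does not contain."""
    merged = expected.merge(
        futr_df[["unique_id", "ds"]].drop_duplicates(),
        on=["unique_id", "ds"], how="left", indicator=True,
    )
    return merged.loc[merged["_merge"] == "left_only", ["unique_id", "ds"]].reset_index(drop=True)

# Toy data: training ends at 2022-12-31 23:00 for a single series.
train = pd.DataFrame({
    "unique_id": "A",
    "ds": pd.date_range("2022-12-30", "2022-12-31 23:00", freq="h"),
})
expected = expected_future(train, h=24)

# A futr_df for 2023-01-02 does not cover the expected day (2023-01-01).
futr_wrong_day = pd.DataFrame({
    "unique_id": "A",
    "ds": pd.date_range("2023-01-02", periods=24, freq="h"),
})
missing = missing_future(expected, futr_wrong_day)
```

With a fitted `NeuralForecast` object, the error message itself points to `make_future_dataframe()` and `get_missing_future(futr_df)` for the same purpose.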
Abel Kempynck
04/27/2024, 3:04 PM
Abel Kempynck
04/28/2024, 8:57 AM
Marco
04/29/2024, 12:58 PM
Abel Kempynck
04/29/2024, 1:50 PM
Marco
04/29/2024, 2:09 PM
`batch_size` and `valid_batch_size`.
Giovanni Perri
06/12/2024, 2:26 PM
`make_future_dataframe()` method to get the expected combinations or the `get_missing_future(futr_df)` method to get the missing combinations.) only in the multivariate case. In the univariate case, I can use a simple loop to predict the whole year and then calculate the error (it seems to work):
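horizon = 24    # forecast 24 hourly steps (one day)
past_days = 1   # days of history passed to each predict call

# Keep only the 2023 test period for the selected series
y_test = dataset_input.loc[dataset_input['Time_data'] >= '2023-01-01']
y_test = y_test.query('unique_id in @uids').reset_index(drop=True)
daily_test = y_test.groupby(y_test['Time_data'].dt.date)
y_test_list = []
y_pred_list = []

# Collect the observed values day by day
for date, day in daily_test:
    y_test_daily = day.copy()
    y_test_list.append(y_test_daily)

# Slide over the test set one day at a time and predict the next day
for i in range(past_days * horizon, len(y_test), horizon):
    y_test_block = y_test.iloc[i - past_days * horizon:i]
    y_pred = nf.predict(y_test_block)
    y_pred_list.append(y_pred)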
But in the multivariate case (I report only a snippet of the code):
daily_test = y_test.groupby(y_test['Time_data'].dt.date)
y_test_list = []
y_pred_list = []
futr_exog_list = ['Ora', 'GiornoSettimana', 'holiday', 'SOLAR', 'WIND']
hist_exog_list = ['LOAD', 'GAS', 'PAST_PRICE']

# Collect the observed values day by day
for date, day in daily_test:
    y_test_daily = day.copy()
    y_test_list.append(y_test_daily)

# Predict from the future exogenous features of the first test day only
y_pred = nf_multi.predict(futr_df=y_test_list[0][['unique_id', 'Time_data'] + futr_exog_list])
y_pred_list.append(y_pred)
y_pred = y_pred.reset_index()
I got the same error (ValueError: There are missing combinations of ids and times in `futr_df`. You can run the `make_future_dataframe()` method to get the expected combinations or the `get_missing_future(futr_df)` method to get the missing combinations), and I can only predict the first day of 2023.
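One plausible reading of this error, given that only the first day of 2023 can be predicted: when `predict` receives only `futr_df`, it forecasts the steps right after the data seen during `fit`, so a `futr_df` for any later day does not match the expected timestamps. A minimal sketch of a rolling alternative follows, assuming `predict` is also given the history up to each day through its `df` argument. `rolling_daily_predict`, `train_df`, and the `_EchoModel` stand-in are hypothetical names used only so the sketch runs without a trained model. It also assumes complete hourly days and a target column named `y`.

```python
import pandas as pd

def rolling_daily_predict(model, train_df, test_df, futr_cols, time_col="Time_data"):
    """Forecast one day at a time, feeding the model all data observed before that day."""
    full = pd.concat([train_df, test_df], ignore_index=True)
    preds = []
    for day, day_df in test_df.groupby(test_df[time_col].dt.normalize()):
        # Everything strictly before this day: target, historic and future features
        history = full[full[time_col] < day]
        # Only ids, timestamps, and features known in advance for the day to forecast
        futr = day_df[["unique_id", time_col] + futr_cols]
        preds.append(model.predict(df=history, futr_df=futr))
    return pd.concat(preds)

class _EchoModel:
    """Hypothetical stand-in for a fitted NeuralForecast object (not the real model)."""
    def predict(self, df, futr_df):
        out = futr_df[["unique_id", "Time_data"]].copy()
        # Naive forecast: repeat the last observed target value of the history
        out["pred"] = df["y"].iloc[-1]
        return out

# Toy data: one training day (2022-12-31) followed by two test days.
times = pd.date_range("2022-12-31", periods=72, freq="h")
data = pd.DataFrame({"unique_id": "A", "Time_data": times,
                     "y": range(72), "SOLAR": 0.0})
train_df = data[data["Time_data"] < "2023-01-01"]
test_df = data[data["Time_data"] >= "2023-01-01"]
y_pred = rolling_daily_predict(_EchoModel(), train_df, test_df, futr_cols=["SOLAR"])
```

In real use, `model` would be the fitted `nf_multi` object and `futr_cols` the `futr_exog_list` from the snippet above. The history frame has to carry the same target and exogenous columns that were used in training.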
I wanted to use cross-validation, but I have trouble fully understanding how it works, so I decided to use the simpler fit/predict methods.
Let me know if it's clear. (I also have a simpler question about exogenous variables. The site says, "_There is no need to add the target variable `y` and historic variables as they won’t be used by the model._" So will the model use only those features and ignore its own past? Also, can I use cross-validation with a multivariate dataset?)
Marco
06/12/2024, 2:53 PM
Giovanni Perri
06/12/2024, 3:03 PM
Marco
06/12/2024, 4:01 PM
Giovanni Perri
06/14/2024, 9:48 AM
horizon = 24  # one-day forecast
past_days = 1
A. y_df = the whole dataset (including 6 months of 2024)
nf_preds = nf.cross_validation(df=y_df, val_size=24*345, test_size=24*365//2, id_col='unique_id', verbose=True, refit=False)  # test_size is about 6 months
B. y_df = the dataset until 12-31-2023
nf_preds = nf.cross_validation(df=y_df, step_size=horizon, n_windows=365, id_col='unique_id', verbose=True, refit=False)
and then test the model using nf.predict on the 2024 dataset (and here is the question: how can I run a test on 6 months without looping?)
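The arithmetic behind evaluating six months in a single `cross_validation` call can be sketched as follows. The relation between `test_size`, `step_size`, and the number of windows is stated here as an assumption about how the library splits the data and should be checked against its documentation. `windows_for_test_size` and `mae_per_window` are hypothetical helpers, and the toy frame only imitates the long format (`unique_id`, time column, `cutoff`, `y`, one column per model) of a cross-validation result; `model` is a placeholder column name.

```python
import pandas as pd

def windows_for_test_size(test_size: int, h: int, step_size: int) -> int:
    """Number of forecast windows that fit in test_size (assumed relation)."""
    if (test_size - h) % step_size:
        raise ValueError("test_size - h must be a multiple of step_size")
    return (test_size - h) // step_size + 1

def mae_per_window(cv_df: pd.DataFrame, model_col: str) -> pd.Series:
    """Mean absolute error of one model for each cutoff (one cutoff per window)."""
    abs_err = (cv_df["y"] - cv_df[model_col]).abs()
    return abs_err.groupby(cv_df["cutoff"]).mean()

horizon = 24
# About six months of daily windows: 182 days of hourly data
n_windows = windows_for_test_size(test_size=24 * 182, h=horizon, step_size=horizon)

# Toy stand-in for a cross-validation result with two daily windows.
times = pd.date_range("2024-01-01", periods=48, freq="h")
cv_df = pd.DataFrame({
    "unique_id": "A",
    "Time_data": times,
    # Each window's cutoff is the last timestamp before its first forecast
    "cutoff": [times[0] - pd.Timedelta(hours=1)] * 24 + [times[23]] * 24,
    "y": [10.0] * 48,
    "model": [9.0] * 24 + [12.0] * 24,
})
window_mae = mae_per_window(cv_df, "model")
overall_mae = (cv_df["y"] - cv_df["model"]).abs().mean()
```

Under that assumption, six months of daily forecasts with `horizon = 24` would be one call with `step_size=horizon` and `n_windows=182` on a dataset that includes the 2024 months, followed by an error computation on the returned frame, with no explicit loop over days.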
I've obviously read your tutorial, but I still don't properly get how it's done. Thanks for your patience!!!
Marco
06/14/2024, 1:29 PM