Anthony Giorgio
02/25/2025, 11:48 AM
When predicting with new_df, the scaler statistics are recomputed for the unique_id values in the new dataframe (as seen in fcst.ts.target_transforms[0].scaler_.stats_), but the same does not happen for fcst.ts.uids.
In the provided example, after fitting the model on the M3 dataset and then applying it to the M4 dataset, fcst.ts.uids still contains the unique_id values from M3 instead of the updated values from M4.
import lightgbm as lgb
import numpy as np
import pandas as pd
from datasetsforecast.m3 import M3
from datasetsforecast.m4 import M4
from mlforecast import MLForecast
from mlforecast.target_transforms import LocalStandardScaler

# Load the monthly M3 (training) and M4 (transfer) datasets
Y_df_M3, _, _ = M3.load(directory='./', group='Monthly')
Y_df_M4, _, _ = M4.load(directory='./', group='Monthly')
Y_df_M4['ds'] = pd.to_datetime(Y_df_M4['ds'])

models = [lgb.LGBMRegressor(verbosity=-1)]
fcst = MLForecast(
    models=models,
    lags=range(1, 13),
    freq='MS',
    target_transforms=[LocalStandardScaler()],
)

# Fit on M3 and inspect the stored ids and scaler statistics
fcst.fit(Y_df_M3)
print('total M3 unique_id: ', len(Y_df_M3['unique_id'].unique()))
print('total uids before transfer learning: ', len(fcst.ts.uids))
print('scaler len before transfer learning: ', len(fcst.ts.target_transforms[0].scaler_.stats_))

# Transfer learning: predict on M4 with the model trained on M3
Y_hat_df = fcst.predict(h=12, new_df=Y_df_M4)
print('total M4 unique_id: ', len(Y_df_M4['unique_id'].unique()))
print('total uids after transfer learning: ', len(fcst.ts.uids))
print('scaler len after transfer learning: ', len(fcst.ts.target_transforms[0].scaler_.stats_))
total M3 unique_id: 1428
total uids before transfer learning: 1428
scaler len before transfer learning: 1428
total M4 unique_id: 48000
total uids after transfer learning: 1428
scaler len after transfer learning: 48000
It would be useful to have fcst.ts.uids updated to reflect the new unique_id values. This is particularly important for correctly retrieving scaler values when performing an inverse transform on SHAP values for the new predictions, as shown below:
# Create dictionary for stdscaler
scaler_dict = {
    unique_id: [
        scaler_stats[0],  # Mean
        scaler_stats[1],  # Std deviation
    ]
    for unique_id, scaler_stats in zip(
        fcst.ts.uids,  # Still contains M3 unique_ids
        fcst.ts.target_transforms[0].scaler_.stats_,
    )
}
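Until fcst.ts.uids reflects the new data, one possible workaround is to take the ids from the new dataframe itself. The sketch below is not part of mlforecast: build_scaler_dict and rescale_shap are hypothetical helpers, and the pairing relies on the assumption (not confirmed in this thread) that the rows of scaler_.stats_ follow the sorted unique_id values of new_df. The length check guards against the 1428 vs 48000 mismatch shown above. The SHAP rescaling assumes a plain standard scaler, where a contribution in scaled units maps back by multiplying by the series' std and the base value maps back as mean + std * base.

```python
def build_scaler_dict(unique_ids, stats):
    """Pair each id with its (mean, std) row.

    Assumption: the rows of `stats` are ordered like the sorted unique ids.
    """
    ids = sorted(set(unique_ids))
    if len(ids) != len(stats):
        raise ValueError(
            f"{len(ids)} unique ids but {len(stats)} scaler rows; "
            "the ids and the scaler come from different datasets"
        )
    return {uid: (float(row[0]), float(row[1])) for uid, row in zip(ids, stats)}


def rescale_shap(shap_values, base_value, mean, std):
    """Map SHAP values from standardized units back to the original scale."""
    rescaled = [value * std for value in shap_values]
    return rescaled, mean + std * base_value


# Toy data standing in for Y_df_M4['unique_id'] and scaler_.stats_
toy_ids = ["B", "A", "B", "A"]
toy_stats = [[10.0, 2.0], [50.0, 5.0]]  # rows: [mean, std]
scaler_dict = build_scaler_dict(toy_ids, toy_stats)

mean_a, std_a = scaler_dict["A"]
shap_a, base_a = rescale_shap([0.5, -1.0], 0.25, mean_a, std_a)
```

With the objects from the example above, the call would look like build_scaler_dict(Y_df_M4['unique_id'], fcst.ts.target_transforms[0].scaler_.stats_), again under the ordering assumption.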
I have opened an issue here: https://github.com/Nixtla/mlforecast/issues/484
jan rathfelder
02/25/2025, 12:13 PM
José Morales
02/25/2025, 2:32 PM
Anthony Giorgio
02/25/2025, 2:47 PM
jan rathfelder
02/25/2025, 2:53 PM
Anthony Giorgio
02/25/2025, 2:54 PM
Anthony Giorgio
02/25/2025, 2:55 PM
José Morales
02/25/2025, 2:57 PM
Anthony Giorgio
02/25/2025, 3:06 PM
José Morales
02/25/2025, 3:08 PM
Anthony Giorgio
02/25/2025, 3:16 PM