Nasreddine D
05/14/2023, 6:29 PM

fede (nixtla) (they/them)
05/17/2023, 10:10 PM
cross_validation method to optimize the hyperparameters and features considering different windows.
Nasreddine D
05/22/2023, 9:00 AM
Alpha hyperparameter of Lasso Model with a cross_validation
3. This best model, with the best alpha, will automatically select the best features (putting a 0 coefficient on "useless" features).
4. Then I can train a LGBM model with the best features selected in the steps above.
When I performed the first 3 steps, the model selected the best alpha, but all the feature coefficients were 0 (so basically no features were selected). Please find below the code I used.
import optuna
import numpy as np
from sklearn.linear_model import Lasso
from mlforecast import MLForecast
from mlforecast.target_transforms import Differences, LocalStandardScaler
from window_ops.rolling import rolling_mean, rolling_max, rolling_min

# Hyperparameter tuning
def objective(trial):
    alpha = trial.suggest_float('alpha', 0.00001, 1, log=True)
    models = [Lasso(alpha=alpha, random_state=0, max_iter=5000)]
    # A dict can't repeat keys, so listing the same lag several times keeps
    # only the last entry; all windows for a lag must go in a single list.
    roll_fns = [rolling_mean, rolling_max, rolling_min]
    lag_transforms = {
        lag: [(fn, w) for w in (6, 12, 24) for fn in roll_fns]
        for lag in (1, 2, 3, 6)
    }
    lag_transforms[12] = [(fn, w) for w in (6, 12) for fn in roll_fns]
    mlf = MLForecast(
        models=models,
        freq=1,
        # LocalStandardScaler is mlforecast's per-serie scaler; sklearn's
        # StandardScaler doesn't implement the target-transform interface
        target_transforms=[Differences([12]), LocalStandardScaler()],
        lags=np.arange(1, 37),
        lag_transforms=lag_transforms,
    )
    crossvalidation_df = mlf.cross_validation(
        data=Y_ts,
        window_size=24,
        n_windows=30,
        step_size=1,
    )
    # rmse and Y_ts are defined earlier in the notebook
    cv_rmse = (
        crossvalidation_df.groupby('cutoff')
        .apply(lambda x: rmse(x['y'].values, x['Lasso'].values))
        .mean()
    )
    # Optuna expects a plain float, not a one-element DataFrame
    return float(cv_rmse)

study_lasso = optuna.create_study(direction='minimize')
study_lasso.optimize(objective, n_trials=50)
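The per-cutoff scoring inside the objective boils down to a pandas groupby followed by a mean; a minimal, self-contained sketch with toy numbers (the `cv_df` values and the `rmse` helper here are illustrative, standing in for the real cross-validation output):

```python
import numpy as np
import pandas as pd

# Hypothetical cross-validation output: one row per (cutoff, step),
# with actuals in 'y' and model predictions in 'Lasso'.
cv_df = pd.DataFrame({
    'cutoff': [1, 1, 2, 2],
    'y':      [10.0, 12.0, 11.0, 13.0],
    'Lasso':  [11.0, 12.0, 10.0, 13.0],
})

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

# RMSE per cutoff window, then averaged across windows
per_cutoff = cv_df.groupby('cutoff').apply(
    lambda g: rmse(g['y'].values, g['Lasso'].values)
)
cv_rmse = float(per_cutoff.mean())  # cv_rmse == sqrt(0.5) ≈ 0.7071
print(cv_rmse)
```

Returning a plain float matters: an Optuna objective must return a number, so a trailing `.to_frame()` (which leaves a one-element frame) should be dropped or unwrapped with `float(...)`.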
# mlf_lasso: MLForecast refit with the best alpha (fitting step not shown)
mlf_lasso.models_["Lasso"].coef_
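To go from the fitted Lasso's coefficients to step 4's feature list, the non-zero entries of `coef_` mark the selected features. A minimal sketch on synthetic data (the feature matrix, alpha, and seed are all illustrative, not part of the thread's pipeline):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Toy design matrix: only the first two columns actually drive the target.
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=1.0, size=200)

# Standardize features so the L1 penalty treats them comparably.
X_scaled = StandardScaler().fit_transform(X)

lasso = Lasso(alpha=0.2, random_state=0, max_iter=5000).fit(X_scaled, y)

# Features with a non-zero coefficient are the ones Lasso "selected";
# the irrelevant columns get shrunk exactly to 0.
selected = np.flatnonzero(lasso.coef_)
print(selected)
```

If every coefficient comes back 0, the penalty is dominating the signal: on standardized targets, an alpha near the top of the search range can zero out everything, which is consistent with the symptom described above.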