Dimitris Floros
02/27/2024, 4:51 PM

# Set up the NN model
from sklearn.base import BaseEstimator

class NN(BaseEstimator):
    def fit(self, X, y):
        print(X.columns)
        return self

    def predict(self, X):
        print(X.columns)
        return X['lag1']  # dummy for now
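mlforecast accepts any object with sklearn-style fit/predict, so a dummy estimator like the one above is a valid way to inspect the generated features. A runnable, self-contained version (the toy feature matrix and the `feature_names_` attribute are my own illustration; only the 'lag1' column name comes from the snippet above):

```python
import pandas as pd
from sklearn.base import BaseEstimator, RegressorMixin

class NN(BaseEstimator, RegressorMixin):
    """Dummy estimator: 'predicts' the lag-1 feature unchanged."""
    def fit(self, X, y):
        self.feature_names_ = list(X.columns)  # remember training features
        return self

    def predict(self, X):
        return X['lag1'].to_numpy()

# Toy feature matrix of the kind mlforecast's preprocessing produces
X = pd.DataFrame({'lag1': [1.0, 2.0, 3.0], 'lag7': [0.5, 1.5, 2.5]})
y = pd.Series([1.1, 2.1, 3.1])

model = NN().fit(X, y)
preds = model.predict(X)
print(preds)  # the lag1 column back: [1. 2. 3.]
```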
Davide Cangelosi
03/05/2024, 10:45 AM

Davide Cangelosi
03/05/2024, 10:47 AM

Davide Cangelosi
03/05/2024, 10:48 AM

jan rathfelder
03/05/2024, 8:17 PM

jan rathfelder
03/07/2024, 10:59 AM

Krystian W.
03/09/2024, 6:56 PM

Ryan Grosskopf
03/11/2024, 7:00 PM
03/11/2024, 7:00 PMpredict(ids=XXX)
parameter with the same model on the same machine, it takes about 25s. I was expecting prediction to scale roughly linearly with a single series prediction in this case taking <100 ms as I've seen using DARTS' LightGBM prediction implementation. Any idea where I'm going wrong? 0.5s per prediction step isn't really usable. More details in đ§”...Ryan Grosskopf
03/11/2024, 8:44 PM

Do you see the MLForecast class as something to be used in a moderate-performance production environment, with its .update and new_data features, or was the intent to have it be more of a development aid? I've not brought a LightGBM time series model into production before, so I'm trying to decide how much to lean on the work you've done versus rolling my own. Any thoughts or resources would be very much appreciated.

jan rathfelder
03/12/2024, 8:38 AM

Vítor Barbosa
03/21/2024, 12:56 AM

import pandas as pd
from mlforecast import MLForecast
from sklearn.linear_model import LinearRegression

# Dataset source: <https://www.kaggle.com/datasets/felsal/ibovespa-stocks?resource=download>
ibov = pd.read_csv('../datasets/ibovespa/archive/b3_stocks_1994_2020.csv', parse_dates=['datetime'])

def train_test_split(df: pd.DataFrame, size_test=0.3, sort_by_col='datetime'):
    df_sorted = df.sort_values(by=sort_by_col)
    split = int(len(df) * size_test)
    return df_sorted.iloc[:-split], df_sorted.iloc[-split:]

# Freq values: <https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases>
forecast_ml = MLForecast(models=[LinearRegression()], lags=[5], freq='B')

# How to use multivariate targets? Only using close for now
ibov_train, ibov_test = train_test_split(ibov, 0.3)
forecast_ml.fit(ibov_train, id_col='ticker', time_col='datetime', target_col='close')

# Predict close values for next 5 days
preds = forecast_ml.predict(h=5, new_df=ibov_test)
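For intuition on what lags=[5] in the snippet above produces: it is equivalent to a per-series shift of the target column. A pure-pandas sketch with made-up prices (column names follow the snippet; this is not mlforecast's internal code):

```python
import pandas as pd

# Toy per-ticker price data (made up for illustration)
ibov = pd.DataFrame({
    'ticker': ['A'] * 6 + ['B'] * 6,
    'datetime': list(pd.date_range('2020-01-01', periods=6, freq='B')) * 2,
    'close': [10, 11, 12, 13, 14, 15, 20, 21, 22, 23, 24, 25],
})

# lags=[5] builds, per series, the target shifted by 5 steps
ibov['lag5'] = ibov.groupby('ticker')['close'].shift(5)
rows = ibov.dropna()
print(rows)
# For ticker A the only complete row has close=15, lag5=10;
# for ticker B it has close=25, lag5=20.
```

The first 5 rows of every series become NaN and are dropped before training, which is why large lags shrink the usable training set.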
Andreas Kaae
03/21/2024, 11:47 AM

I have a question about SeasonalRolling - I have a hard time understanding exactly what it does and how it is linked to the season_length and window_size parameters. Can anyone help clarify this or link to some more in-depth explanations? 🙂

Makarand Batchu
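A seasonal rolling transform applies an ordinary rolling statistic, but only over observations that share the same phase of the season. A plain-numpy sketch of what I understand a seasonal rolling mean with season_length=s and window_size=w to compute (this function is my own illustration, not mlforecast's implementation; mlforecast applies its transforms on top of a lag, so exact offsets can differ by the lag amount):

```python
import numpy as np

def seasonal_rolling_mean(x, season_length, window_size):
    """Mean of the last `window_size` values at the same seasonal
    offset, i.e. x[t - s], x[t - 2s], ..., x[t - w*s]."""
    out = np.full(len(x), np.nan)
    for t in range(len(x)):
        idx = [t - k * season_length for k in range(1, window_size + 1)]
        if min(idx) >= 0:
            out[t] = np.mean([x[i] for i in idx])
    return out

# Daily data with a weekly season: season_length=7, window_size=2
# averages the value on the same weekday over the previous 2 weeks.
x = np.arange(21, dtype=float)  # 0..20
result = seasonal_rolling_mean(x, season_length=7, window_size=2)
print(result[14])  # mean of x[7] and x[0] -> 3.5
```

So season_length picks *which* past points are comparable (same weekday, same month, ...) and window_size says *how many* of those seasonal periods to average over.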
03/21/2024, 12:20 PM

lgb_params = {
    'verbosity': -1,
    'num_leaves': 512,
}
models = {
    'KNeighborsRegressor': KNeighborsRegressor(),
    'Lasso': Lasso(),
    'LinearRegression': LinearRegression(),
    'MLPRegressor': MLPRegressor(),
    'Ridge': Ridge(),
    'DT': DecisionTreeRegressor(),
    'avg': lgb.LGBMRegressor(**lgb_params),
    'q75': lgb.LGBMRegressor(**lgb_params, objective='quantile', alpha=0.75),
    'q25': lgb.LGBMRegressor(**lgb_params, objective='quantile', alpha=0.25),
}
Is there a way to hyperparameter-tune these individual models using MLForecast or any other offering by Nixtla?
I understand there is the option of using traditional approaches like GridSearchCV, but the process would become very complicated because I have multiple unique_ids for which I want to forecast.
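Any search loop can be driven by the same rolling-origin evaluation that MLForecast.cross_validation performs: score each candidate over several expanding-window splits and keep the best. A self-contained sketch with hand-built lag features and a Ridge alpha grid (the data, grid, and helper names are all illustrative; note it scores one-step-style predictions from true lagged values rather than recursive forecasts, and I believe recent mlforecast releases also ship an optuna-based auto-tuning module worth checking in the docs):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Made-up series; parameter grid below is illustrative.
rng = np.random.default_rng(0)
y = np.sin(np.arange(200) / 10) + rng.normal(0, 0.1, 200)

def make_lag_matrix(y, lags):
    # Row t holds y[t - l] for each lag l; drop the wrap-around rows.
    X = np.column_stack([np.roll(y, l) for l in lags])
    return X[max(lags):], y[max(lags):]

X, target = make_lag_matrix(y, lags=[1, 2, 3])

def rolling_origin_cv_score(model, X, y, n_windows=3, h=10):
    """Average MAE over expanding-window splits, mimicking the
    evaluation scheme of a forecasting cross-validation."""
    errors = []
    for w in range(n_windows, 0, -1):
        cut = len(y) - w * h
        model.fit(X[:cut], y[:cut])
        preds = model.predict(X[cut:cut + h])
        errors.append(np.mean(np.abs(preds - y[cut:cut + h])))
    return float(np.mean(errors))

scores = {alpha: rolling_origin_cv_score(Ridge(alpha=alpha), X, target)
          for alpha in [0.01, 0.1, 1.0, 10.0]}
best_alpha = min(scores, key=scores.get)
print(best_alpha, scores[best_alpha])
```

Because the splits are per-timestamp rather than per-row, all unique_ids can share one split schedule, which sidesteps the GridSearchCV awkwardness mentioned above.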
Thanks in advance!

Andreas Kaae
03/25/2024, 12:59 PM

I am working with the MLForecast library and have arrived at implementing the usage of PredictionIntervals - however, I am unsure about the n_windows parameter, which represents the number of cross-validation windows used to calibrate the intervals. Is there anywhere I can find a more elaborate explanation of this? Also, I am trying to find the source code for this implementation - can anyone point me towards it?

Vítor Barbosa
03/25/2024, 9:35 PM

Does MLForecast support historic covariates? If so, how can we supply them? I can only find mention and explanation of static and future covariates.

Andreas Kaae
03/26/2024, 11:47 AM

.update() not working:
Hello, I have the following example:

from datasetsforecast.m4 import M4
from lightgbm import LGBMRegressor
from mlforecast import MLForecast
import random

await M4.async_download('data', group='Hourly')
df, *_ = M4.load('data', 'Hourly')
uids = df['unique_id'].unique()
random.seed(0)
sample_uids = random.choices(uids, k=2)
df = df[df['unique_id'].isin(sample_uids)].reset_index(drop=True)
df['ds'] = df['ds'].astype('int64')

fcst = MLForecast(
    models=LGBMRegressor(),
    freq=1,
)
df_train_M4 = df[df["ds"] <= 1000]
fcst.fit(df_train_M4)
fcst.predict(4)

When I then try to use the update functionality, I get `AttributeError: 'MLForecast' object has no attribute 'update'`:

new_values = df[df["ds"] > 1000]
fcst.update(new_values)

Can anyone see what I'm doing wrong? I am using mlforecast.__version__ == '0.12.0'.
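The method may simply not exist in that release (docs built from a newer version can show APIs your installed version lacks, so comparing the docs version against mlforecast.__version__ is worth doing first; this is an assumption, not a diagnosis). A version-independent workaround is to fold the new observations into the training frame and refit - a pure-pandas sketch with toy stand-ins for the M4 frames above:

```python
import pandas as pd

# Toy stand-ins for df_train_M4 and new_values from the example above
df_train = pd.DataFrame({'unique_id': ['H1'] * 3, 'ds': [1, 2, 3],
                         'y': [10.0, 11.0, 12.0]})
new_values = pd.DataFrame({'unique_id': ['H1'] * 2, 'ds': [4, 5],
                           'y': [13.0, 14.0]})

# Append the newly observed values, dropping any duplicated timestamps
# in favour of the newest observation; then refit on `updated`.
updated = (
    pd.concat([df_train, new_values])
    .drop_duplicates(subset=['unique_id', 'ds'], keep='last')
    .sort_values(['unique_id', 'ds'])
    .reset_index(drop=True)
)
print(len(updated))  # 5 rows
```

Refitting is more expensive than an in-place update, but it keeps behaviour identical across library versions.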
Andreas Kaae
03/27/2024, 10:03 AM

Is it possible to access the attributes that the lightgbm library gives you access to, such as feature_importances_, the number of boosting iterations performed (n_iter_), etc.? I am aware of features_order_ in mlforecast, but this only lists the features used for training.

Makarand Batchu
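If I remember the API correctly, MLForecast keeps the fitted estimators in a models_ mapping after fit, so the usual estimator attributes should be reachable as e.g. fcst.models_['LGBMRegressor'].feature_importances_ - treat that attribute path as something to verify against your installed version. The general pattern, shown with a plain sklearn tree so it runs without lightgbm (data and feature names are made up):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Stand-in for a fitted lag-feature matrix; names are illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2 * X[:, 0] + rng.normal(0, 0.1, 100)  # only the first feature matters
feature_names = ['lag1', 'lag7', 'month']

model = DecisionTreeRegressor(random_state=0).fit(X, y)
importances = dict(zip(feature_names, model.feature_importances_))
print(max(importances, key=importances.get))  # 'lag1' dominates
```

Pairing the importances with the feature names (features_order_ in mlforecast's case) is what makes the raw array interpretable.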
03/28/2024, 11:15 AM

I am using PredictionIntervals for my mlforecast model to get prediction intervals. Can someone please explain how to choose the h and n_windows values, since this seems to be different from cross-validation?
I have a training dataset of 63 days and wish to predict for a horizon of 31 days. I am using the models below, and when I set prediction_intervals=PredictionIntervals(h=10, n_windows=2), .fit() fails with this error:

ValueError: Found array with 0 sample(s) (shape=(0, 6)) while a minimum of 1 is required by KNeighborsRegressor.
models = {
    'KNeighborsRegressor': KNeighborsRegressor(),
    'Lasso': Lasso(),
    'LinearRegression': LinearRegression(),
    'MLPRegressor': MLPRegressor(),
    'Ridge': Ridge(),
    'DT': DecisionTreeRegressor(),
    'avg': lgb.LGBMRegressor(**lgb_params),
    'q75': lgb.LGBMRegressor(**lgb_params, objective='quantile', alpha=0.75),
    'q25': lgb.LGBMRegressor(**lgb_params, objective='quantile', alpha=0.25),
}
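The interval calibration runs n_windows internal cross-validation fits, each holding out h points per series, so the earliest fit trains on roughly len(series) - n_windows * h rows - and lag/rolling features consume further rows from the front. A back-of-the-envelope check for the 63-day setup above (the max_lag value is a guess for illustration; substitute your actual largest lag or rolling window):

```python
# Rough row budget for a single 63-day series (illustrative arithmetic)
n_train = 63
h = 10
n_windows = 2
max_lag = 43          # hypothetical: largest lag / rolling window used

# The earliest internal CV fit sees the series minus all held-out windows
earliest_fit = n_train - n_windows * h       # 43 rows remain
# Lag features of order `max_lag` leave no complete rows before that cut
usable_rows = earliest_fit - max_lag         # 0 -> the sklearn ValueError
print(earliest_fit, usable_rows)
```

If usable_rows reaches zero, the fixes are to shrink h or n_windows, reduce the largest lag/window in the feature configuration, or supply more history.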
Andreas Kaae
04/02/2024, 12:50 PM

I am trying to use from mlforecast.distributed import DistributedMLForecast and from mlforecast.distributed.models.spark.lgb import SparkLGBMForecast. I have installed the library on the cluster I am using (Databricks), but I get an ImportError, and I cannot figure out whether I am using the library wrong or something is wrong with my cluster:

ImportError: cannot import name 'DistributedMLForecast' from 'mlforecast.distributed' (/local_disk0/.ephemeral_nfs/cluster_libraries/python/lib/python3.10/site-packages/mlforecast/distributed/__init__.py)
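One common cause (an assumption worth verifying against the install docs): the distributed classes are only exposed when the optional distributed backends are importable, so a bare pip install of mlforecast on the cluster will leave mlforecast.distributed's __init__ without DistributedMLForecast; the docs describe installing extras such as mlforecast[spark]. A quick stdlib check of what the environment can actually import before blaming the cluster (module names here are the commonly used ones; adjust for your setup):

```python
import importlib.util

# Probe the backends mlforecast's distributed module relies on
status = {}
for mod in ['mlforecast', 'pyspark', 'fugue']:
    spec = importlib.util.find_spec(mod)
    status[mod] = 'found' if spec else 'MISSING'
print(status)
```

If any backend shows MISSING on the workers as well as the driver, the import error is an environment problem rather than an API misuse.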
Guillaume GALIE
04/02/2024, 2:15 PM

Vítor Barbosa
04/02/2024, 9:38 PM

Brian Head
04/04/2024, 7:20 PM

Is it possible to use the transform_exog function within the MLForecast function like you do lag_transforms? So far my efforts haven't worked, and when I apply it beforehand with preprocessing, I have issues at the predict step, where it expects me to supply the exogenous variables. I do have other exogenous variables I am supplying, but I'm also using transform_exog on them. Hope this makes sense.
The documentation doesn't seem to show it that way - only preprocessing the series data. I was able to use that to figure out how to do it with the preprocessing mentioned above, but I'm hoping there's something I've missed that lets me do it within the MLForecast function.

Andreas Kaae
04/08/2024, 12:46 PM

Has anyone tried using hyperopt in combination with MLForecast?
I have tried some smaller experiments using hyperopt, but it fails as the library does not have access to the loss function.

Vítor Barbosa
04/08/2024, 8:28 PM

horizon = 12
models = [LinearRegression(), BayesianRidge(), lgb.LGBMRegressor(verbosity=-1)]  # xgb.XGBRegressor(verbosity=0)
forecast_ml = MLForecast(models=models,
                         lags=range(1, horizon + 1),
                         freq='B')

See attached the output and cross-validation. I have also added the cross-validation for the same settings on darts.
Now if I use the differences transform it looks better, but quite noisy:

forecast_ml = MLForecast(models=models,
                         lags=range(1, horizon + 1),
                         lag_transforms={
                             1: [ExpandingMean()],
                             horizon: [RollingMean(window_size=horizon)],
                         },
                         freq='B',
                         target_transforms=[Differences([1, horizon])])
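For intuition on the configuration above: Differences([1, horizon]) chains two differencing passes on the target before any lag features are built, and the inverse transform is applied to the predictions. A pure-pandas sketch of the forward pass (toy series and names are my own):

```python
import pandas as pd

horizon = 12
y = pd.Series(range(40), dtype=float)  # toy target with a linear trend

# Differences([1, horizon]): first remove the trend with a lag-1 diff,
# then remove period-`horizon` structure with a second diff.
d1 = y.diff(1)
d2 = d1.diff(horizon)
print(d2.dropna().iloc[0])  # a pure linear trend differences away to 0.0
```

Each pass consumes rows from the front (1 + horizon here), which matters on short series, and differencing amplifies high-frequency noise - one plausible reason the differenced forecasts look better on level but noisier.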
Any tips?

victor michon
04/10/2024, 5:34 PM

Vítor Barbosa
04/14/2024, 6:05 PM

Dinis Timoteo
04/16/2024, 3:44 PM

Vidar Ingason
04/16/2024, 5:59 PM

jan rathfelder
04/18/2024, 2:17 PM

jan rathfelder
04/19/2024, 8:56 PM