Slackbot
03/17/2023, 1:12 AM
Max (Nixtla)
03/17/2023, 1:08 PM
José Morales
03/17/2023, 5:10 PM
Arsa Nikzad
03/17/2023, 6:09 PM
rolling_std returns NaN when the target contains a run of consecutive zeros longer than the window size. The root cause seems to be _rolling_std in window_ops.rolling, which generates large negative numbers in this situation; these negative numbers are then converted to NaN. An example is attached.
import lightgbm as lgb
import numpy as np
import pandas as pd
from mlforecast import MLForecast
from window_ops.rolling import rolling_std

# Monthly series with a run of four consecutive zeros, longer than the window of 3
data = pd.DataFrame({
    'date': pd.date_range(start='2019-01-01', end='2020-12-31', freq='MS'),
    'sprid': 1.,
    'target': [1., 2., 0., 4., 0., 0., 0., 0., 9., 10., 11., 12.] * 2
})
models = [lgb.LGBMRegressor()]
fcst = MLForecast(
    models=models,
    freq='MS',
    lags=[1],
    lag_transforms={
        1: [(rolling_std, 3)]
    }
)
preprocessed_df = fcst.preprocess(data, id_col='sprid', time_col='date', target_col='target', dropna=False)
print(preprocessed_df)

# Check _rolling_std directly on the same values
from window_ops.rolling import _rolling_std
a = np.array([1, 2, 0, 4, 0, 0, 0, 0, 9, 10, 11, 12] * 2)
print(_rolling_std(a, 3))
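For comparison, here is a rough sketch of the values one might expect on the same array, computed window by window with plain NumPy. It assumes a sample standard deviation (ddof=1) is intended, which is an assumption and not something confirmed from window_ops. The helper names reference_rolling_std and safe_std_from_variance are hypothetical, and the "drifted" variances at the end are made-up values that only illustrate how a slightly negative variance turns into NaN under a square root, and how clamping at zero would avoid it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def reference_rolling_std(x, window_size, ddof=1):
    """Two-pass rolling std (hypothetical helper, not part of window_ops).

    The first window_size - 1 entries are NaN because the window is incomplete.
    ddof=1 is an assumption about the intended definition.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if x.size >= window_size:
        windows = sliding_window_view(x, window_size)
        out[window_size - 1:] = windows.std(axis=1, ddof=ddof)
    return out


def safe_std_from_variance(var):
    """Clamp negative variances to zero before the square root.

    np.maximum propagates NaN, so genuine warm-up NaNs are preserved.
    """
    var = np.asarray(var, dtype=float)
    return np.sqrt(np.maximum(var, 0.0))


a = np.array([1, 2, 0, 4, 0, 0, 0, 0, 9, 10, 11, 12] * 2)
expected = reference_rolling_std(a, 3)
print(expected)  # all-zero windows give 0.0 here, not NaN

# Made-up variances, one of them slightly negative as if from rounding drift
drifted = np.array([1.0, -1e-16, 0.0, 4.0])
with np.errstate(invalid="ignore"):
    unguarded = np.sqrt(drifted)          # negative entry becomes NaN
guarded = safe_std_from_variance(drifted)  # negative entry becomes 0.0
print(unguarded, guarded)
```

Recomputing every window from scratch is slower than an incremental update, but it cannot accumulate error from earlier windows, so it is handy as a check against the output of _rolling_std.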
José Morales
03/17/2023, 6:33 PM
Arsa Nikzad
03/17/2023, 6:36 PM
rolling_std should generate zeros instead of NaN for these cases.
Max (Nixtla)
03/17/2023, 6:37 PM
Arsa Nikzad
03/17/2023, 6:38 PM
José Morales
03/22/2023, 3:36 AM
Arsa Nikzad
03/22/2023, 1:08 PM