# mlforecast
j
Hi 🙂 I'm trying to better understand how the "update" functionality works. I have an hourly forecasting model trained up until 2024-01-01 00:00, with a maximum forecast horizon of 96 hours and using lagged features. When I update the model with new data (using the columns `time_col`, `id_col`, and `target_col`, which in my case are `dt_hour_local`, `segment`, and `total_consumption_kwh`), for example with the following record
`[2024-03-10 00:00, segmentA, 1.000]`
I'm able to forecast starting from this updated time step. My questions are:
1. Why is this possible even though there's a significant gap between the last training point (2024-01-01) and the update point (2024-03-10)?
2. What happens to the missing data between these two dates (2024-01-01 and 2024-03-10)? Is the model performing some kind of internal imputation or rolling forecast for those missing values before forecasting the future?
Here is a depiction of what happens if you are not cautious about the model update:
Blue = true
Orange = prediction result when forecasting with a "weird" model update, i.e. with a gap between the last date seen by the model (`model.ts.last_dates[0]`) and the forecast start
Green = result of the correct update
j
Hey. What happens is that the new value (`1.0`) is appended to the historic targets and the last observed date is set to `2024-03-10`, so the forecasts start from that point. That's wrong though, since the lag features won't be computed correctly. There's currently no validation on the dates that are provided (issue).
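Until that validation exists, a check along these lines before calling update could catch the gap. This is only a rough sketch in plain Python, not mlforecast code: `missing_steps` is a hypothetical helper, the history values are made up, and the list-based "lag" lookup just imitates the append-to-the-end behaviour described above.

```python
from datetime import datetime, timedelta

FREQ = timedelta(hours=1)  # assumed hourly series, as in the question


def missing_steps(last_seen, first_new, freq=FREQ):
    """Count the timestamps skipped between the stored history and the new data.

    Hypothetical helper: 0 means the new data continues the series exactly.
    """
    expected = last_seen + freq
    if first_new < expected:
        raise ValueError("new data overlaps or precedes the stored history")
    gap = first_new - expected
    if gap % freq:
        raise ValueError("new timestamp is not aligned to the series frequency")
    return gap // freq


last_seen = datetime(2024, 1, 1, 0)   # last training timestamp
first_new = datetime(2024, 3, 10, 0)  # timestamp of the update record
skipped = missing_steps(last_seen, first_new)

# Imitate the described behaviour: the new value is simply appended,
# so lags are taken by position, not by timestamp (values are illustrative).
history = [(datetime(2023, 12, 31, 23), 5.0), (last_seen, 6.0)]
history.append((first_new, 1.0))

next_step = first_new + FREQ
positional_lag2_time = history[-2][0]   # what a position-based lag 2 points at
true_lag2_time = next_step - 2 * FREQ   # what lag 2 should point at

if skipped:
    print(f"{skipped} hourly steps are missing before {first_new}")
    print(f"lag 2 would use {positional_lag2_time}, expected {true_lag2_time}")
```

For the dates in this thread the check reports 1655 missing hourly steps, and lag 2 would silently read the 2024-01-01 00:00 value instead of the one for 2024-03-09 23:00. So the fix on your side is to pass every missing hour (real or imputed) in the update, so that the first new timestamp is exactly one step after the last one the model has seen.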
j
Ah I see, thank you very much!