Johannes Emme
06/29/2024, 2:46 PM
cs_df and the true target plotted against each other. From this plot, it can be seen that my model is okay at predicting the weekends but has clear difficulties in predicting the Mondays.
However, when I used the model for predictions (see plot 2), the uncertainty for the weekends was very large, while the Mondays had small uncertainty. (In plot 2 I forgot the legend: black = true, blue = mean prediction, purple = 10th and 90th percentiles.)
What I have come to realize is that the problem arises from a misalignment between the conformal horizon and the horizon I am predicting over. With a conformal horizon of 96, the errors collected for a specific timestep do not “belong to the same timeslot.” For instance, the first error in the first window corresponds to Monday 00:00, while for the next window, the first hour is Friday 00:00, then Tuesday 00:00, and so on. Hence, when I predict the consumption during Saturday, the quantiles are based on errors from several different days and hours and not on “Saturday hour errors.”
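To make the misalignment concrete, here is a small sketch that lists the weekday on which each consecutive conformal window would start for a given horizon. It assumes hourly data and back-to-back windows whose first window starts on a Monday at 00:00; the helper name window_start_days and the example start date are hypothetical and not part of any library.

```python
import pandas as pd


def window_start_days(h, n_windows, first_start="2024-01-01 00:00"):
    """Weekday names of the first timestamp of each consecutive window.

    Assumes hourly data and windows of length h placed back to back.
    2024-01-01 is a Monday, matching the example in the message.
    """
    starts = pd.date_range(first_start, periods=n_windows, freq=pd.Timedelta(hours=h))
    return [ts.day_name() for ts in starts]


# A horizon of 96 hours shifts the start by four days per window,
# so the "first step" errors mix different weekdays.
print(window_start_days(96, 3))

# A horizon of 168 hours (one week) keeps every window on the same weekday.
print(window_start_days(168, 3))
```

With h=96 the first step of each window lands on Monday, Friday, Tuesday, whereas with h=168 it is always Monday, which is why the weekly horizon lines the errors up with the right timeslots.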
To overcome this issue, I set the conformal horizon to 24*7 (168) so that my conformal windows start on the same day as the one I am predicting from. Then I get the following result (see plots 3 and 4), where the uncertainty is low for the weekends and high for the Mondays. However, I do not believe this is a sustainable solution. Unfortunately, I don't have a great alternative either. For now, I have simply rewritten the _add_conformal_distribution_intervals function for my case by:
1. Requiring that n_windows*h >= 168 to have all hours in the week represented.
2. Joining the cs_df and fcst_df on day_of_week and hour.
3. Subtracting the errors from and adding them to the mean prediction to get a distribution around each hour, and then calculating the quantiles.
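The three steps above could be sketched roughly as follows. This is not the library's _add_conformal_distribution_intervals, only a minimal pandas illustration under several assumptions: a single series, a cs_df with a datetime column ds and an absolute-error column score, a fcst_df with ds and one column holding the mean prediction, and output column names that are made up for this example. The function name seasonal_conformal_intervals is likewise hypothetical.

```python
import pandas as pd


def seasonal_conformal_intervals(cs_df, fcst_df, model, quantiles=(0.1, 0.9)):
    """Quantiles per forecast timestamp from errors of the same weekday and hour.

    Assumed (hypothetical) layout: cs_df has columns ds, score;
    fcst_df has columns ds and `model` (the mean prediction).
    """
    cs = cs_df.assign(day_of_week=cs_df["ds"].dt.dayofweek, hour=cs_df["ds"].dt.hour)

    # Step 1: every (weekday, hour) slot of the week must have calibration errors.
    if cs.groupby(["day_of_week", "hour"]).ngroups < 168:
        raise ValueError("calibration errors must cover all 168 hours of the week")

    # Step 2: join the errors to the forecasts on weekday and hour.
    fc = fcst_df.assign(day_of_week=fcst_df["ds"].dt.dayofweek, hour=fcst_df["ds"].dt.hour)
    merged = fc.merge(cs[["day_of_week", "hour", "score"]], on=["day_of_week", "hour"], how="left")

    # Step 3: mean - error and mean + error form the distribution around each hour.
    samples = pd.DataFrame({
        "ds": pd.concat([merged["ds"], merged["ds"]], ignore_index=True),
        "sample": pd.concat(
            [merged[model] - merged["score"], merged[model] + merged["score"]],
            ignore_index=True,
        ),
    })
    q = samples.groupby("ds")["sample"].quantile(list(quantiles)).unstack()
    q.columns = [f"{model}-q-{int(round(c * 100))}" for c in q.columns]

    return fcst_df.merge(q.reset_index(), on="ds", how="left")
```

Because the quantiles for a given forecast hour only see errors from the same weekday and hour, a conformal horizon that is not a multiple of 168 no longer mixes timeslots, at the cost of needing enough windows to cover the whole week.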
I am very curious to hear your thoughts on this.
Best regards,
Johannes
José Morales
07/01/2024, 6:06 PM