# mlforecast
👋 Not the most exciting question, but what is the heuristic that determines the minimum number of samples needed to avoid the "series is too short for window" error in LightGBMCV? It would be useful to have this in the error message if possible, as this is arguably a more important piece of information than the specific series that are too short.
Hey. Each series must have more than n_windows * h samples. If you're using `dropna=True`, this has to be true after computing the lag features and dropping the nulls they produce, so for example if you're using lag 5 you need n_windows * h + 5 samples in each series. Does that clarify the behavior?
If you're unsure how many rows the `dropna` removes, you can create an MLForecast object with the same features and run `preprocess` on a single series.
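The rule above can be sketched as a small standalone check. Note this is a hypothetical helper, not part of the mlforecast API; it just encodes the `n_windows * h + max_lag` heuristic described in this thread:

```python
def min_required_samples(n_windows: int, h: int, max_lag: int = 0) -> int:
    """Smallest series length that avoids the 'series is too short' error,
    assuming dropna=True and the largest lag feature being `max_lag`."""
    return n_windows * h + max_lag


def too_short_series(lengths: dict, n_windows: int, h: int, max_lag: int = 0) -> dict:
    """Map each offending series id to the number of samples it is missing.

    `lengths` maps series id -> number of samples in that series.
    """
    required = min_required_samples(n_windows, h, max_lag)
    return {uid: required - n for uid, n in lengths.items() if n < required}


# Example: 2 CV windows, horizon 4, lag 5 -> each series needs 13 samples.
lengths = {"A": 52, "B": 12, "C": 1}
print(too_short_series(lengths, n_windows=2, h=4, max_lag=5))  # {'B': 1, 'C': 12}
```

Returning the missing-sample counts like this is also roughly the QA output discussed below.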
yeah, makes sense 👍
We've been thinking of adding some kind of QA functions. What would be most helpful for you in this case? We'd thought of returning the ids of the series that aren't long enough. Would including the number of samples they're missing be helpful?
That would work. Getting the IDs to drop as a pre-CV step would be useful, or perhaps having the CV procedure exclude them (with a warning) as part of the fitting. Or simply having the error say something like "The following series have fewer than the required N samples: [...]" would be enough, I think.
👋 Revisiting this, but are there any recommended strategies for dealing with portfolios of time series of different lengths? In a simplified case, let's say I am a retailer with three products: one that's been on sale for 52 weeks, another that launched 12 weeks ago, and another that launched 1 week ago. One of the benefits of a global model is that you can learn expected behaviours for the newer product from the older ones (overcoming the cold start problem), but at the moment if I try to cross-validate in this scenario I get a "series too short" error.
What would you like to happen in this case? Drop the series that are too short and just CV with the rest?
👍 1
Yeah, I think that would be the desired behaviour. Throw a warning at each step, etc., but ultimately don't prevent the evaluation.
That's how I've handled it when I've built a similar model in the past anyway, and it works. Ultimately the objective of a back-test (with refit) is that I want to see how the model config would have behaved at that point in time, with the information known at that point. A product that launched last week shouldn't prevent a CV fit on data from 12 weeks ago, because 12 weeks ago that product didn't exist. I can probably work around it with a `for` loop for now, but it would be nice if it were implicit. Being able to handle cold-start forecasting would be a use case worth noting on the website with an explicit example, too.
👍 1