# mlforecast

Slackbot

10/06/2023, 2:00 PM

This message was deleted.

José Morales

10/06/2023, 4:02 PM

That's correct. Since they don't change over time they're static and since you set the dtype to category LightGBM will treat them as such when training

Jason Gofford

10/08/2023, 11:32 AM

On the static vs dynamic, it makes sense for something like "store_id" to be static because a store is a store. Repeating a static value is the only way to handle it. But for dynamic features (where my understanding of a dynamic feature is that it's something that varies over time or by date but is not related to the unique_id) wouldn't missingness and holidays be dynamic? A given date can be a holiday or not, and the date can change by year, or it might be missing or not. The convention isn't clear to me.

José Morales

10/09/2023, 3:54 PM

The definition is the following:
• Static: same value for a single id across all timestamps
• Dynamic: more than one unique value for a single id

If you have dynamic features they can also be categorical, the only difference is that you have to provide them through `X_df` when using predict, whereas static features are just repeated automatically for you. Does that make sense?
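The static/dynamic definition above can be checked mechanically. What follows is a minimal sketch in plain Python, assuming each row is a dict with `unique_id`, `ds` and `y` keys; the helper name `split_static_dynamic`, the column `store_type` and the sample rows are hypothetical and not part of mlforecast.

```python
from collections import defaultdict


def split_static_dynamic(rows, id_col="unique_id", time_col="ds", target_col="y"):
    """Split feature columns into static and dynamic ones.

    Static: exactly one distinct value per id across all timestamps.
    Dynamic: more than one distinct value for at least one id.
    """
    # column name -> id -> set of distinct values seen for that id
    values = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for col, val in row.items():
            if col in (id_col, time_col, target_col):
                continue
            values[col][row[id_col]].add(val)

    static, dynamic = [], []
    for col, per_id in values.items():
        if all(len(seen) == 1 for seen in per_id.values()):
            static.append(col)
        else:
            dynamic.append(col)
    return static, dynamic


# Hypothetical sample data: two stores, two days each.
rows = [
    {"unique_id": "store_1", "ds": "2023-12-24", "y": 10.0, "store_type": "A", "holidays": 0},
    {"unique_id": "store_1", "ds": "2023-12-25", "y": 3.0, "store_type": "A", "holidays": 1},
    {"unique_id": "store_2", "ds": "2023-12-24", "y": 7.0, "store_type": "B", "holidays": 0},
    {"unique_id": "store_2", "ds": "2023-12-25", "y": 2.0, "store_type": "B", "holidays": 1},
]

static, dynamic = split_static_dynamic(rows)
print("static:", static)
print("dynamic:", dynamic)
```

Under this definition a holiday flag ends up in the dynamic group because it takes more than one value for the same id, even though its value does not depend on the id itself.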

Jason Gofford

10/09/2023, 4:00 PM

So here, a holiday is dynamic, right? It can be 1 or 0, or many values

José Morales

10/09/2023, 4:06 PM

Yes

Jason Gofford

10/09/2023, 4:10 PM

If that's the case, how should it be passed to training? I get "can't cast to {integer or float}" errors if I don't put it as a "static feature".

José Morales

10/09/2023, 4:11 PM

Do those errors come from the model?

Jason Gofford

10/09/2023, 5:10 PM

I don't think so. It's lightgbm, so it's quite happy with categories. And if they're set as static it's fine, but if they're not then I get an error.

Jason Gofford

10/09/2023, 5:20 PM

I encounter the problem when including a categorical column (`category` type) in an input df, but not setting it as static. It should be relatively easy to reproduce with a basic lightgbm model.

José Morales

10/09/2023, 5:21 PM

Can you paste the stacktrace of the error?

Jason Gofford

10/09/2023, 5:51 PM

I'll reproduce it tomorrow

Jason Gofford

10/12/2023, 12:57 PM

I figured this out. When doing a train vs test comparison I need to pass the `test` dataset (without the static columns) to the predict method. This wasn't clear from the error message at all.

Jason Gofford

10/12/2023, 12:59 PM

The error message is raised as

-> 6178 raise KeyError(f"{not_found} not in index")
KeyError: "['holidays'] not in index"

The error could be improved to explicitly state that this is related to the `X_df` set.
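To illustrate the kind of message that would have helped here, this is a minimal sketch of a pre-prediction check. It is not mlforecast's actual implementation: the function name `check_future_features` and its arguments are hypothetical, and it only assumes that the caller knows which columns were used as dynamic features during training.

```python
def check_future_features(dynamic_features, x_df_columns=None):
    """Raise a descriptive error if future values of dynamic features are missing.

    dynamic_features: names of the dynamic features used in training.
    x_df_columns: column names of the frame passed as X_df, or None if no
        frame was passed at all.
    """
    if not dynamic_features:
        return  # nothing to validate: the model only uses static features and lags
    if x_df_columns is None:
        raise ValueError(
            f"The model was trained with dynamic features {list(dynamic_features)}. "
            "Provide their future values through X_df when calling predict."
        )
    missing = [col for col in dynamic_features if col not in x_df_columns]
    if missing:
        raise ValueError(f"X_df is missing the dynamic features {missing}.")


# Reproduces the situation from this thread: 'holidays' is dynamic
# but no X_df was passed to predict.
try:
    check_future_features(["holidays"])
except ValueError as exc:
    print(exc)
```

A check like this separates the two cases discussed in the thread (no `X_df` at all versus an `X_df` lacking a column), which a bare `KeyError` from the index lookup does not.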

José Morales

10/12/2023, 3:04 PM

Thanks for the feedback. Were you not providing `X_df` at all or was it just missing that column?

Jason Gofford

10/12/2023, 3:04 PM

not provided at all, which makes sense in hindsight but puzzled me for a while.

José Morales

10/12/2023, 4:05 PM

I think we can add some more errors in there, I'll work on that
