#general

Scottfree Analytics LLC

11/17/2023, 7:41 PM
Question on StatsForecast: I have a Pandas dataframe (did not reset_index) where I want to do an hourly forecast. For example:

unique_id, ds, y
ABC (111), 2022-11-08 080000, 41.0
ABC (111), 2022-11-08 090000, 28.0
CDE (222), 2022-11-08 080000, 50.0

There are ten unique IDs with about 47K records, and when running sf.forecast(h=forecast_period), it immediately crashes the Python kernel.

1. My first guess is that the datetime format for the hourly prediction is incorrect.
2. Second guess is that the time series could be missing one of the time stamps, for example, 080000 on one of the days, in which case it would have to handle irregular time series.

Any immediate observations or thoughts? Thank you!
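Both guesses in the question can be checked with plain pandas before any model is fit. The following is a minimal sketch, not something posted in the thread: the helper name `find_missing_hours` is hypothetical, the rows beyond the three quoted in the question are made up for illustration, and the format string assumes the timestamps really are stored as text like `2022-11-08 080000`.

```python
import pandas as pd

# Sample rows in the shape shown in the question. The first, second and
# fourth rows come from the question; the rest are hypothetical.
raw = pd.DataFrame(
    {
        "unique_id": ["ABC (111)", "ABC (111)", "ABC (111)", "CDE (222)", "CDE (222)"],
        "ds": [
            "2022-11-08 080000",
            "2022-11-08 090000",
            "2022-11-08 110000",  # hypothetical row: 10:00 is deliberately absent
            "2022-11-08 080000",
            "2022-11-08 090000",  # hypothetical row
        ],
        "y": [41.0, 28.0, 30.0, 50.0, 47.0],
    }
)

# Guess 1: parse the timestamps with an explicit format instead of relying
# on inference (assumes the HHMMSS part has no separators).
df = raw.assign(ds=pd.to_datetime(raw["ds"], format="%Y-%m-%d %H%M%S"))


def find_missing_hours(frame, id_col="unique_id", time_col="ds"):
    """Map each series id to the hourly timestamps missing between its first and last row."""
    missing = {}
    for uid, grp in frame.groupby(id_col):
        # Full hourly grid spanning this series' observed range.
        expected = pd.date_range(
            grp[time_col].min(), grp[time_col].max(), freq=pd.Timedelta(hours=1)
        )
        gaps = expected.difference(pd.DatetimeIndex(grp[time_col]))
        if len(gaps) > 0:
            missing[uid] = gaps
    return missing


# Guess 2: list any series with gaps in its hourly grid.
missing = find_missing_hours(df)
for uid, gaps in missing.items():
    print(uid, list(gaps))
```

An empty result would mean every series is a complete hourly grid, which would rule out the second guess.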

José Morales

11/17/2023, 7:59 PM
Hey. Can you provide the code that you're running? Also, are you using a Mac with an M-series chip?

Scottfree Analytics LLC

11/17/2023, 8:01 PM
Hi Jose, I’m running on Azure Databricks and running StatsForecast 1.6.0.
unique_id is object type, ds is datetime64[ns], and y is float64

José Morales

11/17/2023, 8:03 PM
Are you able to see the memory usage? It could be an out of memory error
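One quick way to get a first number for the memory question is to measure the input frame itself. This is a rough sketch on a synthetic frame of about the size described in the thread (ten hypothetical ids with 4,700 hourly rows each). It only reports the footprint of the DataFrame, not whatever the models allocate while fitting, so a small number here does not rule out an out-of-memory crash.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the data in the thread: 10 series x 4,700 hourly
# rows = 47,000 rows. The ids and values are hypothetical.
rng = np.random.default_rng(0)
hours = pd.date_range("2022-11-08", periods=4_700, freq=pd.Timedelta(hours=1))
df = pd.concat(
    [
        pd.DataFrame({"unique_id": f"ID{i}", "ds": hours, "y": rng.random(len(hours))})
        for i in range(10)
    ],
    ignore_index=True,
)

# deep=True also counts the Python string objects behind the object column.
mem_bytes = df.memory_usage(deep=True).sum()
mem_mib = mem_bytes / 1024**2
print(f"{len(df):,} rows, {mem_mib:.2f} MiB")
```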

Scottfree Analytics LLC

11/17/2023, 8:04 PM
That’s what I thought at first, but it’s only 47K rows going back a year, and I then cut it down to the past 31 days of hourly data.
Ah, I think I’ll just try exporting that data and running in a regular Jupyter notebook.
I’ll follow up with you on the actual error, but I think you’re right.
Thank you @José Morales it’s running fine in a regular Jupyter notebook. Databricks is so overrated! 😅