Nixtla Community #neural-forecast

# neural-forecast

Slackbot

12/11/2023, 2:48 PM

This message was deleted.

José Morales

12/11/2023, 5:07 PM

Hey. I think this could be due to missing dates in your training df. Can you verify this with the fill_gaps function? e.g.

from utilsforecast.preprocessing import fill_gaps

filled = fill_gaps(HCPCS_Grouped_ts_mlf, start='per_serie', end='per_serie', freq='MS')
assert filled.shape[0] == HCPCS_Grouped_ts_mlf.shape[0]

If this fails it means some dates are missing, and you could provide the filled df instead (after filling the missing target values).
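Editor's sketch: the assert above passes only when no series has a gap between its own first and last date. As a rough, pandas-only illustration of that same check (not the utilsforecast implementation), the hypothetical helper below counts the missing month-start timestamps per series. The column names unique_id, ds and y and the toy data are assumptions.

```python
import pandas as pd


def count_missing_dates(df, freq="MS", id_col="unique_id", time_col="ds"):
    """Count, per series, the timestamps missing between its first and last date."""

    def _missing(ds):
        # Every timestamp the series should contain within its own boundaries.
        expected = pd.date_range(ds.min(), ds.max(), freq=freq)
        return len(expected.difference(pd.DatetimeIndex(ds)))

    return df.groupby(id_col)[time_col].apply(_missing)


# Toy data (assumption): series "b" skips 2023-03-01.
df = pd.DataFrame(
    {
        "unique_id": ["a", "a", "a", "b", "b", "b"],
        "ds": pd.to_datetime(
            ["2023-01-01", "2023-02-01", "2023-03-01",
             "2023-01-01", "2023-02-01", "2023-04-01"]
        ),
        "y": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    }
)

missing = count_missing_dates(df)
print(missing)
```

A series with a non-zero count is one where the filled frame would have more rows than the original.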

👀 1

Brian Head

12/11/2023, 5:55 PM

I have ts2 = fill_gaps(ts2, freq='MS') in a previous step. I went ahead and ran the code above and it ran successfully. I didn't have the start and end arguments in mine, but I tested and I get the same shape either way.

José Morales

12/11/2023, 5:56 PM

Are you using 1.6.4?

Brian Head

12/11/2023, 5:57 PM

Name: neuralforecast
Version: 1.6.4
Summary: Time series forecasting suite using deep learning models
Home-page: https://github.com/Nixtla/neuralforecast/
Author: Nixtla
Author-email: business@nixtla.io
License: Apache Software License 2.0
Location: c:\programdata\miniconda3\envs\py310env\lib\site-packages
Requires: numba, numpy, optuna, pandas, pytorch-lightning, ray, torch, utilsforecast
Required-by: 
Note: you may need to restart the kernel to use updated packages.

Brian Head

12/11/2023, 5:57 PM

This is what pip show provides.

José Morales

12/11/2023, 6:00 PM

Do any of your series have fewer than 4 samples? I think the val_size=3 could be removing some series.

Brian Head

12/11/2023, 6:01 PM

Smallest number of observations in any of the series is 48

Brian Head

12/11/2023, 6:01 PM

Is that what you meant?

Brian Head

12/11/2023, 6:02 PM

I also tried dropping val_size to 1 and still get same issue

José Morales

12/11/2023, 6:05 PM

The error happens here, right?

Brian Head

12/11/2023, 6:11 PM

I may not be following what you mean. That link takes me to around line 614 of the core code. I'm not sure whether that's it. The problem arises at this line of my code:

forecasts_nf_df_fits = nf.predict_insample(step_size=1)

José Morales

12/11/2023, 6:12 PM

But you should see that line in your stacktrace. Can you paste it here?

Brian Head

12/11/2023, 6:19 PM

ValueError                                Traceback (most recent call last)
Cell In[182], line 1
----> 1 forecasts_nf_df_fits = nf.predict_insample(step_size=1)

File C:\ProgramData\miniconda3\envs\py310env\lib\site-packages\neuralforecast\core.py:622, in NeuralForecast.predict_insample(self, step_size)
    620 # Append predictions in memory placeholder
    621 output_length = len(model.loss.output_names)
--> 622 fcsts[:, col_idx : (col_idx + output_length)] = model_fcsts
    623 col_idx += output_length
    624 model.set_test_size(test_size=test_size)  # Set original test_size

ValueError: could not broadcast input array from shape (159510,9) into shape (158616,9)
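Editor's sketch: the ValueError above is NumPy's generic complaint when an array is assigned into a slice with a different number of rows. A minimal reproduction with small, made-up shapes (the variable names mirror the traceback, the sizes are assumptions) looks like this:

```python
import numpy as np

# Placeholder sized for the rows the caller expects (8 here, 158616 in the traceback).
fcsts = np.full((8, 9), np.nan)

# Model output with more rows than the placeholder (10 here, 159510 in the traceback).
model_fcsts = np.zeros((10, 9))

try:
    fcsts[:, 0:9] = model_fcsts
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The assignment fails because the row counts differ, so the question in the thread is why the model produces more rows than the placeholder was built for.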

Brian Head

12/11/2023, 6:19 PM

Looks like you were correct

José Morales

12/11/2023, 6:20 PM

They're different numbers now. Is it because of the val_size?

José Morales

12/11/2023, 6:21 PM

Pinging @Cristian (Nixtla) in case you may have an idea of what's going on because apart from the dates I don't know where this shape mismatch could be coming from

Brian Head

12/11/2023, 6:23 PM

The initial size from my initial post was the full dataframe. I subsampled 10% to speed up for testing. I've also played with the val_size to see if it mattered. I now have it back to val_size=3.

José Morales

12/11/2023, 6:27 PM

But the numbers changed a bit (not 90% less):
• original msg: (159120,9) into shape (158232,9)
• latest err: (159510,9) into shape (158616,9)
I'm just trying to figure out where the difference comes from, that might help us track where the mismatch happens.
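Editor's sketch: the arithmetic behind the two error messages can be written out as below. It only restates the numbers quoted in the thread; the labels model_output and placeholder are my own names for the two shapes in each message.

```python
# Row counts quoted in the two error messages.
original = {"model_output": 159120, "placeholder": 158232}
latest = {"model_output": 159510, "placeholder": 158616}

# Surplus rows returned by the model in each run.
gap_original = original["model_output"] - original["placeholder"]
gap_latest = latest["model_output"] - latest["placeholder"]

# How much each side grew between the two runs.
growth_output = latest["model_output"] - original["model_output"]
growth_placeholder = latest["placeholder"] - original["placeholder"]

print(gap_original, gap_latest, growth_output, growth_placeholder)
```

The surplus is 888 rows in the first run and 894 in the second, and between runs the model output grew by 390 rows while the placeholder grew by 384.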

José Morales

12/11/2023, 6:27 PM

Do you have 130 ids?

Brian Head

12/11/2023, 6:32 PM

Thanks, @José Morales. Appreciate all of the help. I'll get back with you later this afternoon with that. I got pulled into a meeting.

Cristian (Nixtla)

12/11/2023, 7:57 PM

Hi @Brian Head. For some reason the model is returning more forecasts than needed, but I don't know why that is the case yet. To pinpoint the issue I suggest simplifying your pipeline to the absolute minimum, then adding components back one at a time to see where it fails. For example:
1. Start only with the NHITS, no historic variables nor future variables. Maybe even only keeping 1 time series.
2. Add all series.
3. Add exogenous covariates.
4. Include more models.
And let us know where it fails.

Brian Head

12/11/2023, 8:03 PM

Thank you for that suggestion, @Cristian (Nixtla). I will work through that. @José Morales it looks like the difference in numbers is because I updated my code to use

filled = fill_gaps(HCPCS_Grouped_ts_mlf, start='per_serie', end='per_serie', freq='MS')

which returns a different DF shape. I was following this. Now I'm wondering whether the first part of what you sent (including the start and end for the fill_gaps function) was only for testing, or whether I should use it in my pipeline (i.e., which should I use)? I still get the error either way, but I want to make sure I'm getting the right gaps filled. BTW, I also checked and it is filling gaps with both approaches, but more with the version you provided.

José Morales

12/11/2023, 8:45 PM

It depends. Sometimes you want the series to start at the same time, or end at the same time. In this case we want to keep the boundaries but make sure there aren't any missing dates between the start and end. Also make sure you fill the target with appropriate values, since that function just adds the missing rows with NaN in the target.
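Editor's sketch: the two steps described above (add the missing rows inside each series' own boundaries, then fill the target) can be emulated with plain pandas as follows. This is an illustration of the idea, not the fill_gaps implementation. The helper name fill_within_bounds, the column names and the toy data are assumptions, and filling with 0.0 is only one possible choice (forward fill or interpolation may suit other data better).

```python
import pandas as pd


def fill_within_bounds(df, freq="MS", id_col="unique_id", time_col="ds"):
    """Add the rows missing between each series' first and last date (target left as NaN)."""
    pieces = []
    for uid, grp in df.groupby(id_col):
        full = pd.date_range(grp[time_col].min(), grp[time_col].max(), freq=freq)
        filled = (
            grp.set_index(time_col)
            .reindex(full)          # new rows get NaN in every column
            .rename_axis(time_col)
            .reset_index()
        )
        filled[id_col] = uid        # restore the id on the new rows
        pieces.append(filled)
    return pd.concat(pieces, ignore_index=True)


# Toy data (assumption): series "a" skips 2023-03-01, series "b" has no gaps.
df = pd.DataFrame(
    {
        "unique_id": ["a", "a", "a", "b", "b"],
        "ds": pd.to_datetime(
            ["2023-01-01", "2023-02-01", "2023-04-01", "2023-03-01", "2023-04-01"]
        ),
        "y": [1.0, 2.0, 4.0, 10.0, 11.0],
    }
)

filled = fill_within_bounds(df)
n_new_rows = int(filled["y"].isna().sum())  # rows that were added
filled["y"] = filled["y"].fillna(0.0)       # fill value is a modelling choice
print(filled)
```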

Brian Head

12/11/2023, 8:51 PM

Gotcha. Yeah, I have a mix of start dates. They should all have the same end date. So, in that case would I use only the end? I don't see this covered in the documentation, but maybe I missed it. And, when I had this working it was actually on a different, albeit related dataframe, so that might also be why I haven't experienced this problem with any of SF, MLF, or NF yet. Maybe there's something going on in the data with these that wasn't with the other DF. I am filling the gaps created with fill_gaps by using a .fillna after.

José Morales

12/11/2023, 8:56 PM

I believe all cases are covered here. If you want the same end you can set end='global'.

👍 1
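Editor's sketch: to see why the start and end options change the shape of the filled frame, the hypothetical helper below counts how many rows each combination of 'per_serie' and 'global' boundaries would imply. It uses plain pandas and only the two option values mentioned in the thread. The helper name, column names and toy data are assumptions.

```python
import pandas as pd


def expected_rows(df, start, end, freq="MS", id_col="unique_id", time_col="ds"):
    """Rows implied by filling every series from `start` to `end` ('per_serie' or 'global')."""
    bounds = df.groupby(id_col)[time_col].agg(["min", "max"])
    total = 0
    for first, last in zip(bounds["min"], bounds["max"]):
        lo = first if start == "per_serie" else bounds["min"].min()
        hi = last if end == "per_serie" else bounds["max"].max()
        total += len(pd.date_range(lo, hi, freq=freq))
    return total


# Toy data (assumption): "a" covers Jan-Jun 2023, "b" covers Apr-May 2023.
df = pd.DataFrame(
    {
        "unique_id": ["a"] * 6 + ["b"] * 2,
        "ds": list(pd.date_range("2023-01-01", periods=6, freq="MS"))
        + list(pd.date_range("2023-04-01", periods=2, freq="MS")),
    }
)

rows_own_bounds = expected_rows(df, start="per_serie", end="per_serie")
rows_global_end = expected_rows(df, start="per_serie", end="global")
rows_global_both = expected_rows(df, start="global", end="global")
print(rows_own_bounds, rows_global_end, rows_global_both)
```

With a global end, the short series is extended forward to the latest date. With a global start as well, it is also padded backward to the earliest date.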

José Morales

12/11/2023, 9:17 PM

Can you try with the dev version of neuralforecast? We recently changed that function and that could fix the error.

pip install git+https://github.com/nixtla/neuralforecast.git

José Morales

12/11/2023, 9:20 PM

Otherwise it may be more efficient if you can just give us the sizes of your series so that we try to replicate it on our side

Brian Head

12/12/2023, 2:45 PM

Following your (@Cristian (Nixtla) & @José Morales) advice, I think I've narrowed the issue down to what is happening with the fill_gaps function prior to model fit and predictions. In a previous run (using similar data on a slightly older version of the Nixtla packages) I used the default settings of fill_gaps, not realizing there were options for the start and end dates of filling. That worked correctly. However, it wasn't working now, so I played with the options for start and end. The only way any of the models I've tried (e.g., NBEATS, NBEATSx, RNN, DilatedRNN, NHITS, LSTM, MLP) will successfully run predict_insample is when I set both start and end to 'global'. However, this produces odd results for some of the series that have a later start date: at the beginning of the series they have a major spike in the insample predictions when nothing actually occurred there. Is there any workaround for this?

Note: Here are the current versions of the packages I'm using:
• Neuralforecast 1.6.4
• Statsforecast 1.6.0 (for deriving season and trend)
• Utilsforecast 0.0.21
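Editor's sketch: no workaround is confirmed in this thread. One way to at least inspect the problem is to flag the in-sample rows that fall before each series' first real observation, i.e. the rows that exist only because of the global start. The frames below are toy stand-ins: the insample frame, its NHITS column and all values are hypothetical, and this only hides the padded rows from inspection, it does not change what the model was trained on.

```python
import pandas as pd

# Original (unpadded) data: "b" really starts in April 2023 (toy values, assumption).
original = pd.DataFrame(
    {
        "unique_id": ["a"] * 6 + ["b"] * 3,
        "ds": list(pd.date_range("2023-01-01", periods=6, freq="MS"))
        + list(pd.date_range("2023-04-01", periods=3, freq="MS")),
    }
)

# Hypothetical in-sample predictions on the globally padded grid (Jan-Jun for both ids).
grid = pd.date_range("2023-01-01", periods=6, freq="MS")
insample = pd.DataFrame(
    {
        "unique_id": ["a"] * 6 + ["b"] * 6,
        "ds": list(grid) * 2,
        "NHITS": [1.0] * 12,  # placeholder prediction values
    }
)

# First real date per series, then flag rows that precede it.
first_seen = original.groupby("unique_id")["ds"].min().rename("first_ds").reset_index()
marked = insample.merge(first_seen, on="unique_id", how="left")
marked["padded"] = marked["ds"] < marked["first_ds"]

# Keep only rows backed by real history.
trimmed = marked.loc[~marked["padded"], ["unique_id", "ds", "NHITS"]]
print(marked["padded"].sum(), len(trimmed))
```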

José Morales

12/12/2023, 3:58 PM

Were you able to install neuralforecast from github? We're not sure if it's a bug in the previous implementation or in the development one

José Morales

12/12/2023, 4:12 PM

Never mind, the error is in both versions. We're looking into it. Thanks for reporting it!

Brian Head

12/12/2023, 4:13 PM

Great. Thanks, @José Morales!
