Francisco Trejo
05/08/2023, 11:42 PM
Pascal Schindler
05/09/2023, 9:29 AM
IntegerGreaterThan(lower_bound=0)
with the following config: what are the reasons?
horizon = 60
config_nhits = {
"input_size": tune.choice([14 ,28, 28*2, 28*3, 28*5, 2 * horizon]), # Length of input window
"n_blocks": 5*[1], # Length of input window
"mlp_units": 5 * [[512, 512]], # Length of input window
"interpolation_mode": tune.choice(['linear']),
"n_pool_kernel_size": tune.choice([5*[1], 5*[2], 5*[4],
[8, 4, 2, 1, 1], [16, 8, 1]]), # MaxPooling Kernel size
"n_freq_downsample": tune.choice([[8, 4, 2, 1, 1],
[1, 1, 1, 1, 1],
[168, 24, 1],
[24, 12, 1],
[1, 1, 1]]), # Interpolation expressivity ratios
"learning_rate": tune.loguniform(1e-4, 1e-2), # Initial Learning rate
"scaler_type": tune.choice([None]), # Scaler type
"max_steps": tune.choice([1000]), # Max number of training iterations
"batch_size": tune.choice([16, 32, 64, 128, 256, 512]), # Number of series in batch
"windows_batch_size": tune.choice([32, 64, 128, 256, 512, 1024, 2048]), # Number of windows in batch
"random_seed": tune.randint(1, 20),
"scaler_type": tune.choice(["robust", None]), # Random seed
"hist_exog_list": ["week_day", "month", "trends"],
"futr_exog_list": ["week_day", "month"]
}
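One pitfall worth flagging with dict-based search spaces like the one above: Python keeps only the last value for a repeated key, so a stray duplicate entry (e.g. two `scaler_type` lines) silently changes the search space without any error. A minimal sketch:

```python
# Python dict literals silently keep the last value for a repeated key,
# so a duplicated config entry overrides the earlier one with no warning.
config = {
    "scaler_type": None,      # first definition
    "max_steps": 1000,
    "scaler_type": "robust",  # overrides the line above silently
}
print(config["scaler_type"])  # -> robust
print(len(config))            # -> 2, not 3
```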
Kevin
05/09/2023, 3:58 PM
Chris Gervais
05/11/2023, 2:41 PM
Tyler Nisonoff
05/13/2023, 12:58 PM
nf.predict(futr_df=<day-to-pred>)
However, it seems that this always returns a dataframe with a ds column with just the next 24 hours after where I stopped training.
Is there some way to apply the model to the next N days without retraining it every time? Or would I have to retrain/finetune on the data since then? Perhaps the latter is the only way to support historical features?
marah othman
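On the question above, one pattern worth sketching (assuming, as in recent neuralforecast versions, that `predict()` forecasts h steps past the end of whatever history it is given via its `df` argument): extend the history dataframe up to the day you want and supply a matching future frame, rather than retraining. `make_futr_df` here is a hypothetical helper, not part of the library; the columns mirror the `futr_exog_list` used earlier in the thread:

```python
import pandas as pd

def make_futr_df(last_ds: pd.Timestamp, horizon: int, freq: str, uid: str) -> pd.DataFrame:
    """Future calendar-feature rows for the `horizon` steps after `last_ds`."""
    ds = pd.date_range(last_ds, periods=horizon + 1, freq=freq)[1:]
    return pd.DataFrame({
        "unique_id": uid,          # scalar broadcast to every row
        "ds": ds,
        "week_day": ds.dayofweek,  # same futr_exog_list as in this thread
        "month": ds.month,
    })

futr = make_futr_df(pd.Timestamp("2023-05-13 23:00"), horizon=24, freq="H", uid="series_1")
# hedged usage, assuming predict() accepts fresh history:
# nf.predict(df=history_up_to_cutoff, futr_df=futr)
```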
05/15/2023, 2:36 PM
Shreya Mathur
05/15/2023, 3:35 PM
Gerrit Rindermann
05/15/2023, 7:15 PM
Tyler Nisonoff
05/16/2023, 9:40 PM
t-14
as i do not yet have prices from 11am-midnight
I could extend the horizon to be 24+14
to predict 10am -> 12am the next day, but then in training / fit we'd step by the horizon, instead of just stepping forward 10 days.
Is there some way to fit this into the neuralforecast approach?
Pascal Schindler
05/17/2023, 4:50 PM
Nakul Upadhya
05/17/2023, 8:54 PM
Chris Gervais
05/18/2023, 10:43 AM
Dawie van Lill
05/30/2023, 9:52 AM
mlp_config = {
"input_size": tune.choice([2, 4, 12, 20]),
"hidden_size": tune.choice([256, 512, 1024]),
"num_layers": tune.randint(2, 6),
"learning_rate": tune.loguniform(1e-4, 1e-1),
"batch_size": tune.choice([32, 64, 128, 256]),
"windows_batch_size": tune.choice([128, 256, 512, 1024]),
"random_seed": tune.randint(1, 20),
"hist_exog_list": tune.choice([pcc_list]),
"futr_exog_list": tune.choice([fcc_list]),
"max_steps": tune.choice([500, 1000]),
"scaler_type": tune.choice(["robust"]),
}
The hist_exog_list and futr_exog_list are lists of historical and future exogenous variables. When I fit the model I use nf.fit(df=df, val_size=20), where df is my dataframe that contains the target variable as well as the exogenous variables. Then when I predict I use nf.predict(futr_df=futr_df), where futr_df contains only the observations of the future exogenous variables that extend beyond the time period of the point of prediction.
Does this seem correct? Or am I doing something wrong in the specification of the future exogenous variables? In my case there is only one period beyond the cut-off for the target where future exogenous variables are available, so the futr_df dataframe only has one row and many columns (for the different features).
marah othman
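On the one-row futr_df question: as a rule the future frame is expected to cover the full horizon, one row per series per future timestamp, so a single row with h > 1 is a likely source of trouble. A small sanity check before calling predict (a sketch; `check_futr_df` is a hypothetical helper, not part of the library):

```python
import pandas as pd

def check_futr_df(futr_df: pd.DataFrame, h: int) -> None:
    """Raise if any series has a number of future rows different from h."""
    counts = futr_df.groupby("unique_id")["ds"].count()
    bad = counts[counts != h]
    if not bad.empty:
        raise ValueError(f"futr_df needs h={h} rows per series, got {bad.to_dict()}")

futr_df = pd.DataFrame({
    "unique_id": ["a"] * 3,
    "ds": pd.date_range("2023-06-01", periods=3, freq="D"),
    "x": [1.0, 2.0, 3.0],   # a future exogenous feature
})
check_futr_df(futr_df, h=3)  # passes silently
```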
05/30/2023, 2:24 PM
Nasreddine D
05/30/2023, 3:08 PM
Francisco Trejo
05/31/2023, 6:23 PM
Syed Umair Hassan
06/01/2023, 8:20 AM
Yang Guo
06/07/2023, 12:11 AM
Syed Umair Hassan
06/10/2023, 7:09 PM
Manuel Chabier Escolá Pérez
06/11/2023, 3:23 PM
marah othman
06/12/2023, 1:27 PM
Kaustav Chaudhury
06/13/2023, 10:27 AM
Patrick
06/13/2023, 12:25 PM
Patrick
06/13/2023, 12:25 PM
Yang Guo
06/13/2023, 5:04 PM
Dawie van Lill
06/13/2023, 8:40 PM
06/13/2023, 8:40 PMRuntimeError: einsum(): subscript p has size 2 for operand 1 which does not broadcast with previously seen size 0
. This only happens with the AutoNBEATS and AutoNBEATSx models. Other models run fine. The code that I have for AutoNBEATS worked perfectly fine until about a week ago.Syed Umair Hassan
06/14/2023, 11:55 AM
Viet Yen Nguyen
06/15/2023, 11:36 AM
Yang Guo
06/15/2023, 2:39 PM
input_size, but asked to make predictions on data of length h.
The goal is to evaluate over randomly chosen windows on the validation dataset. Currently I am thinking of doing the resampling myself, but I am wondering if there is an in-built feature for in-sample evaluation. I think predict_insample might be doing this, but I am confused about what the output of predict_insample is. Does it only contain the predicted values? Is it a fair metric to simply compute the accuracy of predict_insample against the original df as a measurement of the method?
Aditya Limaye
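On evaluating over randomly chosen windows: if predict_insample turns out not to fit, the manual resampling idea can be sketched in a few lines of NumPy. `model_predict` here is a hypothetical stand-in for whatever maps an input window to a forecast, not a library function:

```python
import numpy as np

def random_window_eval(y, input_size, h, n_windows, model_predict, seed=0):
    """Mean MAE over randomly sampled (input_size, h) windows of series y."""
    rng = np.random.default_rng(seed)
    losses = []
    last_start = len(y) - input_size - h
    for start in rng.integers(0, last_start + 1, size=n_windows):
        context = y[start : start + input_size]                       # model input
        target = y[start + input_size : start + input_size + h]       # ground truth
        forecast = model_predict(context)
        losses.append(np.mean(np.abs(forecast - target)))             # MAE per window
    return float(np.mean(losses))

# usage with a naive "repeat last value" stand-in model:
y = np.arange(100, dtype=float)
mae = random_window_eval(y, input_size=14, h=6, n_windows=8,
                         model_predict=lambda ctx: np.repeat(ctx[-1], 6))
# for this linear series the naive model's MAE is (1+2+...+6)/6 = 3.5
```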
06/15/2023, 4:30 PM
n_freq_downsample.
I noticed in the AutoNHITS default_config definition there is a tune.choice over the following values:
"n_freq_downsample": tune.choice(
[
[168, 24, 1],
[24, 12, 1],
[180, 60, 1],
[60, 8, 1],
[40, 20, 1],
[1, 1, 1],
]
),
Do you all have any intuition about whether lining up these frequencies with known natural frequencies of the data is useful for performance? For example, [168, 24, 1] seems to correspond to weekly (24 x 7), daily (24 x 1), and hourly frequencies.
The reason I ask is as follows: let's say I have an NHITS model that predicts hourly-sampled data, and I find through the course of hyperparameter optimization that n_freq_downsample=[168, 24, 1] is most performant. If I were then to train a model that predicts the same series, but now sampled at 10-minute frequency (6 samples per hour), should I then change my hyperparameter search space to include a choice for n_freq_downsample = [168*6, 24*6, 6]?
Any insight you might have would be appreciated - thanks in advance!
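For the rescaling part of the question, the arithmetic is just multiplying each downsample factor by the new samples-per-hour, so each stack keeps pointing at the same physical period (weekly, daily, per-hour); whether this actually improves performance is an empirical question best confirmed in the search itself:

```python
# Rescale n_freq_downsample when moving from hourly to 10-minute sampling.
# Each factor is in units of time steps, so multiplying by samples-per-hour
# preserves the physical period each stack attends to.
samples_per_hour = 6                 # 10-minute data: 6 samples per hour
hourly_best = [168, 24, 1]           # weekly, daily, hourly at hourly sampling
rescaled = [f * samples_per_hour for f in hourly_best]
print(rescaled)  # -> [1008, 144, 6], i.e. [168*6, 24*6, 6] as proposed above
```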