# neural-forecast
z
Hey all, I'm interested in doing multivariate->multivariate forecasting, at least at first primarily with TSMixerx. I saw a thread that provided some advice (removing insample_y), but I assume there's a decent amount on the nf/dataset/loss side that would need to be modified to support multiple targets as well. I'd love to hear how difficult you think this modification would be, and any tips for diving into it.
o
If the targets can be structured as time series, I don't see why any modifications would be necessary. Just choose n_series=n_targets, and for each series make sure you have all the additional information you want (hist_exog, stat_exog, futr_exog).
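For example, a minimal long-format sketch (the column and series names here are made up; two targets means n_series=2):
import pandas as pd

# Two target series in long format: one row per (target, timestamp).
# Exogenous variables are just extra columns repeated on every row.
df = pd.DataFrame({
    "unique_id": ["target_0"] * 3 + ["target_1"] * 3,
    "ds": list(pd.date_range("2024-01-01", periods=3, freq="h")) * 2,
    "y": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
    "hist_feature": [0.1, 0.2, 0.3, 0.1, 0.2, 0.3],  # would go in hist_exog_list
})

n_series = df["unique_id"].nunique()  # -> 2, i.e. n_series = n_targets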
z
Oh awesome, how should the input data be shaped in that case? I should mention I'm also doing prediction with multiple time series (each of which is multivariate) but I've so far used n_series=1 as I want to train a model that can predict entirely new series (not 100% sure this is correct?). Thanks for the quick response, awesome library!
o
You could try structuring your data like in this tutorial, which does what you want: https://nixtlaverse.nixtla.io/neuralforecast/docs/tutorials/multivariate_tsmixer.html
(and then use TSMixerx, of course)
z
I've given this a shot but I'm getting an error partway through the first epoch in fit. Specifically: RuntimeError: mat1 and mat2 shapes cannot be multiplied (768x4071 and 4761x64). Reproducible from:
import pandas as pd

from neuralforecast.models.tsmixerx import TSMixerx
from neuralforecast import NeuralForecast


nf = NeuralForecast([TSMixerx(
    h=24,
    input_size=48,
    n_series=69,
    revin=False,
    batch_size=32,
    hist_exog_list=[f"ex_{i}" for i in range(68)]  # ex_0 through ex_67
)], freq="h",)

nf.fit(
    pd.read_parquet("anon_frame.parquet")
)
Where anon_frame.parquet is: https://drive.google.com/file/d/128OgJqbbWZGeOoW1DcBInBEykaO9zCIK/view?usp=sharing. I traced this back to train_step getting a batch arg with a reduced n_series dimension, but I'm not sure why that's happening.
o
Your n_series is 10 in that dataset, not 69.
n_series is the number of unique_ids; please follow the tutorial and structure the data exactly as it's done there.
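A quick way to check (assuming the frame already has the standard unique_id column, since fit accepted it):
import pandas as pd

df = pd.read_parquet("anon_frame.parquet")
# The multivariate models expect n_series to equal this count
print(df["unique_id"].nunique())  # 10 here, not 69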
z
Right, but that tutorial isn't what I want to do. It might be easier to understand with context: suppose this is an industrial process. Each unique_id is a run of the process, which produces a large number of series. I want to train a model that, given part of a single run, forecasts all of these input series. There are an unlimited number of possible runs, all of which are different, and a fixed number of series that contain data about the process that I want to forecast (69 in this case).
o
Ok - if you want to forecast those 69 series, make sure they are the unique_ids?
z
Then how can I distinguish between runs?
They may overlap in timestamps and should never be considered part of the same input
o
Here's something that works - we drop the duplicate timestamps and add the run_id as an exogenous feature. That way, you can get predictions conditioned on the assigned run_id. You could vary which run_ids' timestamps you drop, of course.
import pandas as pd

from neuralforecast import NeuralForecast
from neuralforecast.models.tsmixerx import TSMixerx

# Wide frame: 69 value columns plus the run identifier and timestamp
df = pd.read_parquet("tmp/anon_frame.parquet").reset_index(drop=True)
column_names = [f"series_{i}" for i in range(69)] + ["run_id", "ds"]
df.columns = column_names

# Reshape to long format: one row per (run_id, ds, series), value in y
df = df.set_index(["run_id", "ds"]).stack().reset_index()
df.columns = ["run_id", "ds", "unique_id", "y"]
df["run_id"] = df["run_id"].astype(int)
df = df.sort_values(by=["run_id", "unique_id", "ds"]).reset_index(drop=True)

# Drop overlapping timestamps across runs, keeping the first occurrence
df = df.drop_duplicates(subset=["unique_id", "ds"])

# Hold out the last 24 observations of every series for prediction
df_test = df.groupby(["unique_id"]).tail(24)
df_train = df.drop(df_test.index).reset_index(drop=True)
df_test = df_test.reset_index(drop=True)

nf = NeuralForecast([TSMixerx(
    h=24,
    input_size=48,
    n_series=69,
    revin=True,
    batch_size=69,
    futr_exog_list=["run_id"],
    max_steps=10,
)], freq="h")

nf.fit(df_train)
df_pred = nf.predict(futr_df=df_test.drop(["y"], axis=1))
Note: there seems to be a bug in multivariate models that requires batch_size to be at least n_series; I have to investigate that further. Edit: the bug is present, and we addressed it in a PR.
Another option for you is to use a univariate model where unique_id is a concatenation of the series id and run_id. I think you'd get similar results.
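A rough sketch of that concatenation, starting from the long-format df built above (before the drop_duplicates step) and using NHITS purely as an example univariate model:
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# Each (run, series) pair becomes its own univariate series
df["unique_id"] = df["run_id"].astype(str) + "_" + df["unique_id"].astype(str)
df = df.drop(columns=["run_id"])

nf_uni = NeuralForecast([NHITS(h=24, input_size=48, max_steps=10)], freq="h")
nf_uni.fit(df)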
z
Thanks for looking into this! Ideally I don't want the run ID to be observable to the model, as it'll always be tested/used on entirely unseen runs; I'm not sure whether that would actually have an adverse impact, but I'm a little wary of it. I don't want to use a univariate model because I think there's a lot of information to be gained from training (via dropout) on A->A,B,C; B->A,B,C; A,B->A,B,C; etc.
o
Ok, then I'd just drop the run_id, i.e. comment futr_exog_list out and drop futr_df from the predict call.
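For completeness, that change might look like this (reusing df_train from the snippet above and dropping the run_id column from the frame, since nothing references it anymore):
nf = NeuralForecast([TSMixerx(
    h=24,
    input_size=48,
    n_series=69,
    revin=True,
    batch_size=69,
    # futr_exog_list=["run_id"],  # run_id no longer visible to the model
    max_steps=10,
)], freq="h")

nf.fit(df_train.drop(columns=["run_id"]))
df_pred = nf.predict()  # no futr_df needed without future exogenous features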