After reading the cross-learning paper, I am even more confused. My current understanding is that, given multiple series {x}, {y}, {z}, a traditional TS model builds three independent models, whereas CL builds a single model for the multivariate sequence {(x,y,z)}. However, based on your description, neuralforecast builds a single model for ({x},{y},{z}).join(). Could you please clarify which of these is correct?
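To make the three cases concrete, here is a rough sketch of how I picture the training data in each one. It uses only NumPy and toy data; the helper `windows`, the variable names, and `input_size = 3` are my own illustrations, not anything taken from neuralforecast, and case (c) is only my guess at what ({x},{y},{z}).join() means in practice.

```python
import numpy as np

def windows(series, input_size):
    """Slice one series into (lagged inputs, next value) training pairs."""
    s = np.asarray(series, dtype=float)
    X = np.stack([s[i:i + input_size] for i in range(len(s) - input_size)])
    y = s[input_size:]
    return X, y

# Toy series (hypothetical data, 10 time steps each).
x = np.arange(0, 10)
y = np.arange(100, 110)
z = np.arange(200, 210)
input_size = 3

# (a) Traditional: three separate training sets, one model per series.
local_sets = {name: windows(s, input_size)
              for name, s in {"x": x, "y": y, "z": z}.items()}

# (b) What I assumed CL does: one multivariate sequence {(x, y, z)}.
# Every training row contains all three series at the same timestamps.
multi = np.stack([x, y, z], axis=1)               # shape (10, 3)
X_multi = np.stack([multi[i:i + input_size].ravel()
                    for i in range(len(multi) - input_size)])
Y_multi = multi[input_size:]                      # next value of all three

# (c) My guess at the "join" case: windows from every series are stacked
# as rows of one training set. One model, but each row sees one series only.
X_pool = np.concatenate([windows(s, input_size)[0] for s in (x, y, z)])
y_pool = np.concatenate([windows(s, input_size)[1] for s in (x, y, z)])

print(X_multi.shape, X_pool.shape)
```

In (b) a single input row mixes x, y and z at the same timestamps, so the model can see their correlation directly. In (c) no row ever contains values from two different series, which is why I do not see where the cross-series correlation would come from.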
Also, could you please point me to the specific place where 'unique id' is used? I am not sure how the code exploits the correlation between multiple series at the same timestamp.