Has anyone benchmarked NHiTS performance on 1 vs. multiple GPUs? I am comparing runtimes on an EC2 g3.4xlarge (1 GPU) and a g3.8xlarge (2 GPUs), and the runtime actually increases slightly with more GPUs!
I am testing on a single time series right now. Could it be that training on only one series is not parallelized across GPUs (i.e., I would need more series to see a speedup)? I also first ran this in a Jupyter notebook and suspected notebooks might have an issue with multiple GPUs, but I see similar results from a plain Python script.
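For what it's worth, my working assumption (not verified against the NHiTS internals) is that PyTorch-style data parallelism splits each batch across GPUs, so with one short series the per-GPU shard is tiny and the fixed synchronization cost per step dominates. A toy cost model, with made-up constants purely for illustration:

```python
def step_time(batch_size, n_gpus, per_sample_ms=1.0, sync_ms=5.0):
    """Rough model of one training step under data parallelism:
    compute time on the largest per-GPU shard, plus a fixed
    gradient-synchronization cost when more than one GPU is used.
    The millisecond constants are invented, not measured."""
    shard = -(-batch_size // n_gpus)  # ceil division: largest shard
    sync = sync_ms if n_gpus > 1 else 0.0
    return shard * per_sample_ms + sync

# Small batch (one short series): 2 GPUs come out slower than 1.
print(step_time(8, 1))   # 8.0
print(step_time(8, 2))   # 9.0
# Large batch (many series): 2 GPUs win despite the sync cost.
print(step_time(1024, 1))  # 1024.0
print(step_time(1024, 2))  # 517.0
```

If that model is roughly right, it would explain both observations: adding a GPU adds overhead without adding useful parallel work until the batch (i.e., the number of series/windows) is large enough.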