Manuel Chabier Escolá Pérez

07/18/2023, 2:48 PM
Hi all! I am new to working with GPUs and I have a problem I am not sure how to solve. I am using the basic GPU in Google Colab, but when I try to train an NHITS model I get this error:

    OutOfMemoryError: CUDA out of memory. Tried to allocate 10.81 GiB (GPU 0; 14.75 GiB total capacity; 11.04 GiB already allocated; 2.83 GiB free; 11.05 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

df_train is a dataset with 15T frequency and a shape of (55969, 47). The 47 columns already include unique_id, ds and y. My code is as follows:

    horizon = 48*4 # day-ahead daily forecast
    futr_exog_list = list(df_train.columns[8:])
    hist_exog_list = ['POA_Irradiance_SF_log', 'GHI_Irradiance_SF_log', 'Ambient_Temperature_SF_log', 'Wind_Speed_SF_log', 'Module Temperature_SF_log']
    models = [NHITS(h = horizon,
                    input_size = 5*horizon,
                    futr_exog_list = futr_exog_list, # <- Future exogenous variables
                    hist_exog_list = hist_exog_list, # <- Historical exogenous variables
                    # scaler_type = 'robust' # <- This is not necessary since the dataframe is already scaled
                    )]

I try to train the model as follows (and this is where the error is raised):

    %%capture
    nf = NeuralForecast(models=models, freq='15T')

Is there an easy way to solve this problem, considering that the dataset has a shape of only (55969, 47)? Thank you very much for your time!

Cristian (Nixtla)

07/18/2023, 4:16 PM
Hi @Manuel Chabier Escolá Pérez! Can you remove the line with %%capture so that it prints where it is running out of memory? The problem is most likely that the horizon is large (192) and the input_size is 5 times that, plus all the exogenous variables (most of the 47 columns) for the same window size. Some suggestions:
• Reduce the input size (to 1, 2, or 3 times the horizon).
• Reduce the
, and
.
• Make your dataset smaller, either by removing the less informative exogenous variables or by using only the latest data.

The GPUs in Google Colab are usually small, with limited memory, so you will be limited in the size of the model and the data. Ideally, use a larger GPU; we use AWS EC2 instances. Otherwise, iterate: keep reducing the hyperparameters above and the size of the data until the model fits in the memory of your GPU.
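
To get a feel for why the input size matters so much here, the following is a rough back-of-the-envelope sketch, not a description of what NeuralForecast does internally. It assumes that every sliding window of length input_size + horizon is materialized at once as float32 values across all columns. The helper name `window_tensor_gib` and its parameters are hypothetical, introduced only for this illustration.

```python
def window_tensor_gib(n_rows, horizon, input_multiple, n_features, bytes_per_value=4):
    """Rough size in GiB of a tensor holding every sliding window.

    Assumption (not confirmed by the thread): all windows of length
    input_size + horizon are held in memory at once as float32.
    """
    input_size = input_multiple * horizon
    window_len = input_size + horizon
    # Number of full windows that fit in a single series of n_rows steps.
    n_windows = max(n_rows - window_len + 1, 0)
    return n_windows * window_len * n_features * bytes_per_value / 2**30


# Values taken from the question: 55969 rows, 47 columns, horizon of 48*4.
N_ROWS, N_COLS, HORIZON = 55969, 47, 48 * 4

estimates = {
    multiple: window_tensor_gib(N_ROWS, HORIZON, multiple, N_COLS)
    for multiple in (5, 3, 2, 1)
}

for multiple, gib in estimates.items():
    print(f"input_size = {multiple} * horizon -> about {gib:.1f} GiB")
```

Under these assumptions the estimate shrinks roughly in proportion to input_size + horizon and to the number of columns kept, which is why shortening the input window and dropping exogenous variables are the first things to try.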