# neural-forecast
Isaac
I have a neuralforecast script that runs fine until prediction time, where I just get a `Killed` message. Any idea what could be causing it? Is it moving too much data to RAM?
@Marco any ideas? I've been trying to get this script working for a few weeks to no avail.
Kin Gtz. Olivares
I think we have a tough memory leak in the inference code @Cristian (Nixtla) @Isaac @Marco. My intuition is that we are using PyTorch Lightning's Trainer class in an unintended way, by using it inside the models through their fit and predict methods. My belief is that if we switch Model.fit to call trainer.fit, and Model.predict to call trainer.predict, we might be able to solve this. The overall problem is that we might have a crazy recursion: Trainer(Model(Trainer))
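A minimal sketch of the delegation pattern described above, not neuralforecast's actual code (the class and argument names here are illustrative): the model never keeps a Trainer as an attribute; fit and predict build a short-lived Trainer and hand the model to trainer.fit / trainer.predict, so no Trainer(Model(Trainer)) cycle keeps references alive after inference.
```python
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset


class TinyWindowsModel(pl.LightningModule):
    """Toy model: fit/predict create a short-lived Trainer instead of storing one."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self(x), y)

    def predict_step(self, batch, batch_idx):
        x, _ = batch
        return self(x)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

    # Delegation: no Trainer is stored on the model, so nothing like
    # Trainer(Model(Trainer)) survives past fit/predict.
    def fit(self, loader):
        trainer = pl.Trainer(max_epochs=1, logger=False, enable_checkpointing=False)
        trainer.fit(self, train_dataloaders=loader)

    def predict(self, loader):
        trainer = pl.Trainer(logger=False, enable_checkpointing=False)
        return trainer.predict(self, dataloaders=loader)


# Usage with random data
xs, ys = torch.randn(64, 8), torch.randn(64, 1)
loader = DataLoader(TensorDataset(xs, ys), batch_size=16)
model = TinyWindowsModel()
model.fit(loader)
preds = model.predict(loader)  # list of per-batch tensors
```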
Cristian (Nixtla)
@José Morales
Thanks for your insights @Kin Gtz. Olivares. What you are mentioning should only apply to the multi-GPU case. Is that your case, Isaac? If you are using a single GPU/CPU, the more likely cause is running out of memory. Isaac, have you tried reducing `inference_windows_batch_size`? For the multi-GPU case, we will release a new feature next week for optimized distributed training with Spark.
k
You can also reduce the `validation_batch_size` @Isaac. That way you keep your memory usage constrained.
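A minimal sketch of both suggestions, assuming a windows-based model such as NHITS: both batch-size knobs are passed at model construction. Exact argument names depend on the neuralforecast version; the chat says `validation_batch_size`, while recent releases expose it as `valid_batch_size`.
```python
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

models = [
    NHITS(
        h=24,
        input_size=48,
        max_steps=100,
        inference_windows_batch_size=256,  # fewer windows per forward pass at predict time
        valid_batch_size=128,              # smaller batches during validation
    )
]
nf = NeuralForecast(models=models, freq="H")
# nf.fit(df=train_df)        # train_df with columns: unique_id, ds, y
# forecasts = nf.predict()
```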
Another thing @Cristian (Nixtla), we might need to add
```python
fcsts = [fcst.detach().cpu() for fcst in fcsts]  # detach each batch output from the graph, move to CPU
fcsts = torch.vstack(fcsts).numpy().flatten()
```
https://github.com/Nixtla/neuralforecast/blob/main/neuralforecast/common/_base_windows.py#L720 The detach operation blocks gradient tracking, and it has fixed a memory leak for me in the past.
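A standalone toy example (generic PyTorch, not neuralforecast's code) of the pattern above: per-batch outputs accumulated without detach each keep a reference to their autograd graph; detaching and moving them to CPU before stacking lets that memory be released.
```python
import torch

model = torch.nn.Linear(10, 1)
batches = [torch.randn(32, 10) for _ in range(100)]

fcsts = []
for batch in batches:
    out = model(batch)                # output carries a grad_fn, i.e. a reference to the graph
    fcsts.append(out.detach().cpu())  # detach + move to CPU so the graph can be freed

fcsts = torch.vstack(fcsts).numpy().flatten()
print(fcsts.shape)  # (3200,)
```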