Good afternoon. I've noticed that inference time varies from 10 seconds to 20 seconds given exactly the same payload. What could be causing this?
José Morales
07/04/2024, 5:56 PM
We use serverless infra, and the cold start takes 10-15 s, so the slower requests are most likely hitting a cold start.
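One way to confirm the cold-start explanation is to collect latencies for repeated identical requests and split them into a warm baseline and cold-start outliers. Below is a minimal sketch; the sample latencies and the 5-second gap threshold are illustrative assumptions, not measurements from this deployment:

```python
import statistics

# Hypothetical latency samples (seconds) for identical payloads.
latencies = [10.2, 9.8, 20.1, 10.5, 19.7, 10.0]

def split_cold_starts(samples, gap=5.0):
    """Partition samples into warm and cold groups: anything more than
    `gap` seconds above the fastest observed request is treated as a
    probable cold start."""
    baseline = min(samples)
    warm = [s for s in samples if s - baseline <= gap]
    cold = [s for s in samples if s - baseline > gap]
    return warm, cold

warm, cold = split_cold_starts(latencies)
print(f"warm median: {statistics.median(warm):.1f}s, "
      f"cold-start overhead: "
      f"{statistics.median(cold) - statistics.median(warm):.1f}s")
```

If the slow group clusters tightly at roughly baseline plus the provider's documented cold-start time, that supports the cold-start hypothesis over, say, payload-dependent compute.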