#StackBounty: #python #pytorch Pytorch inference CUDA out of memory when multiprocessing

Bounty: 50

To fully utilize CPU/GPU I run several processes that do DNN inference (feed forward) on separate datasets. Since the processes allocate CUDA memory during the feed forward I’m getting a CUDA out of memory error. To mitigate this I added torch.cuda.empty_cache() call which made things better. However, there are still occasional out of memory errors. Probably due to bad allocation/release timing.

I managed to solve the problem by adding a multiprocessing.BoundedSemaphore around the feed forward call but this introduces difficulties in initializing and sharing the semaphore between the processes.

Is there a better way to avoid this kind of errors while running multiple GPU inference processes?

Get this bounty!!!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.