Thanks for the code
We ran into issues running this code on multiple GPUs with the `-p` argument: the stream does not match the device, and if it does match, the program deadlocks.
I suggest the following:
- The stream should be initialised after the device
- `pynvml.nvmlInit()` should only be called once, rather than in every thread
- Avoid race conditions on `stop_flag`
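
To make the last two points concrete, here is a rough sketch of the pattern I have in mind, not the exact code from my fork. The names `init_once`, `worker` and `run` are made up for illustration, and `init_fn` stands in for `pynvml.nvmlInit`, so the snippet runs without a GPU. It assumes `stop_flag` can be a `threading.Event` instead of a plain shared variable.

```python
import threading

_init_lock = threading.Lock()
_initialised = False

# An Event gives thread-safe set/check semantics for the stop signal.
stop_flag = threading.Event()


def init_once(init_fn):
    """Call init_fn a single time, however many threads request it."""
    global _initialised
    with _init_lock:
        if not _initialised:
            init_fn()  # in real use this would be pynvml.nvmlInit (assumption)
            _initialised = True


def worker(device_id, init_fn, finished):
    init_once(init_fn)
    # Real code would select the device here and only then create the
    # stream on it; that part is framework-specific and omitted.
    while not stop_flag.is_set():
        stop_flag.wait(0.01)  # placeholder for one unit of GPU work
    finished.append(device_id)


def run(num_devices, init_fn):
    """Start one worker per device, signal stop, and wait for all of them."""
    stop_flag.clear()
    finished = []
    threads = [
        threading.Thread(target=worker, args=(i, init_fn, finished))
        for i in range(num_devices)
    ]
    for t in threads:
        t.start()
    stop_flag.set()
    for t in threads:
        t.join()
    return sorted(finished)
```

Calling `run(4, pynvml.nvmlInit)` would be the intended usage under these assumptions; the lock guarantees a single initialisation and the `Event` removes the need for ad hoc synchronisation around the flag.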
I've made those fixes on my fork. Let me know what you think and I'll be happy to make changes or a PR