
rocm-smi failing when using WSL #135

@BrunoBerger


When running any darknet command in WSL with ROCm, the program fails inside Darknet::show_rocm_info():

$ darknet version

Darknet V5 "Moonlit" v5.0-157-gc2d2a06b
AMD ROCm v6.4.2.0-120-e7d83f5
- status #8: RSMI_STATUS_INIT_ERROR: An error occurred during initialization, during monitor discovery or when when initializing internal data structures
- status #0: RSMI_STATUS_SUCCESS: The function has been executed successfully.
AMD GPU not detected!
OpenCV 4.5.4, Ubuntu 22.04, wsl

I think this is due to rocm-smi not being supported in WSL:
https://rocm.docs.amd.com/projects/radeon/en/latest/docs/limitations.html#rocm-support-in-wsl-environments

When I skipped the rsmi_* calls, the program worked normally and inference ran on the GPU.

I recommend skipping this automatically if the program is detected to be running in WSL, like I did in this fork:
https://github.com/BrunoBerger/darknet/tree/wsl-rocm-fix
using code from:
https://github.com/scivision/detect-windows-subsystem-for-linux
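For illustration, a minimal sketch of one common way to detect WSL. This is not the exact code from the fork or from the linked repository. It assumes the kernel release string in /proc/sys/kernel/osrelease contains "microsoft" or "WSL", which is the case for stock WSL1 and WSL2 kernels but may not hold for custom-built kernels. The function names looks_like_wsl() and is_running_in_wsl() are hypothetical and do not exist in Darknet.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <string>

// Hypothetical helper (not part of Darknet): returns true if the given kernel
// release string looks like it comes from a WSL kernel.
// Assumption: stock WSL kernels report something like
// "4.4.0-19041-Microsoft" (WSL1) or "5.15.153.1-microsoft-standard-WSL2" (WSL2).
bool looks_like_wsl(std::string kernel_release)
{
    // Lowercase the string so the match is case-insensitive.
    std::transform(kernel_release.begin(), kernel_release.end(), kernel_release.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    return kernel_release.find("microsoft") != std::string::npos
        || kernel_release.find("wsl") != std::string::npos;
}

// Hypothetical helper (not part of Darknet): reads the kernel release from
// procfs and applies the check above. Returns false if the file cannot be
// read, e.g. on non-Linux systems.
bool is_running_in_wsl()
{
    std::ifstream ifs("/proc/sys/kernel/osrelease");
    std::string release;
    if (!std::getline(ifs, release))
    {
        return false;
    }
    return looks_like_wsl(release);
}
```

With something like this, Darknet::show_rocm_info() could return early when is_running_in_wsl() is true, before any rsmi_* call is made. Splitting out the string check makes it possible to test the non-WSL case without a second machine.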

I saw that you already check for alternative operating systems later in Darknet::show_version_info(), which prints "wsl" for me, as shown in the output above.
I was not sure how to reliably verify the skip with that method, since I have no way to test the non-WSL case, so I did not open a pull request for this.
I am available if you ever want to test something on WSL with ROCm.

Labels: bug (Something isn't working)