Is CUDA 11 absolutely required?

Hello,

I’ve been trying to run “make test-submission”, but it fails with:

could not select device driver "" with capabilities: [[gpu]].

I have reinstalled my CUDA driver (12.4.1), nvidia-utils (550.120-1), as well as nvidia-container-toolkit (1.16.1-3) but nothing so far seems to work.

So I’m wondering if the problem is my CUDA version, because the runtime GitHub repo states that CUDA 11 is required. Can someone confirm this for me? I would really appreciate it.

System Information:
Operating System: Manjaro Linux
KDE Plasma Version: 6.1.5
KDE Frameworks Version: 6.6.0
Qt Version: 6.7.2
Kernel Version: 6.11.2-4-MANJARO (64-bit)
Graphics Platform: X11
GPU: Nvidia RTX 2060 Super

Hi @Kaungkhantko,

Based on my understanding, NVIDIA drivers and the NVIDIA Container Toolkit are backwards compatible: the relatively recent versions you have installed on the host should support an older CUDA runtime library in the container, such as 11.8.

Can you provide more logs or more information about what is writing out the error message that you’ve shown?

Can you also confirm that the image you are using is the GPU version of the image? Is it a locally built image, or is it from make pull? When you run make test-submission, I believe it should print out the name of the image before it starts the container.
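
If it helps with gathering that information, here is a minimal sketch of a script that prints the host driver version, the Container Toolkit version, and the runtimes Docker knows about. The report helper is a name made up for this example, and the script assumes nvidia-smi, nvidia-ctk, and docker are the relevant tools on your system. Anything that is missing is reported as not found instead of stopping the script.

```shell
#!/bin/sh
# Sketch: collect basic GPU/Docker facts to paste into a reply.
# "report" is a hypothetical helper: it prints a label, then runs the
# command that follows, or says the tool is missing.
report() {
  label=$1
  shift
  if command -v "$1" >/dev/null 2>&1; then
    echo "== $label"
    # Show stderr too, and keep going if the command fails.
    "$@" 2>&1 || echo "(command exited with status $?)"
  else
    echo "== $label: $1 not found"
  fi
}

report "Host GPU and driver" nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
report "Container Toolkit" nvidia-ctk --version
report "Docker runtimes" docker info --format '{{json .Runtimes}}'
```

Pasting the full output of this, together with the complete output of make test-submission, should make it easier to see where the two cases differ.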


This is what “make test-submission” prints out:

Using image: cdc-narratives-competition:gpu-local (3d25cf77c6ba)

NAME(S)
cdc-narratives-competition:gpu-local

Available official images:

REPOSITORY                                           TAG         IMAGE ID      CREATED     SIZE
cdcnarratives.azurecr.io/cdc-narratives-competition  gpu-latest  081ea9282b0f  5 days ago  22.5GB

Available local images:

REPOSITORY                  TAG        IMAGE ID      CREATED     SIZE
cdc-narratives-competition  gpu-local  3d25cf77c6ba  3 days ago  18.9GB

mkdir -p submission/
chmod -R 0777 submission/
docker run
-it
--gpus all
--network none
-e LOGURU_LEVEL=INFO
-e IS_SMOKE_TEST=true
--mount type=bind,source=/home/kaung/youth-mental-health-runtime/data,target=/code_execution/data,readonly
--mount type=bind,source="/home/kaung/youth-mental-health-runtime/submission",target=/code_execution/submission
--shm-size 8g
--pid host
--name cdc-narratives-competition
--rm
3d25cf77c6ba
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
make: *** [Makefile:198: test-submission] Error 125

I’ve also recently tried running "sudo docker run --rm --gpus all cdcnarratives.azurecr.io/cdc-narratives-competition:gpu-latest" and got a different message:

+ main
+ tee /code_execution/submission/log.txt
tee: /code_execution/submission/log.txt: No such file or directory
+ expected_filename=main.py
+ cd /code_execution
++ zip -sf ./submission/submission.zip

I have a feeling that using the online image fixed the GPU detection issue, and that’s why I’m seeing a different error here.

Hi @Kaungkhantko,

It’s a good sign that the pulled cdcnarratives.azurecr.io/cdc-narratives-competition:gpu-latest image ran without a GPU error. It means that something specific to the first case is not working, but that your overall setup should be fine.

Since you have a local image built, the Makefile commands will default to using it. You’ll need to use the SUBMISSION_IMAGE environment variable to specify a different image, like:

SUBMISSION_IMAGE=cdcnarratives.azurecr.io/cdc-narratives-competition:gpu-latest make test-submission

(This is a long command, so make sure you grab the whole line.)

Alternatively, you can delete your local image.
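
A minimal sketch of that option, assuming the local image still carries the cdc-narratives-competition:gpu-local tag shown in your printout. It only removes the image if Docker can find it. You may need sudo, depending on how Docker is set up on your machine.

```shell
# Tag taken from the "Available local images" listing in the make output.
LOCAL_IMAGE="cdc-narratives-competition:gpu-local"

# Only attempt removal when Docker is installed and the image exists.
if command -v docker >/dev/null 2>&1 && docker image inspect "$LOCAL_IMAGE" >/dev/null 2>&1; then
  docker image rm "$LOCAL_IMAGE"
else
  echo "Docker or image $LOCAL_IMAGE not found; nothing removed."
fi
```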

The reason you’re getting a different error is that the image expects several mounted directories, which your sudo docker run command does not provide. These are the lines you see in the make test-submission printout:

--mount type=bind,source=/home/kaung/youth-mental-health-runtime/data,target=/code_execution/data,readonly
--mount type=bind,source="/home/kaung/youth-mental-health-runtime/submission",target=/code_execution/submission
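
If you do want to run the container by hand, a rough sketch would be to add those two mounts to your command. The run_submission function name and the REPO, IMAGE, and RUN variables are made up for this example, and the path is the one from your printout. This is a trimmed version of what make test-submission runs (it leaves out -it, the -e variables, --shm-size, --pid, and --name), so the Makefile remains the more reliable route. By default it only prints the command; set RUN=1 to execute it.

```shell
# Paths and image name are taken from the thread; adjust for your machine.
REPO="${REPO:-/home/kaung/youth-mental-health-runtime}"
IMAGE="${IMAGE:-cdcnarratives.azurecr.io/cdc-narratives-competition:gpu-latest}"

run_submission() {
  # Build the argument list with "set --" so each argument stays intact.
  set -- docker run --rm --gpus all --network none \
    --mount "type=bind,source=$REPO/data,target=/code_execution/data,readonly" \
    --mount "type=bind,source=$REPO/submission,target=/code_execution/submission" \
    "$IMAGE"

  if [ "${RUN:-0}" = "1" ]; then
    "$@"        # actually start the container
  else
    echo "$*"   # dry run: just show the command
  fi
}

run_submission
```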