vLLM + CUDA mismatch

Just a quick note after seeing this error pop up in the environment while trying to perform a smoke test:

```
ImportError: undefined symbol: _ZN3c104cuda9SetDeviceEi
```

This error means vLLM was built against a different PyTorch/CUDA combination than the one actually installed. Do your vLLM, PyTorch, and CUDA versions all match?
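A quick way to confirm a mismatch before digging into symbol errors is to compare the installed distribution versions directly. A minimal sketch using only the standard library (the package names are the usual PyPI distribution names):

```python
# Report which of the relevant distributions are installed, and at what version.
from importlib.metadata import PackageNotFoundError, version

def installed_versions(packages=("torch", "vllm")):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for pkg in packages:
        try:
            found[pkg] = version(pkg)
        except PackageNotFoundError:
            found[pkg] = None
    return found

for pkg, ver in installed_versions().items():
    print(f"{pkg}: {ver or 'not installed'}")
```

PyTorch wheels encode their CUDA build in the local version tag (e.g. `2.9.0+cu126`), so the torch version string alone is often enough to spot the conflict with the vLLM wheel.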

@amorarc Thanks for flagging this. After investigating, we have decided that we won’t be supporting vLLM in the competition runtime. The runtime has been updated.

vLLM is tightly coupled to specific CUDA and PyTorch versions, and the available wheels are not compatible with our CUDA 12.6 / PyTorch 2.9.0 environment. We recommend using a different accelerated LLM inference framework if you would like to use one.


The vLLM framework is fully compatible with PyTorch 2.9.0 and CUDA 12.6. The current errors are likely due to the outdated vLLM 0.2.5 pinned at commit 433db53. To ensure proper model acceleration and stability, I kindly request that the organizers upgrade vLLM to a version between 0.14.0 and 0.16.0.

@decem Thanks for this suggestion! If you’re able, please submit a PR to the runtime repository with the proposed vLLM upgrade and we’ll be happy to take a look and review it.

My local hardware is a Blackwell-architecture GPU, which does not support CUDA 12.6, so I’m afraid I won’t be able to submit a correct pull request.

Yeah, I’ve run into the same issue. Updating the eval environment’s CUDA version to 12.8 would be nice so that people on Blackwell can run an identical container, although it sounds like that might not be possible. I’ve just been using a slightly different Docker definition to run things locally.
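For anyone else on Blackwell, a local variant might look roughly like this. The base image tag and the idea of swapping only the CUDA layer are assumptions on my part, not the official runtime definition:

```dockerfile
# Hypothetical local override -- NOT the official competition image.
# Swaps the CUDA 12.6 base for 12.8 so Blackwell GPUs can run the container.
FROM nvidia/cuda:12.8.0-runtime-ubuntu22.04

# The rest of the setup would mirror the official runtime Dockerfile.
RUN apt-get update && apt-get install -y python3 python3-pip
```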


@cszc Hi! I’ve just submitted **PR #5** to re-enable vLLM (v0.14.0) support.

Contrary to my previous concern, I managed to verify this configuration on an RTX 4090 with CUDA 12.6. While I understand the competition runtime uses A100 GPUs, the software stack (CUDA/PyTorch) is now aligned, and the ImportError / symbol issues are resolved on my end.

I’ve addressed the dependency conflicts between vLLM, NeMo (protobuf), and cuda-python using specific uv overrides, as detailed in the PR’s changelog. Could you please run a final smoke test on your A100 environment? If it passes, this will restore high-performance inference support for all participants.

Thank you for your time and for reviewing the PR!


@cszc I have a significant update on PR #5.

While running further validation with the competition’s `pytest tests/test_imports.py` suite in my CUDA 12.6 environment, I identified and fixed a critical protobuf version conflict that would have broken wandb and NeMo.
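The import checks boil down to a few lines. This is only a sketch of the idea, not the actual `tests/test_imports.py`, and the module list is illustrative:

```python
# Smoke-test that each module imports cleanly; collect the ones that don't.
import importlib

def failing_imports(modules):
    """Return the subset of module names that raise ImportError on import."""
    failures = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            failures.append(name)
    return failures

# Illustrative list; the real suite covers the models named below.
print(failing_imports(["torch", "torchaudio", "vllm"]))
```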

The Issue:

  • vLLM metadata requests protobuf>=6.30.0 (resolving to 7.x), but NeMo and wandb use pre-generated code only compatible with the protobuf 5.x API. Version 6.x+ introduced breaking changes in the generated code format.
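The constraint above can be expressed as a simple guard. This is a sketch; the 5.x bound comes from this thread, not from the libraries’ own metadata, and the helper name is mine:

```python
# NeMo/wandb ship pre-generated protobuf code that needs a 5.x runtime;
# check an installed protobuf version string against that expectation.
def protobuf_runtime_ok(version_string, required_major=5):
    """True if the runtime major version matches what the generated code expects."""
    return int(version_string.split(".")[0]) == required_major
```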

The Fix:

  • I’ve updated the uv override for protobuf to `protobuf>=5.29.5,<5.30`.
  • This satisfies NeMo’s hard constraints and ensures wandb works correctly.
  • Note: vLLM inference works perfectly with protobuf 5.29.x (the 6.x requirement is only for its gRPC server features, which aren’t used in this runtime).
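In `pyproject.toml`, the override described above would look roughly like this; the exact section layout in the runtime repo is an assumption, so see PR #5 for the actual change:

```toml
# Hypothetical placement of the uv override pinning protobuf to 5.29.x.
[tool.uv]
override-dependencies = [
    "protobuf>=5.29.5,<5.30",
]
```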

Verification Results:

All 9 core tests now pass successfully in the CUDA 12.6 environment:

  • test_pytorch, test_torchaudio, test_whisper
  • test_canary_qwen, test_granite_speech, test_phi_4_multimodal
  • test_parakeet_tdt, test_wav2vec, test_qwen3_asr

The PR is now fully “battle-tested” and ready for your review. Could you please help verify it on your A100 infrastructure?

Thank you!
