Memory Corruption Vulnerability in the vLLM Inference Engine (Vendor: vLLM)
CVE-2025-62164
8.8 HIGH
What is CVE-2025-62164?
vLLM, an inference and serving engine for large language models, contains a memory corruption vulnerability in its Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint deserializes tensors with torch.load() without sufficient validation. Because of a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default, so a maliciously crafted tensor can bypass internal bounds checks and trigger an out-of-bounds memory write when it is converted with to_dense(). The result is a crash (denial of service) or, potentially, remote code execution on the server hosting vLLM. The issue affects vLLM versions from 0.10.2 up to, but not including, 0.11.1; users should upgrade to 0.11.1, which contains the fix.
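As an illustration of the kind of validation the description refers to, the following is a minimal sketch of loading an untrusted serialized tensor defensively. It is not the vLLM patch. The function name `load_prompt_embeds` and the dtype allow-list are hypothetical, chosen only for this example. The sketch assumes that re-enabling PyTorch's sparse tensor invariant checks around the load and the dense conversion is an appropriate safeguard.

```python
import io

import torch

# Hypothetical allow-list for this sketch; a real service would pick its own.
ALLOWED_DTYPES = (torch.float32, torch.float16, torch.bfloat16)


def load_prompt_embeds(raw: bytes) -> torch.Tensor:
    """Deserialize an untrusted tensor with extra validation (illustrative only)."""
    # Re-enable the sparse tensor invariant checks for the duration of the
    # load and the dense conversion, so malformed sparse tensors are rejected.
    with torch.sparse.check_sparse_tensor_invariants():
        obj = torch.load(
            io.BytesIO(raw),
            weights_only=True,    # restrict unpickling to tensor-related types
            map_location="cpu",   # never place untrusted data on an accelerator
        )
        if not isinstance(obj, torch.Tensor):
            raise ValueError("payload is not a tensor")
        if obj.dtype not in ALLOWED_DTYPES:
            raise ValueError(f"unsupported dtype: {obj.dtype}")
        # For an already-dense (strided) tensor this is a no-op.
        return obj.to_dense()
```

A caller would pass the decoded request payload, for example `load_prompt_embeds(payload_bytes)`, and treat any raised exception as a rejected request. Upgrading to 0.11.1 remains the actual remediation; this sketch only shows the shape of the check.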
Affected Version(s)
vllm >= 0.10.2, < 0.11.1
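
The affected range can be checked programmatically. The sketch below uses only the Python standard library; the helper names (`parse_release`, `is_affected`, `installed_vllm_is_affected`) are hypothetical. It compares only the numeric release segment, so pre-release or local suffixes such as `rc1` or `+cu128` are ignored, which is a simplification rather than full PEP 440 ordering.

```python
from __future__ import annotations

import re
from importlib import metadata

FIRST_AFFECTED = (0, 10, 2)  # inclusive lower bound from the advisory
FIRST_FIXED = (0, 11, 1)     # exclusive upper bound (the fixed release)


def parse_release(version: str) -> tuple[int, ...]:
    """Extract the leading numeric release segment, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def is_affected(version: str) -> bool:
    """Return True if the version satisfies >= 0.10.2 and < 0.11.1."""
    return FIRST_AFFECTED <= parse_release(version) < FIRST_FIXED


def installed_vllm_is_affected() -> bool | None:
    """Check the locally installed vllm package; None if it is not installed."""
    try:
        return is_affected(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_vllm_is_affected()` in the serving environment gives a quick first check; a result of `True` indicates that an upgrade to 0.11.1 or later is needed.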
