PagedAttention Timing Side Channel in vLLM Inference Engine
CVE-2025-46570
2.6 LOW
What is CVE-2025-46570?
Versions of the vLLM inference engine before 0.9.0 are susceptible to a timing attack that stems from how the PagedAttention mechanism (written "PageAttention" in some advisories) handles prompt prefixes. When a new prompt matches a previously processed prefix, the engine reuses the earlier work and the prefill step finishes faster. The resulting timing difference is large enough for a client to measure, so an attacker who can submit prompts and time the responses may be able to infer whether a given prefix has been processed before. The issue was corrected in version 0.9.0. Users are strongly advised to update to 0.9.0 or later.
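The mechanism can be illustrated with a toy model. The sketch below is not vLLM code: the class ToyPrefixCache, the block size, and the use of "blocks computed" as a stand-in for prefill time are all assumptions made for illustration. It only shows why a probe that shares a prefix with an earlier prompt costs less work than one that does not.

```python
# Toy model only. Nothing here is taken from vLLM; names and sizes are hypothetical.
BLOCK_SIZE = 4  # tokens per cached block (assumed value)


class ToyPrefixCache:
    """Caches full prefix blocks and reports how many blocks had to be computed."""

    def __init__(self):
        self.blocks = set()

    def prefill_cost(self, tokens):
        # Cost is the number of full blocks computed, used as a proxy for prefill time.
        cost = 0
        still_matching = True
        for end in range(BLOCK_SIZE, len(tokens) + 1, BLOCK_SIZE):
            key = tuple(tokens[:end])
            if still_matching and key in self.blocks:
                continue  # cache hit: this block is reused, no work done
            still_matching = False  # after the first miss, every later block is computed
            self.blocks.add(key)
            cost += 1
        return cost


cache = ToyPrefixCache()
earlier_prompt = list(range(16))                      # four blocks, all computed
first_cost = cache.prefill_cost(earlier_prompt)

probe_shared = list(range(8)) + list(range(100, 108))  # first two blocks match
probe_fresh = list(range(200, 216))                    # nothing matches
shared_cost = cache.prefill_cost(probe_shared)
fresh_cost = cache.prefill_cost(probe_fresh)

print(first_cost, shared_cost, fresh_cost)
```

In this model the probe with the shared prefix computes fewer blocks than the fresh probe. In a real deployment the analogous signal would be a difference in response latency rather than a block count.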
Affected Version(s)
vllm < 0.9.0
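A minimal sketch of how an installation might be checked against the affected range, using only the Python standard library. The helper names (parse_release, is_patched, installed_vllm_is_patched) are hypothetical, and the parser is deliberately simple: it compares only the leading numeric X.Y.Z part, so a pre-release such as 0.9.0rc1 would be treated as 0.9.0.

```python
from importlib import metadata

FIXED = (0, 9, 0)  # first fixed release according to this advisory


def parse_release(version):
    """Return the leading numeric (major, minor, patch) tuple of a version string."""
    parts = []
    for piece in version.split("+")[0].split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version):
    """True if the version is at or above the fixed release."""
    return parse_release(version) >= FIXED


def installed_vllm_is_patched():
    """Return True/False for the installed vllm package, or None if it is not installed."""
    try:
        return is_patched(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

Calling installed_vllm_is_patched() in the environment that serves the model returns False for an affected installation.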