6.5 MEDIUM
- CVSS version: 3.1
- Attack vector (AV): NETWORK
- Attack complexity (AC): LOW
- Privileges required (PR): LOW
- User interaction (UI): NONE
- Scope (S): UNCHANGED
- Confidentiality impact (C): NONE
- Integrity impact (I): NONE
- Availability impact (A): HIGH
vLLM Affected by Unauthenticated OOM Denial of Service via Unbounded `n` Parameter in OpenAI API Server
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of an upper bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large n value. This completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies in the heap before the request even reaches the scheduling queue. This vulnerability is fixed in 0.19.0.
References
-
https://github.com/vllm-project/vllm/security/advisories/GHSA-3mwp-wvh9-7528 x_refsource_CONFIRM
-
https://github.com/vllm-project/vllm/pull/37952 x_refsource_MISC
Affected products
- ==>= 0.1.0, < 0.19.0
Matching in nixpkgs
pkgs.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.pkgsRocm.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.python312Packages.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.python313Packages.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
Package maintainers
-
@happysalada Raphael Megzari <raphael@megzari.com>
-
@CertainLach Yaroslav Bolyukin <iam@lach.pw>
-
@daniel-fahey Daniel Fahey <daniel.fahey+nixpkgs@pm.me>