5.9 MEDIUM
- CVSS version: 3.1
- Attack vector (AV): NETWORK
- Attack complexity (AC): HIGH
- Privileges required (PR): LOW
- User interaction (UI): NONE
- Scope (S): UNCHANGED
- Confidentiality impact (C): NONE
- Integrity impact (I): HIGH
- Availability impact (A): LOW
vLLM: Downmix Implementation Differences as Attack Vectors Against Audio AI Models
vLLM is an inference and serving engine for large language models (LLMs). From version 0.5.5 up to but not including version 0.18.0, Librosa defaults to numpy.mean for mono downmixing (to_mono), while the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy causes an inconsistency between the audio heard by humans (e.g., through headphones or regular speakers) and the audio processed by AI models in frameworks that rely on Librosa, such as vLLM and Transformers. This issue has been patched in version 0.18.0.
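The discrepancy can be illustrated with a minimal sketch: a plain average treats every channel equally, whereas a weighted downmix (BS.775-style coefficients such as ~0.707 for the centre channel; the exact weights here are illustrative, not quoted from the standard) produces a different mono signal from the same input.

```python
import numpy as np

def mean_downmix(y):
    """Plain-average mono downmix across channels, as numpy.mean does
    (the librosa.to_mono default described in this advisory)."""
    return np.mean(y, axis=0)

def weighted_downmix(y, weights):
    """Weighted mono downmix with per-channel coefficients, normalized
    by the weight sum. Weights here are illustrative stand-ins for
    ITU-R BS.775-style coefficients."""
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * y).sum(axis=0) / w.sum()

# Illustrative 3-channel (L, R, C) signal where only the centre
# channel carries energy: the two downmixes disagree.
y = np.zeros((3, 4))
y[2] = 1.0  # centre channel

print(mean_downmix(y))                          # every channel weighted 1/3
print(weighted_downmix(y, [1.0, 1.0, 0.7071]))  # centre attenuated differently
```

An attacker who controls the input audio can exploit this gap by crafting content that sounds benign under one downmix but carries different content under the other.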
References
- https://github.com/vllm-project/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8 x_refsource_CONFIRM
- https://github.com/vllm-project/vllm/pull/37058 x_refsource_MISC
- https://github.com/vllm-project/vllm/releases/tag/v0.18.0 x_refsource_MISC
Affected products
- >= 0.5.5, < 0.18.0
Matching in nixpkgs
pkgs.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.pkgsRocm.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.python312Packages.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
pkgs.python313Packages.vllm
High-throughput and memory-efficient inference and serving engine for LLMs
Package maintainers
- @happysalada Raphael Megzari <raphael@megzari.com>
- @CertainLach Yaroslav Bolyukin <iam@lach.pw>
- @daniel-fahey Daniel Fahey <daniel.fahey+nixpkgs@pm.me>