Headline

GHSA-x3m8-f7g5-qhm7: vLLM Allows Remote Code Execution via Mooncake Integration

Summary

When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces allows attackers to execute arbitrary code remotely on distributed hosts.

Details

Pickle deserialization vulnerabilities are well documented.
The Mooncake pipe is exposed over the network using ZMQ over TCP (by design, to enable disaggregated prefilling across distributed environments), which greatly increases exploitability. Further, the Mooncake integration opens these sockets listening on all interfaces on the host, so it cannot be configured to use only a private, trusted network.
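To make the "all interfaces" point concrete, here is a minimal sketch using plain Python sockets rather than vLLM's or Mooncake's actual code. The helper name `bind_listener` is hypothetical. It contrasts a wildcard bind with a loopback-only bind. With ZMQ over TCP, the analogous choice is between an endpoint such as `tcp://*:PORT` and one that names a specific interface address.

```python
import socket


def bind_listener(host: str) -> socket.socket:
    """Bind a TCP listener to `host` on a free port chosen by the OS."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any free port
    sock.listen(1)
    return sock


# Wildcard address: accepts connections arriving on every IPv4 interface.
exposed = bind_listener("0.0.0.0")
# Loopback address: accepts connections from the local host only.
restricted = bind_listener("127.0.0.1")

exposed_addr = exposed.getsockname()[0]
restricted_addr = restricted.getsockname()[0]
print("wildcard listener bound to:", exposed_addr)
print("loopback listener bound to:", restricted_addr)

exposed.close()
restricted.close()
```

A service that always binds the wildcard address leaves operators no way to limit exposure from within the application, so they have to rely on external controls such as host firewalls.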
The root problem is that recv_tensor() calls _recv_impl(), which passes the raw network bytes to pickle.loads(). Additionally, there do not appear to be any controls (network restrictions, authentication, etc.) to prevent arbitrary users from sending a malicious payload to the affected service.
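The reason pickle.loads() is dangerous on untrusted bytes can be shown with a deliberately harmless sketch. The class name `Payload` is hypothetical, and the callable used here is the built-in `len`. A pickle stream can instruct the loader to call an importable callable with chosen arguments, so whoever controls the bytes controls what runs during deserialization.

```python
import pickle


class Payload:
    """Hypothetical stand-in for data crafted by an untrusted sender."""

    def __reduce__(self):
        # pickle records "call len('abc')" in the stream and replays that
        # call when the stream is loaded. A real attacker would choose a
        # far more dangerous callable than len.
        return (len, ("abc",))


wire_bytes = pickle.dumps(Payload())

# The receiver never gets a Payload object back; it gets the result of
# the call that the stream told it to make.
result = pickle.loads(wire_bytes)
print(type(result).__name__, result)  # int 3
```

The receiver does not need the `Payload` class to exist on its side, because the stream refers only to the callable and its arguments.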

Impact

This is a remote code execution vulnerability affecting any deployment that uses Mooncake to distribute the KV cache across hosts.

Remediation

This issue is resolved by https://github.com/vllm-project/vllm/pull/14228
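As a general illustration of the mitigation class, and not a description of what the linked pull request does, the sketch below replaces pickle with a data-only wire format: a length-prefixed JSON header followed by raw bytes. The function names `encode_tensor` and `decode_tensor`, the header fields, and the dtype allow-list are all assumptions made for this example. Decoding only parses data and validates it, so no sender-chosen callable is ever invoked.

```python
import json
import struct

# Assumed allow-list for this sketch: dtype name -> bytes per element.
ALLOWED_DTYPES = {"float16": 2, "float32": 4}


def encode_tensor(shape, dtype, raw):
    """Frame raw tensor bytes as: 4-byte header length, JSON header, payload."""
    header = json.dumps({"shape": list(shape), "dtype": dtype}).encode("utf-8")
    return struct.pack("<I", len(header)) + header + raw


def decode_tensor(message):
    """Parse and validate a framed message; raise ValueError if malformed."""
    if len(message) < 4:
        raise ValueError("message too short")
    (header_len,) = struct.unpack_from("<I", message, 0)
    if header_len > len(message) - 4:
        raise ValueError("header length exceeds message size")
    try:
        header = json.loads(message[4:4 + header_len].decode("utf-8"))
    except ValueError as exc:  # covers JSON and UTF-8 decoding errors
        raise ValueError("invalid header") from exc
    if not isinstance(header, dict):
        raise ValueError("header must be a JSON object")
    shape, dtype = header.get("shape"), header.get("dtype")
    if dtype not in ALLOWED_DTYPES:
        raise ValueError("unsupported dtype")
    if not isinstance(shape, list) or not all(
        isinstance(d, int) and not isinstance(d, bool) and d >= 0 for d in shape
    ):
        raise ValueError("invalid shape")
    raw = message[4 + header_len:]
    expected = ALLOWED_DTYPES[dtype]
    for dim in shape:
        expected *= dim
    if len(raw) != expected:
        raise ValueError("payload size does not match shape and dtype")
    return tuple(shape), dtype, raw


payload = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
message = encode_tensor((2, 2), "float32", payload)
decoded = decode_tensor(message)
print(decoded[0], decoded[1], len(decoded[2]))
```

Switching formats addresses only the deserialization step. The missing authentication and the wildcard bind described under Details are separate concerns that a format change does not fix.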

CVE-2025-29783

Critical severity • GitHub Reviewed • Published Mar 19, 2025 in vllm-project/vllm • Updated Mar 19, 2025

Affected versions

< 0.8.0

References