CVE-2026-6607
Remote Resource Consumption in lm-sys FastChat Worker API Endpoint
Publication date: 2026-04-20
Last updated on: 2026-04-20
Assigner: VulDB
Description
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| lm-sys | fastchat | up to and including 0.2.36 |
Exploitability
| CWE ID | Description |
|---|---|
| CWE-404 | The product does not release or incorrectly releases a resource before it is made available for re-use. |
| CWE-400 | The product does not properly control the allocation and maintenance of a limited resource. |
AI Powered Q&A
How can this vulnerability be detected on my network or system? Can you suggest some commands?
This vulnerability causes the FastChat model worker's Python asyncio event loop to block during synchronous GPU inference calls, leading to denial of service. Detection can focus on monitoring for symptoms such as unresponsive model workers, delayed or failed health checks, and worker deregistration events.
Specifically, requests to the vulnerable endpoints `/worker_generate` and `/worker_get_embeddings` can cause the event loop to freeze for extended periods (30-120+ seconds), which can be observed by comparing response times or health check delays.
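The blocking behaviour described above can be reproduced in isolation. The following is a minimal sketch, not FastChat code: `fake_inference` is a hypothetical stand-in for a synchronous GPU inference call, and `max_loop_gap` is an illustrative helper that measures how long the event loop goes without servicing a background ticker. Offloading with `asyncio.to_thread` is shown only as one common way to keep a loop responsive; the source does not state whether the patch takes this approach.

```python
import asyncio
import time


def fake_inference(seconds: float) -> str:
    # Hypothetical stand-in for a synchronous GPU inference call.
    time.sleep(seconds)
    return "done"


async def record_gaps(stop: asyncio.Event, interval: float = 0.01) -> list:
    # Tick every `interval` seconds and record the real gap between ticks.
    # A gap far larger than `interval` means the event loop was blocked.
    gaps = []
    last = time.perf_counter()
    while not stop.is_set():
        await asyncio.sleep(interval)
        now = time.perf_counter()
        gaps.append(now - last)
        last = now
    return gaps


async def max_loop_gap(blocking: bool, seconds: float = 0.3) -> float:
    stop = asyncio.Event()
    ticker = asyncio.create_task(record_gaps(stop))
    await asyncio.sleep(0.05)  # let the ticker run a few times first
    if blocking:
        # Runs on the event loop thread: nothing else can be scheduled.
        fake_inference(seconds)
    else:
        # Runs in a worker thread: the loop keeps servicing other tasks.
        await asyncio.to_thread(fake_inference, seconds)
    stop.set()
    return max(await ticker)


if __name__ == "__main__":
    print("blocking call, longest gap:", asyncio.run(max_loop_gap(True)))
    print("offloaded call, longest gap:", asyncio.run(max_loop_gap(False)))
```

In the blocking case the longest gap is roughly the duration of the simulated inference, which is the same mechanism that delays health checks on a real worker.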
To detect this vulnerability on your system, monitor the FastChat worker logs for frequent worker deregistration or health check failures, and measure response times for these endpoints.
The referenced resources give no explicit detection commands. A practical approach is to send test requests to `/worker_generate` and `/worker_get_embeddings` and measure how long each takes to respond.
- Use curl or a similar tool to send requests to the endpoints and observe the response times, for example:
  - `curl -v http://<fastchat-worker-host>/worker_generate -d '{"params": ...}'`
  - `curl -v http://<fastchat-worker-host>/worker_get_embeddings -d '{"params": ...}'`
If the event loop is blocked, these requests will take significantly longer to respond or cause the worker to become unresponsive.
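The same timing check can be scripted. This is a rough sketch using only the Python standard library; `time_post` is an illustrative helper name, and the URL and JSON payload are taken from the command line because the source does not specify the request parameters. Run it only against systems you are authorized to test.

```python
import json
import sys
import time
import urllib.error
import urllib.request


def time_post(url: str, payload: dict, timeout: float = 10.0):
    """POST a JSON payload and return (elapsed_seconds, status).

    status is the HTTP status code, or None if the request timed out
    or the connection failed.
    """
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # server answered, but with an error status
    except OSError:
        status = None  # timeout, refused connection, DNS failure, ...
    return time.perf_counter() - start, status


if __name__ == "__main__":
    # Usage: python probe.py <endpoint-url> '<json-payload>'
    target, raw_payload = sys.argv[1], sys.argv[2]
    elapsed, status = time_post(target, json.loads(raw_payload))
    print(f"{target}: status={status} elapsed={elapsed:.2f}s")
```

A `None` status or an elapsed time close to the timeout suggests the worker did not answer in time, which matches the symptom described above but does not by itself prove the cause.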
Additionally, monitoring the FastChat controller logs for worker deregistration events can indicate that a worker is freezing due to this vulnerability.
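Log monitoring of this kind can be approximated with a small counter. The sketch below makes a loud assumption: the exact wording of FastChat controller log messages is not given in the source, so the regular expression is supplied by the caller and the sample lines are invented purely for illustration. `count_events` is a hypothetical helper name.

```python
import re
from collections import Counter


def count_events(lines, pattern: str) -> Counter:
    """Count log lines matching `pattern`.

    If the pattern has a capture group, counts are keyed by the first
    group (for example a worker address); otherwise by the key "all".
    """
    rx = re.compile(pattern)
    counts = Counter()
    for line in lines:
        match = rx.search(line)
        if match:
            key = match.group(1) if rx.groups else "all"
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines; adjust the pattern to your real log format.
    sample = [
        "10:00:01 | controller | worker removed: http://worker-a:21002",
        "10:00:05 | controller | heartbeat ok: http://worker-b:21002",
        "10:03:11 | controller | worker removed: http://worker-a:21002",
    ]
    for worker, n in count_events(sample, r"worker removed: (\S+)").items():
        print(worker, n)
```

A worker that is deregistered repeatedly in a short period is a candidate for closer inspection with the timing checks described earlier.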
Can you explain this vulnerability to me?
This vulnerability affects lm-sys FastChat up to and including version 0.2.36, specifically the `api_generate` function of the Worker API Endpoint component. A remote attacker can send requests to this function that cause excessive resource consumption.
The vulnerability has been publicly disclosed and can be exploited remotely without authentication. A patch has been issued to fix this issue in the api_generate function, but other entry points may still be vulnerable.
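To check whether a locally installed copy falls in the affected range, something like the following sketch may help. It assumes the package is installed under the PyPI distribution name `fschat`; adjust the name if your installation differs. `parse_version` and `is_affected` are illustrative helpers, and the comparison only reflects the "up to and including 0.2.36" range stated here. A newer version number does not by itself confirm that the fix is included, and a source checkout may need to be checked against the patch commit instead.

```python
from importlib import metadata

# Last version listed as affected in this advisory.
LAST_AFFECTED = (0, 2, 36)


def parse_version(text: str) -> tuple:
    # Keep only the leading numeric components: "0.2.36.post1" -> (0, 2, 36)
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version_text: str) -> bool:
    # True when the version is at or below the last affected release.
    return parse_version(version_text) <= LAST_AFFECTED


if __name__ == "__main__":
    try:
        installed = metadata.version("fschat")  # assumed distribution name
    except metadata.PackageNotFoundError:
        print("fschat is not installed in this environment")
    else:
        state = "in the affected range" if is_affected(installed) else "above the affected range"
        print(f"fschat {installed} is {state}")
```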
How can this vulnerability impact me?
The primary impact of this vulnerability is resource consumption, which can lead to denial of service conditions. An attacker exploiting this issue could cause the affected system to become slow, unresponsive, or crash due to exhaustion of resources.
What immediate steps should I take to mitigate this vulnerability?
To mitigate this vulnerability, apply the patch in commit ff66426, which addresses the issue in the `api_generate` function of `base_model_worker.py`.
Applying this patch helps prevent remote resource consumption attacks through the Worker API Endpoint, although other entry points may still be vulnerable.
How does this vulnerability affect compliance with common standards and regulations (like GDPR, HIPAA)?
The available resources contain no information on how CVE-2026-6607 affects compliance with standards and regulations such as GDPR or HIPAA.