CVE-2026-10300
Status: Deferred - Pending Action
Remote Assertion Failure in SGLang via Inference HTTP Endpoint

Publication date: 2026-06-01

Last updated on: 2026-06-01

Assigner: VulDB

Description
A security vulnerability has been detected in SGLang 0.5.10.post1. Impacted is an unknown function of the file python/sglang/srt/lora/lora_manager.py of the component Inference HTTP Endpoint. Such manipulation of the argument lora_path leads to reachable assertion. The attack can be launched remotely. A high complexity level is associated with this attack. The exploitability is considered difficult. The exploit has been disclosed publicly and may be used. The pull request to fix this issue awaits acceptance.
Meta Information
Published: 2026-06-01
Last Modified: 2026-06-01
Generated: 2026-06-22
AI Q&A: 2026-06-02
EPSS Evaluated: 2026-06-21
Affected Vendors & Products
1 associated CPE:
Vendor: sgl_project
Product: sglang
Version / Range: 0.5.10.post1
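
To compare a local installation against the version listed above, a minimal Python sketch follows. It assumes the package is installed under the distribution name `sglang`; the helper names `is_listed_affected` and `installed_sglang_version` are hypothetical. Only 0.5.10.post1 is listed in this entry, so a non-match does not prove that another version is unaffected.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Only the version named in this CVE entry; other versions are not assessed here.
LISTED_AFFECTED = {"0.5.10.post1"}


def is_listed_affected(installed: str) -> bool:
    """Return True if the given version string is the one listed in this entry."""
    return installed.strip() in LISTED_AFFECTED


def installed_sglang_version() -> Optional[str]:
    """Return the installed sglang version, or None if it is not installed.

    Assumes the distribution is registered under the name "sglang".
    """
    try:
        return version("sglang")
    except PackageNotFoundError:
        return None
```

Calling `is_listed_affected(installed_sglang_version() or "")` in the server's environment gives a quick yes/no for the listed version.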
CWE ID: CWE-617 (Reachable Assertion)
Description: The product contains an assert() or similar statement that can be triggered by an attacker, which leads to an application exit or other behavior that is more severe than necessary.
Compliance Impact

The vulnerability causes a scheduler crash in the SGLang server leading to a complete loss of availability of the Inference HTTP Endpoint service.

This loss of availability could impact compliance with standards and regulations that require continuous availability and reliability of services, such as HIPAA which mandates availability of electronic protected health information, or GDPR which requires data processing systems to be resilient and available.

However, there is no direct information provided about data confidentiality or integrity breaches, so the impact on compliance is primarily related to availability requirements.

Executive Summary

This vulnerability affects SGLang version 0.5.10.post1. It resides in the file python/sglang/srt/lora/lora_manager.py, part of the Inference HTTP Endpoint component. The CVE description does not name the affected function, although the detection guidance in this entry points to an assertion in `lora_manager.fetch_new_loras()`.

Manipulating the `lora_path` argument triggers a reachable assertion (CWE-617): a remote request can cause an assert statement to fail, which crashes the scheduler instead of rejecting the request gracefully.

The attack can be launched remotely, but its complexity is rated high and exploitation is considered difficult. The exploit has been disclosed publicly.

A fix has been proposed via a pull request but has not yet been accepted.

Impact Analysis

Exploitation of this vulnerability can cause the affected software to reach an assertion failure, potentially leading to a crash or denial of service.

Since the attack can be launched remotely, it could disrupt the availability of the Inference HTTP Endpoint component.

However, exploitation is considered difficult because of the high attack complexity.

Detection Guidance

This vulnerability can be detected by observing the behavior of the SGLang server when handling concurrent requests involving LoRA adapters. Specifically, if the server crashes or becomes permanently unresponsive, it may be due to this vulnerability.

To reproduce or detect the issue, you can start the SGLang server with the parameter `--max-loras-per-batch N` set, then send concurrent requests that include N LoRA adapters plus at least one base-model request in the same scheduling round. If the server crashes with an assertion error in `lora_manager.fetch_new_loras()` and a SIGQUIT signal, this indicates the vulnerability is present.

Monitoring server logs for assertion failures and SIGQUIT signals related to `fetch_new_loras` can also help detect the issue.

  • Start the SGLang server with a per-batch LoRA cap, for example `python -m sglang.launch_server ... --max-loras-per-batch N` (model and LoRA arguments omitted)
  • Send concurrent requests exceeding the LoRA batch limit (N LoRA adapters + 1 base-model request)
  • Check server logs for assertion failures and SIGQUIT signals indicating scheduler crash
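
The log check described above can be sketched as a small Python scanner. The marker strings (`fetch_new_loras`, `SIGQUIT`, and any text containing "assert") come from the guidance above, but the exact wording and layout of SGLang's log output is an assumption, and the function names `scan_log` and `looks_like_scheduler_crash` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Markers from the detection guidance; the real log wording may differ (assumption).
ASSERT_RE = re.compile(r"assert", re.IGNORECASE)  # also matches "AssertionError"
FUNC_RE = re.compile(r"fetch_new_loras")
SIGQUIT_RE = re.compile(r"SIGQUIT")


def scan_log(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ASSERT_RE.search(line) or FUNC_RE.search(line) or SIGQUIT_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def looks_like_scheduler_crash(lines: Iterable[str]) -> bool:
    """Return True only if all three markers appear somewhere in the log."""
    text = list(lines)
    saw_assert = any(ASSERT_RE.search(line) for line in text)
    saw_func = any(FUNC_RE.search(line) for line in text)
    saw_sigquit = any(SIGQUIT_RE.search(line) for line in text)
    return saw_assert and saw_func and saw_sigquit
```

For example, passing an open log file (`with open(path) as fh: looks_like_scheduler_crash(fh)`) flags a log that contains all three markers. A positive result is a hint to inspect the surrounding traceback, not proof of this specific issue.
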
Mitigation Strategies

Until a fix is merged and released, avoid the condition that triggers the scheduler crash: keep the number of LoRA adapters used by concurrent requests at or below the configured `--max-loras-per-batch` limit.

Because the reproduction described under Detection Guidance combines N LoRA adapters with a base-model request, it is prudent to count base-model traffic that can land in the same scheduling round against that limit as well, so the scheduler is never asked to admit more LoRAs than allowed.

Applying the fix from the pending pull request, which properly enforces the max_loras_per_batch limit by counting all admitted LoRA requests, is the definitive solution once it is merged and released.

  • Configure the server to use a conservative `--max-loras-per-batch` value and avoid sending concurrent requests that exceed this limit.
  • Monitor server stability and logs for signs of scheduler crashes.
  • Apply the patch from the pull request (PR #25078) once it is available in your SGLang version.
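
One way to keep concurrent requests under the limit is a gate in the client or proxy that sits in front of the server. The following Python sketch is illustrative only: the class `LoraSlotGate` is hypothetical, it is not the fix from the pull request, and it assumes that a base-model request (represented as `None`) occupies one slot, as the reproduction steps suggest. It helps only if all traffic to the server passes through the same gate.

```python
import threading
from collections import Counter
from typing import Optional


class LoraSlotGate:
    """Limit how many distinct adapters have requests in flight at once.

    Assumption: the base model (adapter=None) counts as one slot.
    """

    def __init__(self, max_loras_per_batch: int) -> None:
        if max_loras_per_batch < 1:
            raise ValueError("max_loras_per_batch must be at least 1")
        self._limit = max_loras_per_batch
        self._in_flight: Counter = Counter()  # adapter -> active request count
        self._cond = threading.Condition()

    def _fits(self, adapter: Optional[str]) -> bool:
        # An adapter already in flight reuses its slot; a new one needs a free slot.
        return adapter in self._in_flight or len(self._in_flight) < self._limit

    def try_acquire(self, adapter: Optional[str]) -> bool:
        """Reserve a slot without blocking; return False if none is free."""
        with self._cond:
            if not self._fits(adapter):
                return False
            self._in_flight[adapter] += 1
            return True

    def acquire(self, adapter: Optional[str]) -> None:
        """Block until the request can be sent without exceeding the limit."""
        with self._cond:
            self._cond.wait_for(lambda: self._fits(adapter))
            self._in_flight[adapter] += 1

    def release(self, adapter: Optional[str]) -> None:
        """Call once the response for a request has been received."""
        with self._cond:
            self._in_flight[adapter] -= 1
            if self._in_flight[adapter] <= 0:
                del self._in_flight[adapter]
            self._cond.notify_all()
```

A caller would wrap each request in `gate.acquire(adapter)` and `gate.release(adapter)` (the latter in a `finally` block), passing the same value it sends as `lora_path`. Choosing a gate limit below the server's `--max-loras-per-batch` leaves extra headroom.
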