CVE-2026-22691
Denial of Service via Malformed startxref in pypdf
Publication date: 2026-01-10
Last updated on: 2026-01-10
Assigner: GitHub, Inc.
Description
CVSS Scores
EPSS Scores
Meta Information
Affected Vendors & Products
| Vendor | Product | Version / Range |
|---|---|---|
| py-pdf | pypdf | before 6.6.0 (6.6.0 itself is not affected) |
Helpful Resources
Exploitability
| CWE ID | Description |
|---|---|
| CWE-400 | The product does not properly control the allocation and maintenance of a limited resource. |
| CWE-1333 | The product uses a regular expression with an inefficient, possibly exponential worst-case computational complexity that consumes excessive CPU cycles. |
AI Powered Q&A
What immediate steps should I take to mitigate this vulnerability?
Immediate mitigation steps include upgrading the pypdf library to version 6.6.0 or later, where this vulnerability has been fixed. If upgrading is not immediately possible, enable strict mode by initializing PdfReader and PdfWriter with the parameter strict=True, which avoids the inefficient processing path that leads to long runtimes. Additionally, avoid processing untrusted or malformed PDF files in non-strict mode until the patch is applied. [1]
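The two mitigations above can be sketched in a few lines of Python. This is a minimal illustration, not an official check: the helper names (`parse_version`, `is_patched`, `installed_pypdf_version`, `open_untrusted`) are hypothetical, and the version parser is deliberately simple and ignores pre-release suffixes. The only pypdf call used is `PdfReader(path, strict=True)`.

```python
import re
from importlib import metadata

# First release containing the fix, per the advisory.
FIXED_VERSION = (6, 6, 0)


def parse_version(text):
    """Return the leading numeric components, padded to three parts.

    Simplified on purpose: suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    return parts + (0,) * (3 - len(parts))


def is_patched(version_text):
    """True if the given pypdf version is 6.6.0 or later."""
    return parse_version(version_text) >= FIXED_VERSION


def installed_pypdf_version():
    """Return the installed pypdf version string, or None if not installed."""
    try:
        return metadata.version("pypdf")
    except metadata.PackageNotFoundError:
        return None


def open_untrusted(path):
    """Open a PDF in strict mode, the workaround named in the advisory."""
    from pypdf import PdfReader  # imported lazily so the check works without pypdf

    return PdfReader(path, strict=True)


if __name__ == "__main__":
    version = installed_pypdf_version()
    if version is None:
        print("pypdf is not installed")
    elif is_patched(version):
        print(f"pypdf {version} includes the fix")
    else:
        print(f"pypdf {version} is affected; upgrade or use strict=True")
```

Note that strict mode rejects some malformed files that non-strict mode would try to repair, so callers should expect parsing errors for such inputs.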
Can you explain this vulnerability to me?
This vulnerability in the pypdf Python library (prior to version 6.6.0) involves inefficient handling of malformed PDF files with invalid startxref entries when using non-strict reading mode. Specifically, PDFs containing large amounts of whitespace cause the rebuilding of the cross-reference table to take excessively long due to a regex-based search with potentially exponential worst-case complexity. An attacker can craft such malicious PDFs to cause high CPU usage and degraded performance. The issue has been fixed in version 6.6.0 by replacing the regex search with a more efficient manual search and adding other robustness improvements. [1, 3, 4]
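To illustrate what a "manual search" can look like, the sketch below scans a byte buffer for object headers of the form `N G obj` without using a regular expression. It is not pypdf's actual code: the function name `find_object_headers` and its return shape are assumptions made for this example. The idea is that each `obj` keyword is located with `bytes.find`, and the scan then walks backwards over whitespace and digits, so long whitespace runs are traversed once rather than retried from many starting positions.

```python
# Whitespace bytes as defined for PDF syntax: NUL, TAB, LF, FF, CR, SPACE.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def _is_digit(byte):
    return 48 <= byte <= 57  # ASCII '0'..'9'


def find_object_headers(data):
    """Return [(object_number, generation, offset), ...] for 'N G obj' headers.

    Illustrative only; this is not the implementation used by pypdf.
    """
    headers = []
    search_from = 0
    while True:
        keyword = data.find(b"obj", search_from)
        if keyword == -1:
            return headers
        search_from = keyword + 3

        # Walk backwards twice over "<digits><whitespace>":
        # first the generation number, then the object number.
        pos = keyword
        fields = []
        for _ in range(2):
            ws_end = pos
            while pos > 0 and data[pos - 1] in PDF_WHITESPACE:
                pos -= 1
            if pos == ws_end:
                break  # no whitespace here (for example 'endobj')
            digits_end = pos
            while pos > 0 and _is_digit(data[pos - 1]):
                pos -= 1
            if pos == digits_end:
                break  # whitespace not preceded by a number
            fields.append(int(data[pos:digits_end]))

        if len(fields) == 2:
            generation, number = fields
            headers.append((number, generation, pos))
```

Because the backward walk stops at the first byte that is neither whitespace nor a digit, it never crosses a previous `obj` keyword, which keeps the total work roughly proportional to the input size in this sketch.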
How can this vulnerability impact me?
If you use a vulnerable version of pypdf in non-strict mode to process PDF files, an attacker can supply a specially crafted PDF that causes the library to consume excessive CPU resources and take a very long time to process the file. This can lead to degraded application performance, potential denial-of-service conditions, and resource exhaustion on systems handling untrusted PDFs. [1, 4]
How can this vulnerability be detected on my network or system? Can you suggest some commands?
This vulnerability can be detected by monitoring for unusually high CPU usage or long runtimes when pypdf processes PDF files in non-strict reading mode, especially files with malformed or invalid startxref entries and excessive whitespace. To check a suspicious file, parse it with pypdf in non-strict mode and observe the runtime and CPU consumption. For example:

```python
from pypdf import PdfReader

reader = PdfReader('suspicious.pdf', strict=False)
```

If the runtime is excessively long or CPU usage spikes during this operation, the vulnerability may be triggered. As a workaround, enabling strict mode (strict=True) in PdfReader and PdfWriter avoids the inefficient processing path. [1]
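Because a triggering file can keep the parser busy for a long time, it may be safer to run such a check in a child process with a time limit. The following is a rough sketch under that assumption: `run_with_timeout`, `probe_pdf`, `PARSE_SNIPPET`, and the 10-second default are hypothetical choices for this example, not part of pypdf or the advisory.

```python
import subprocess
import sys
import time

# Small script run in the child process; it parses the file in non-strict mode.
PARSE_SNIPPET = (
    "import sys\n"
    "from pypdf import PdfReader\n"
    "reader = PdfReader(sys.argv[1], strict=False)\n"
    "print(len(reader.pages))\n"
)


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (status, elapsed_seconds).

    status is 'ok', 'error' (non-zero exit), or 'timeout'.
    """
    start = time.perf_counter()
    try:
        completed = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return "timeout", time.perf_counter() - start
    status = "ok" if completed.returncode == 0 else "error"
    return status, time.perf_counter() - start


def probe_pdf(path, timeout_s=10.0):
    """Parse one PDF in a child process and report how it went."""
    cmd = [sys.executable, "-c", PARSE_SNIPPET, path]
    return run_with_timeout(cmd, timeout_s)


if __name__ == "__main__" and len(sys.argv) > 1:
    status, elapsed = probe_pdf(sys.argv[1])
    print(f"{sys.argv[1]}: {status} after {elapsed:.2f}s")
```

A `timeout` result on a small file is a hint worth investigating rather than proof of this specific issue, since other malformed inputs or a slow machine can also cause long runtimes.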