Subgraph checkpointer=True causes subgraph to be skipped #3206

shengbo-ma · 2025-01-26T03:21:43Z

Checked other resources

This is a bug, not a usage question. For questions, please use GitHub Discussions.
I added a clear and detailed title that summarizes the issue.
I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code runs AS IS to reproduce the issue.

Example Code

from typing import Literal

from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, StateGraph
from langgraph.types import Command, interrupt
from rich import get_console
from typing_extensions import TypedDict

###############
# Subgraph
###############


class SubGraphState(TypedDict, total=False):
    parent_counter: int
    sub_counter: int


def subgraph_accumulator(state: SubGraphState) -> SubGraphState:
    get_console().print("---subgraph counter node---")
    get_console().print(f"{state = }")
    # ask for human approval
    human_feedback = interrupt("get human feedback")
    print(f"{human_feedback = }")

    # continue counting
    sub_counter = state["sub_counter"] + 1 if "sub_counter" in state else 1
    return {"sub_counter": sub_counter}


sub_graph = (
    StateGraph(SubGraphState)
    .add_node(subgraph_accumulator)
    .add_edge(START, subgraph_accumulator.__name__)
    .add_edge(subgraph_accumulator.__name__, END)
    .compile(
        checkpointer=True,  # BUG: with this set, the subgraph's nodes are never executed again after the first interrupt
    )
)
sub_graph.name = "sub"

###############
# Parent Graph
###############

MAX_ITERATION = 3


class ParentGraphState(TypedDict):
    parent_counter: int


def parent_graph_accumulator(
    state: ParentGraphState,
) -> Command[Literal["sub", "__end__"]]:
    print("---parent counter node---")
    get_console().print(f"{state = }")
    parent_counter = state["parent_counter"] + 1 if "parent_counter" in state else 0

    # goto end when max iteration reaches
    goto = sub_graph.get_name() if parent_counter < MAX_ITERATION else END
    get_console().print(f"going to node {goto}")
    return Command(
        update={
            "parent_counter": parent_counter,
        },
        goto=goto,
    )


parent_agent = (
    StateGraph(ParentGraphState)
    .add_node(parent_graph_accumulator)
    .add_node(sub_graph)
    .add_edge(START, parent_graph_accumulator.__name__)
    .add_edge(sub_graph.get_name(), parent_graph_accumulator.__name__)
    .compile(checkpointer=MemorySaver())
)

# visualize graph
mermaid_graph = parent_agent.get_graph(xray=True).draw_mermaid()
print(mermaid_graph)

###############
# Conversation
###############

config: RunnableConfig = {"configurable": {"thread_id": "42"}, "recursion_limit": MAX_ITERATION+1}

inputs = [
    ParentGraphState(parent_counter=0),
    Command(resume="human feedback 1"),
    Command(resume="human feedback 2"),
]
for input_ in inputs:
    print(f"{input_ = }")
    for event in parent_agent.stream(
        # resume the conversation
        input_,
        config,
        stream_mode="updates",
        subgraphs=True,
    ):
        print("Streaming event ...")
        print(event)

Error Message and Stack Trace (if applicable)

input_ = {'parent_counter': 0}
---parent counter node---
state = {'parent_counter': 0}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 1}})
---subgraph counter node---
state = {'parent_counter': 1}
Streaming event ...
((), {'__interrupt__': (Interrupt(value='get human feedback', resumable=True, ns=['sub', 'subgraph_accumulator:f187d019-da4b-d432-bcd2-cea142aa7e35'], when='during'),)})
input_ = Command(resume='human feedback 1')
---subgraph counter node---
state = {'parent_counter': 1}
human_feedback = 'human feedback 1'
Streaming event ...
(('sub',), {'subgraph_accumulator': {'sub_counter': 1}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})  <------- BUG: should be subgraph execution like (('sub',), {'subgraph_accumulator': {...}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Traceback (most recent call last):
  File "/home/linux/arcgis-ai-assistants/python/arcgis-assistant/.tmp/subgraph_state_lose/loop_subgraph_with_interrupt.py", line 99, in <module>
    for event in parent_agent.stream(
  File "/home/linux/miniconda3/envs/test/lib/python3.11/site-packages/langgraph/pregel/__init__.py", line 1690, in stream
    raise GraphRecursionError(msg)
langgraph.errors.GraphRecursionError: Recursion limit of 4 reached without hitting a stop condition. You can increase the limit by setting the `recursion_limit` config key.
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/GRAPH_RECURSION_LIMIT

Description

I hit this issue while building a multi-agent graph for multi-turn conversations in which a subgraph contains a human-feedback node that interrupts execution and waits for human input.

Here is an example graph that reproduces the issue:

A parent graph has a loop that calls a subgraph until the parent counter reaches a predefined limit.
The subgraph node interrupts and takes human feedback.
The subgraph should remember its state from the previous run (checkpointer=True).

Expected Behavior

The graph should interrupt twice and resume each time with the human input.
The subgraph should persist its state across runs (since checkpointer=True).

Actual Behavior

The first interrupt and resume work as expected.
The second interrupt never happens. After the first resume, the parent graph never executes the subgraph's counter node again. The sub node keeps emitting the same parent counter (=1), so the parent counter never increases and the run ends in a recursion limit error.
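
The loop described here can be modeled in plain Python to show why a stale subgraph result ends in the recursion limit error. This is only a sketch of the behavior visible in the trace, not LangGraph's internals: `simulate_parent_loop`, `sub_returns_stale`, and `step_budget` are hypothetical names, and the step budget merely stands in for `recursion_limit`.

```python
MAX_ITERATION = 3


def simulate_parent_loop(sub_returns_stale: bool, step_budget: int = 10) -> list[int]:
    """Model the parent/sub loop after the first resume.

    Returns the counter values computed by the parent node, or raises
    RuntimeError when the (hypothetical) step budget runs out.
    """
    state = {"parent_counter": 1}  # parent state after the first resume (see trace)
    stale_output = dict(state)  # what the "sub" node keeps emitting in the buggy run
    computed = []
    for _ in range(step_budget):
        # Parent node: increment the counter, then decide where to go.
        counter = state["parent_counter"] + 1
        computed.append(counter)
        if counter >= MAX_ITERATION:
            return computed  # corresponds to goto=END
        state = {"parent_counter": counter}
        # "sub" node: in the buggy run its old output overwrites the parent's update.
        if sub_returns_stale:
            state = dict(stale_output)
    raise RuntimeError("step budget exhausted without reaching END")


# Expected run: the parent's update survives the "sub" step.
expected_run = simulate_parent_loop(sub_returns_stale=False)
print(expected_run)  # [2, 3]

# Buggy run: the counter is reset to 1 every time, so END is never reached.
try:
    simulate_parent_loop(sub_returns_stale=True)
    buggy_run_terminated = True
except RuntimeError:
    buggy_run_terminated = False
print(buggy_run_terminated)  # False
```

In the stale case the parent computes 2 on every pass, which matches the repeated `{'parent_counter': 2}` updates in the trace.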

Observation

Without checkpointer=True, the graph executes as expected: the parent counter increases correctly and the bug does not occur. (In this case, of course, the subgraph state from the previous run is not persisted.)
It seems that setting checkpointer=True on a subgraph and calling interrupt inside it conflict in some way.
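
One way to confirm that the subgraph is being skipped is to scan the streamed events for parent-level `sub` updates that were not preceded by any event from inside the subgraph. The following is a sketch, assuming events have the `(namespace, update)` shape printed in the trace with `subgraphs=True`. `find_skipped_subgraph_runs` is a hypothetical helper, and accepting both `"sub"` and `"sub:<id>"` as namespace entries is an assumption.

```python
def find_skipped_subgraph_runs(events, subgraph_name="sub"):
    """Return indices of parent-level updates from `subgraph_name` that had
    no preceding event emitted from inside the subgraph."""
    skipped = []
    inner_seen = False
    for index, (namespace, update) in enumerate(events):
        if namespace and namespace[0].split(":")[0] == subgraph_name:
            # Event emitted by a node inside the subgraph.
            inner_seen = True
        elif not namespace and subgraph_name in update:
            # Parent-level update reported for the subgraph node.
            if not inner_seen:
                skipped.append(index)
            inner_seen = False  # reset for the next loop iteration
    return skipped


# Events copied from the trace, starting at the first resume.
events = [
    (("sub",), {"subgraph_accumulator": {"sub_counter": 1}}),
    ((), {"sub": {"parent_counter": 1}}),
    ((), {"parent_graph_accumulator": {"parent_counter": 2}}),
    ((), {"sub": {"parent_counter": 1}}),
    ((), {"parent_graph_accumulator": {"parent_counter": 2}}),
    ((), {"sub": {"parent_counter": 1}}),
    ((), {"parent_graph_accumulator": {"parent_counter": 2}}),
]

skipped = find_skipped_subgraph_runs(events)
print(skipped)  # [3, 5]
```

The two flagged indices are the `sub` updates that appear after the first resume without any `subgraph_accumulator` event before them.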

%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
        __start__([<p>__start__</p>]):::first
        parent_graph_accumulator(parent_graph_accumulator)
        sub(sub)
        __end__([<p>__end__</p>]):::last
        __start__ --> parent_graph_accumulator;
        sub --> parent_graph_accumulator;
        parent_graph_accumulator -.-> sub;
        parent_graph_accumulator -.-> __end__;
        classDef default fill:#f2f0ff,line-height:1.2
        classDef first fill-opacity:0
        classDef last fill:#bfb6fc

LangGraph Version
0.2.67

System Info

System Information

OS: Linux
OS Version: #135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024
Python Version: 3.11.11 (main, Dec 11 2024, 16:28:39) [GCC 11.2.0]

Package Information

langchain_core: 0.3.31
langsmith: 0.3.1
langgraph_sdk: 0.1.51

Optional packages not installed

langserve

Other Dependencies

httpx: 0.28.1
jsonpatch: 1.33
langsmith-pyo3: Installed. No version info available.
orjson: 3.10.15
packaging: 24.2
pydantic: 2.10.6
pytest: Installed. No version info available.
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
rich: 13.9.4
tenacity: 9.0.0
typing-extensions: 4.12.2
zstandard: 0.23.0

shengbo-ma · 2025-01-26T03:26:05Z

Since checkpointer=True is not yet officially documented or supported, I am not sure whether I am using it incorrectly.
Could you please take a look? @vbarda @eyurtsev
I appreciate your efforts.
