Subgraph checkpointer=True causes subgraph to be skipped #3206

Open
4 tasks done
shengbo-ma opened this issue Jan 26, 2025 · 1 comment

Comments


shengbo-ma commented Jan 26, 2025

Checked other resources

  • This is a bug, not a usage question. For questions, please use GitHub Discussions.
  • I added a clear and detailed title that summarizes the issue.
  • I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
  • I included a self-contained, minimal example that demonstrates the issue INCLUDING all the relevant imports. The code runs AS IS to reproduce the issue.

Example Code

from typing import Literal

from langchain_core.runnables import RunnableConfig
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, START, StateGraph
from langgraph.types import Command, interrupt
from rich import get_console
from typing_extensions import TypedDict

###############
# Subgraph
###############


class SubGraphState(TypedDict, total=False):
    parent_counter: int
    sub_counter: int


def subgraph_accumulator(state: SubGraphState) -> SubGraphState:
    get_console().print("---subgraph counter node---")
    get_console().print(f"{state = }")
    # ask for human approval
    human_feedback = interrupt("get human feedback")
    print(f"{human_feedback = }")

    # continue counting
    sub_counter = state["sub_counter"] + 1 if "sub_counter" in state else 1
    return {"sub_counter": sub_counter}


sub_graph = (
    StateGraph(SubGraphState)
    .add_node(subgraph_accumulator)
    .add_edge(START, subgraph_accumulator.__name__)
    .add_edge(subgraph_accumulator.__name__, END)
    .compile(
        checkpointer=True,  # BUG: with this flag, the subgraph nodes are never executed again after the first interrupt
    )
)
sub_graph.name = "sub"

###############
# Parent Graph
###############

MAX_ITERATION = 3


class ParentGraphState(TypedDict):
    parent_counter: int


def parent_graph_accumulator(
    state: ParentGraphState,
) -> Command[Literal["sub", "__end__"]]:
    print("---parent counter node---")
    get_console().print(f"{state = }")
    parent_counter = state["parent_counter"] + 1 if "parent_counter" in state else 0

    # go to END once the maximum number of iterations is reached
    goto = sub_graph.get_name() if parent_counter < MAX_ITERATION else END
    get_console().print(f"going to node {goto}")
    return Command(
        update={
            "parent_counter": parent_counter,
        },
        goto=goto,
    )


parent_agent = (
    StateGraph(ParentGraphState)
    .add_node(parent_graph_accumulator)
    .add_node(sub_graph)
    .add_edge(START, parent_graph_accumulator.__name__)
    .add_edge(sub_graph.get_name(), parent_graph_accumulator.__name__)
    .compile(checkpointer=MemorySaver())
)

# visualize graph
mermaid_graph = parent_agent.get_graph(xray=True).draw_mermaid()
print(mermaid_graph)

###############
# Conversation
###############

config: RunnableConfig = {
    "configurable": {"thread_id": "42"},
    "recursion_limit": MAX_ITERATION + 1,
}

inputs = [
    ParentGraphState(parent_counter=0),
    Command(resume="human feedback 1"),
    Command(resume="human feedback 2"),
]
for input_ in inputs:
    print(f"{input_ = }")
    for event in parent_agent.stream(
        # initial state on the first pass, then Command(resume=...) to continue
        input_,
        config,
        stream_mode="updates",
        subgraphs=True,
    ):
        print("Streaming event ...")
        print(event)

Error Message and Stack Trace (if applicable)

input_ = {'parent_counter': 0}
---parent counter node---
state = {'parent_counter': 0}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 1}})
---subgraph counter node---
state = {'parent_counter': 1}
Streaming event ...
((), {'__interrupt__': (Interrupt(value='get human feedback', resumable=True, ns=['sub', 'subgraph_accumulator:f187d019-da4b-d432-bcd2-cea142aa7e35'], when='during'),)})
input_ = Command(resume='human feedback 1')
---subgraph counter node---
state = {'parent_counter': 1}
human_feedback = 'human feedback 1'
Streaming event ...
(('sub',), {'subgraph_accumulator': {'sub_counter': 1}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})  <------- BUG: should be subgraph execution like (('sub',), {'subgraph_accumulator': {...}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Streaming event ...
((), {'sub': {'parent_counter': 1}})
---parent counter node---
state = {'parent_counter': 1}
going to node sub
Streaming event ...
((), {'parent_graph_accumulator': {'parent_counter': 2}})
Traceback (most recent call last):
  File "/home/linux/arcgis-ai-assistants/python/arcgis-assistant/.tmp/subgraph_state_lose/loop_subgraph_with_interrupt.py", line 99, in <module>
    for event in parent_agent.stream(
  File "/home/linux/miniconda3/envs/test/lib/python3.11/site-packages/langgraph/pregel/__init__.py", line 1690, in stream
    raise GraphRecursionError(msg)
langgraph.errors.GraphRecursionError: Recursion limit of 4 reached without hitting a stop condition. You can increase the limit by setting the `recursion_limit` config key.
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/GRAPH_RECURSION_LIMIT

Description

I encountered an issue when building a multi-agent graph for multi-turn conversations in which a subgraph contains a human-feedback node that interrupts execution and waits for human input.

Here is an example graph that reproduces the issue:

  • A parent graph runs a loop, calling a subgraph until the parent counter reaches a predefined limit.
  • The subgraph node interrupts and takes human feedback.
  • The subgraph should remember its state from the previous run (checkpointer=True).

Expected Behavior

  • The graph should interrupt twice and resume with the human input each time.
  • The subgraph should persist its state on each run (since checkpointer=True).
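
To make the expectation checkable, here is a minimal sketch that tallies the streamed events. It assumes the `(namespace, update)` tuple shape printed with `stream_mode="updates"` and `subgraphs=True` in the log above. `summarize_events` is a hypothetical helper, not part of LangGraph, and the sample list uses a plain string in place of the real `Interrupt` object.

```python
from typing import Any, Dict, Iterable, Tuple

Event = Tuple[Tuple[str, ...], Dict[str, Any]]


def summarize_events(events: Iterable[Event]) -> Dict[str, int]:
    """Count interrupts and updates emitted from inside a subgraph."""
    interrupts = 0
    subgraph_updates = 0
    for namespace, update in events:
        if "__interrupt__" in update:
            interrupts += 1
        elif namespace:
            # A non-empty namespace such as ("sub",) means a subgraph node ran.
            subgraph_updates += 1
    return {"interrupts": interrupts, "subgraph_updates": subgraph_updates}


# Events modelled on the log in this issue (interrupt payload simplified).
observed = [
    ((), {"parent_graph_accumulator": {"parent_counter": 1}}),
    ((), {"__interrupt__": ("get human feedback",)}),
    (("sub",), {"subgraph_accumulator": {"sub_counter": 1}}),
    ((), {"sub": {"parent_counter": 1}}),
    ((), {"parent_graph_accumulator": {"parent_counter": 2}}),
    ((), {"sub": {"parent_counter": 1}}),
]
summary = summarize_events(observed)
print(summary)
```

With the expected behavior, both counts would be 2. The events from the log give 1 for each.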

Actual Behavior

  • The first interrupt and resume work as expected.
  • The second interrupt never happens. After the first resume, the parent graph never executes the subgraph counter node again. The subgraph node keeps emitting the same parent counter (=1), so the parent counter never increases and the run ends in a recursion limit error.
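
The arithmetic behind the recursion limit error can be sketched with a toy model. This is plain Python, not LangGraph code. It assumes, based only on the log above, that the skipped subgraph re-emits its old output `{'parent_counter': 1}` and that this value overwrites the parent's increment. `run_loop` and its parameters are hypothetical names.

```python
MAX_ITERATION = 3  # same limit as in the example code


def run_loop(stale_sub_output=None, step_limit=10):
    """Toy model of the parent/sub loop; returns (final_counter, reached_end)."""
    counter = 1  # parent counter at the first interrupt in the log
    for _ in range(step_limit):
        counter += 1  # the parent node increments the counter
        if counter >= MAX_ITERATION:
            return counter, True  # the parent routes to END
        if stale_sub_output is not None:
            # Assumed failure mode: the sub node re-emits its old output.
            counter = stale_sub_output
    return counter, False  # step limit exhausted, analogous to a recursion limit


print(run_loop(stale_sub_output=1))  # counter is reset to 1 every round
print(run_loop())                    # counter advances and the loop ends
```

In the stale case the counter alternates between 2 and 1 and never reaches 3, so only the step limit stops the loop.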

Observation
If checkpointer=True is removed, the graph executes as expected, i.e. the parent counter increases correctly and the bug does not appear. (In this case, of course, the subgraph state from the previous run is not persisted.)
It seems that setting checkpointer=True on a subgraph and calling interrupt inside it conflict in some way.

%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
        __start__([<p>__start__</p>]):::first
        parent_graph_accumulator(parent_graph_accumulator)
        sub(sub)
        __end__([<p>__end__</p>]):::last
        __start__ --> parent_graph_accumulator;
        sub --> parent_graph_accumulator;
        parent_graph_accumulator -.-> sub;
        parent_graph_accumulator -.-> __end__;
        classDef default fill:#f2f0ff,line-height:1.2
        classDef first fill-opacity:0
        classDef last fill:#bfb6fc

LangGraph Version
0.2.67

System Info

System Information

OS: Linux
OS Version: #135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024
Python Version: 3.11.11 (main, Dec 11 2024, 16:28:39) [GCC 11.2.0]

Package Information

langchain_core: 0.3.31
langsmith: 0.3.1
langgraph_sdk: 0.1.51

Optional packages not installed

langserve

Other Dependencies

httpx: 0.28.1
jsonpatch: 1.33
langsmith-pyo3: Installed. No version info available.
orjson: 3.10.15
packaging: 24.2
pydantic: 2.10.6
pytest: Installed. No version info available.
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
rich: 13.9.4
tenacity: 9.0.0
typing-extensions: 4.12.2
zstandard: 0.23.0


shengbo-ma commented Jan 26, 2025

Since checkpointer=True is not yet officially documented or supported, I am not sure whether I am using it incorrectly.
Could you please take a look? @vbarda @eyurtsev
I appreciate your efforts.
