
Multi-agent application documentation #541

Open · wants to merge 4 commits into base: main

Conversation

@samuelcolvin (Member) commented Dec 24, 2024

fix #120
fix #273
fix #300

Here I've added an example of agent delegation as requested by @Luca-Blight in #120.

There are roughly four levels of complexity when building applications with PydanticAI:

  1. Single agent workflows — what most of this documentation covers
  2. Agent delegation — agents using another agent via tools (documented in this PR)
  3. Programmatic agent hand-off — one agent runs, then application code calls another agent (documented in this PR)
  4. Graph based control flow — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents. Work to add graph support is ongoing in Graph Support #528 and pydantic-ai-graph - simplify public generics #539

cloudflare-workers-and-pages bot commented Dec 24, 2024

Deploying pydantic-ai with Cloudflare Pages

Latest commit: 3ca04f1
Status: ✅ Deploy successful!
Preview URL: https://d09cd231.pydantic-ai.pages.dev
Branch Preview URL: https://agent-delegation.pydantic-ai.pages.dev

@hyperlint-ai bot (Contributor) left a comment

5 files reviewed, 1 total issue(s) found.

The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:

  • [Dd]ataclass

docs/multi-agent-applications.md (outdated, resolved)
@hyperlint-ai bot (Contributor) left a comment

The style guide flagged several spelling errors that seemed like false positives. We skipped posting inline suggestions for the following words:

  • system_prompt

```python
    req_destination='ANC',
    req_date=datetime.date(2025, 1, 10),
)
message_history: list[ModelMessage] | None = None
```
@dmontagu (Contributor) commented Dec 24, 2024
It feels a bit weird to me to include this considering it's not used anywhere, as far as I can tell?

@dmontagu (Contributor) commented Dec 24, 2024
Oh I see, it gets passed in the loop. I think it's worth adding a one-line comment above the `while True:` explaining the purpose of that loop. If you read farther you can tell, but I found myself stuck trying to understand before I had gotten far enough.

```python
        await buy_tickets(flight, seat)
        break
    else:
        result.set_result_tool_return('Please suggest another flight')
```
@dmontagu (Contributor) commented Dec 24, 2024
Is this `set_result_tool_return` method documented somewhere? I didn't see documentation other than in the docstring (though maybe I was looking in the wrong place), and that wasn't enough for me to understand the purpose of this `set_result_tool_return` API. I feel like it might be worth explaining somewhere in the docs (doesn't need to be in this PR); it could be made more concrete by saying, in particular, whether it's updating one of the messages in the history or something.


Yes, agreed

@samuelcolvin (Member, Author) commented

is the API wrong, or just not documented?

Contributor commented

If I understand what is happening correctly, I feel like this is a niche enough scenario to not merit a separate API. In particular, it feels very much like it's exposing implementation details to allow you to set the content of the message that was parsed into the result type. Or at least, it feels to me like it adds an API that is only really usable if you fully understand the implementation details (I mean I already feel like I struggle to understand what it does, so I imagine people less familiar with the library will struggle more.)

You could imagine having a validator for the result tool that guarantees something, which could then be changed by this API, which feels unfortunate.

I feel like it makes more sense and generalizes better to, instead of modifying the result message, just add a new message to the message history, i.e., after the line `message_history = result.all_messages()`, adding `message_history.append(ModelRequest(parts=[UserPromptPart(content='Please suggest another flight')]))`. (I don't believe we force the requests and responses to be paired, so I think this should be okay?)

If you have a problem with having two request messages rather than one response and one request, I still think it makes more sense to make it so the analogous API modifies the next request, rather than modifying the previous response. (E.g., in the next call to run, we could modify the latest item if it is a request, so appending a new ModelRequest would not result in two consecutive requests. I don't think that's necessary though, or even if something is I'm not necessarily convinced it's the best solution.)
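
For concreteness, a minimal sketch of that alternative, loosely based on the flight-search example in this PR; `flight_search_agent`, `Failed`, and the prompt text are stand-ins, not the PR's exact code:

```python
from pydantic_ai.messages import ModelMessage, ModelRequest, UserPromptPart

# flight_search_agent and Failed as defined in the PR's example (stand-ins here)
message_history: list[ModelMessage] | None = None
while True:
    result = flight_search_agent.run_sync(
        'Find me a flight', message_history=message_history
    )
    if not isinstance(result.data, Failed):
        break  # the model found a satisfactory flight
    # Instead of result.set_result_tool_return(...), append a fresh
    # user request to the history before the next run:
    message_history = result.all_messages()
    message_history.append(
        ModelRequest(parts=[UserPromptPart(content='Please suggest another flight')])
    )
```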


There are roughly four levels of complexity when building applications with PydanticAI:

1. Single agent workflows — what most of this documentation covers

Suggested change
1. Single agent workflows — what most of this documentation covers
1. Single agent workflows — what most of the `pydantic_ai` documentation covers

Saying "this documentation" sounds to me like you mean this file, but I think you mean every file except this file, right?

1. Single agent workflows — what most of this documentation covers
2. [Agent delegation](#agent-delegation) — agents using another agent via tools
3. [Programmatic agent hand-off](#programmatic-agent-hand-off) — one agent runs, then application code calls another agent
4. [Graph based control flow](#pydanticai-graphs) — for the most complex cases, graph and a state machine can be used to control the execution of multiple agents
Suggested change
4. [Graph based control flow](#pydanticai-graphs) — for the most complex cases, graph and a state machine can be used to control the execution of multiple agents
4. [Graph based control flow](#pydanticai-graphs) — for the most complex cases, a graph-based state machine can be used to control the execution of multiple agents


## Agent Delegation

The agent delegates work to another agent, but then takes back control when that agent finishes.
Suggested change
The agent delegates work to another agent, but then takes back control when that agent finishes.
"Agent delegation" refers to the scenario where an agent delegates work to another agent, but then takes back control when that agent finishes.


Since agents are stateless and designed to be global, you do not need to include the agent itself in agent [dependencies](dependencies.md).

When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of delegate agent (the agent called from within a tool) run so usage within that run counts towards the total usage of a parent agent run.
Suggested change
When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of delegate agent (the agent called from within a tool) run so usage within that run counts towards the total usage of a parent agent run.
When doing so, you'll generally want to pass [`ctx.usage`][pydantic_ai.RunContext.usage] to the [`usage`][pydantic_ai.Agent.run] keyword argument of the delegate agent (the agent called from within a tool) run so usage within that run counts towards the total usage of the parent agent run.


!!! Multiple models
Agent delegation doesn't need to use the same model for each agent. If you choose to use different models within a run, calculating the monetary cost from the final [`result.usage()`][pydantic_ai.result.RunResult.usage] of the run will not be possible, but you can still use [`UsageLimits`][pydantic_ai.usage.UsageLimits] to avoid unexpected costs.
Contributor commented

Makes me feel like we should have a way to tally usage on a per-model basis. Of course that's well outside the scope of this PR.
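
Until then, a rough sketch of tallying usage per model by hand; this assumes `Usage` supports `+`, and uses the `joke_agent` name from this PR's example with an illustrative model string:

```python
from collections import defaultdict

from pydantic_ai.usage import Usage

usage_by_model: dict[str, Usage] = defaultdict(Usage)

# after each run, credit the usage to the model that served it
result = joke_agent.run_sync('Tell me a joke.')
usage_by_model['openai:gpt-4o'] = usage_by_model['openai:gpt-4o'] + result.usage()
```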

Comment on lines +63 to +71
```mermaid
graph TD
START --> joke_agent
joke_agent --> joke_factory["joke_factory (tool)"]
joke_factory --> delegate_agent
delegate_agent --> joke_factory
joke_factory --> joke_agent
joke_agent --> END
```
@dmontagu (Contributor) commented Dec 24, 2024

I feel the names used in this scenario are a bit confusing; in particular, using the terms `joke_agent` and `delegate_agent` is awkward since ultimately the delegate agent is the one producing the jokes, and the `joke_agent` just selects one.

Maybe it would be better to rename `joke_agent` to `joke_selector_agent`, and `delegate_agent` to `joke_generator_agent`? And then you can just explain that, conceptually, the `joke_selector_agent` is "delegating" to the `joke_generator_agent` by way of the `joke_factory` tool.
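
A minimal sketch of the example with those names (model strings and prompts are placeholders, not the PR's exact code):

```python
from pydantic_ai import Agent, RunContext

joke_selector_agent = Agent(
    'openai:gpt-4o',
    system_prompt='Use the `joke_factory` tool to generate jokes, then pick the best one.',
)
joke_generator_agent = Agent('openai:gpt-4o', result_type=list[str])

@joke_selector_agent.tool
async def joke_factory(ctx: RunContext[None], count: int) -> list[str]:
    # Conceptually, the selector "delegates" to the generator here, passing
    # ctx.usage so the delegate run counts toward the parent run's usage.
    result = await joke_generator_agent.run(
        f'Please generate {count} jokes.', usage=ctx.usage
    )
    return result.data
```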


### Agent Delegation and dependencies.

The delegate agent needs to either have the same [dependencies](dependencies.md) as the calling agent, or dependencies which are a subset of the calling agent's dependencies.
Contributor commented
This isn't, strictly speaking, true; it just needs to be the case that the required dependencies can be instantiated at the call site of the delegate agent. In particular, the delegate agent can make use of global dependencies, or functions that can be called in any context.

For example, if you have a globally-accessible `pydantic_settings` settings object that reads from the environment, you could use that to get a connection string that you then use to instantiate a database connection, which is provided to the delegate agent via its deps, even though it didn't come from the parent agent's deps. And this feels like a somewhat realistic scenario to me; of course users could pass a connection pool as a dep to the parent agent, but it's not necessary and would probably be skipped, at least during prototyping.

Ultimately this isn't a big deal, but my point is just that I think if we don't make this point in some way, it may leave users confused about the mental model they should have about how they can build dependencies. In particular, the deps from the parent agent are not passed directly to the delegate agent through some complex and opaque mechanism (which might be my intuition from this sentence) — they are just instantiated in the call to the delegate agent.
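
A sketch of that scenario; the settings access, agents, and tool here are hypothetical:

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext

@dataclass
class DbDeps:
    conn_str: str

parent_agent = Agent('openai:gpt-4o')  # the parent has no database dep at all
db_agent = Agent('openai:gpt-4o', deps_type=DbDeps)

@parent_agent.tool
async def query_db(ctx: RunContext[None], question: str) -> str:
    # The delegate's deps are instantiated right here at the call site,
    # e.g. from a globally-accessible settings object; they are not passed
    # through from the parent agent's deps by some opaque mechanism.
    deps = DbDeps(conn_str='postgresql://localhost/app')  # e.g. settings.db_url
    result = await db_agent.run(question, deps=deps, usage=ctx.usage)
    return result.data
```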


_(This example is complete, it can be run "as is")_

The control flow for this example shows how even a fairly simple agent delegation leads to a fairly complex flow:
Suggested change
The control flow for this example shows how even a fairly simple agent delegation leads to a fairly complex flow:
This example shows how even a fairly simple agent delegation can lead to a complex control flow:

(wanted to remove the repetition of "fairly" but also could simplify the sentence in other ways)


## Programmatic agent hand-off

Multiple agents are called in succession, with application code and/or human in the loop responsible for deciding which agent to call next.
Suggested change
Multiple agents are called in succession, with application code and/or human in the loop responsible for deciding which agent to call next.
"Programmatic agent hand-off" refers to the scenario where multiple agents are called in succession, with application code and/or a human in the loop responsible for deciding which agent to call next.

#> Seat preference: row=1 seat='A'
```

1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structure results](results.md#structured-result-validation). We a union as the result type so the model can communicate that it's unable to find a satisfactory choice, internally each member of the union will be registered as a separate tool.
Suggested change
1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structure results](results.md#structured-result-validation). We a union as the result type so the model can communicate that it's unable to find a satisfactory choice, internally each member of the union will be registered as a separate tool.
1. Define the first agent, which finds a flight. We use an explicit type annotation until PEP 747 lands, see [structured results](results.md#structured-result-validation). We use a union as the result type so the model can communicate if it's unable to find a satisfactory choice; internally, each member of the union will be registered as a separate tool.
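
For reference, a sketch of the explicit-annotation-plus-union pattern this note describes (class names and model string are illustrative):

```python
from typing import Union

from pydantic import BaseModel

from pydantic_ai import Agent

class FlightDetails(BaseModel):
    flight_number: str

class Failed(BaseModel):
    """Unable to find a satisfactory choice."""

# Explicit annotation is needed until PEP 747 lands; each member of the
# union is registered as a separate result tool, so the model can
# "return" Failed when no flight is satisfactory.
flight_search_agent: Agent[None, Union[FlightDetails, Failed]] = Agent(
    'openai:gpt-4o',
    result_type=Union[FlightDetails, Failed],  # type: ignore
)
```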

2. Define a tool on the agent to find a flight, in this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.
Suggested change
2. Define a tool on the agent to find a flight, in this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.
2. Define a tool on the agent to find a flight. In this simple case we could dispense with the tool and just define the agent to return structured data, then search for a flight, but in more complex scenarios the tool would be necessary.

3. Define usage limits for the entire app.
4. Define a function to find a flight, which ask the user for their preferences and then calls the agent to find a flight.
Suggested change
4. Define a function to find a flight, which ask the user for their preferences and then calls the agent to find a flight.
4. Define a function to find a flight, which asks the user for their preferences and then calls the agent to find a flight.

5. As with `flight_search_agent` above, we use an explicit type annotation to define the agent.
6. Define a function to find the user's seat preference, which asks the user for their seat preference and then calls the agent to extract the seat preference.
7. Now we've put our logic for running each agent into separate functions, our main app becomes very simple.
Suggested change
7. Now we've put our logic for running each agent into separate functions, our main app becomes very simple.
7. Now that we've put our logic for running each agent into separate functions, our main app becomes very simple.
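
To make note 7 concrete, a sketch of what that simple main app might look like; the function names follow the descriptions in notes 4-6 and are not the PR's exact code:

```python
from pydantic_ai.usage import Usage

async def main():
    usage = Usage()  # accumulated across every agent run in the app
    flight = await find_flight(usage)  # note 4: asks the user, runs flight_search_agent
    if flight is not None:
        seat = await find_seat(usage)  # note 6: asks the user, runs seat_preference_agent
        # plain application code, not a model, decides what happens next
        await buy_tickets(flight, seat)
```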

```python
_usage: Usage

def usage(self) -> Usage:
    """Return the usage of the whole run."""
    return self._usage

def set_result_tool_return(self, return_content: str) -> None:
```
Contributor commented
Oh, I see this was added in this PR. Still, I don't really understand how it's meant to be used.

@HamzaFarhan commented

In these patterns, we assume that each subsequent agent will receive just the right amount of information to complete its task. Can we have a way of passing all of the context so far and letting the agent use whatever it wants from it?
I'm guessing there are 2 possible approaches:

  1. Adding messages/key+values using dependency injection.
  2. Passing the messages throughout the "team" of agents and accumulating them, especially after Configuration and parameters for `all_messages()` and `new_messages()` #496

@HamzaFarhan commented

This would also be useful when an agent returns its final response to the main/supervisor/delegator agent and then the main agent can know what went down.
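
A sketch of approach 2 under the current API (agent names and prompts are hypothetical): hand each run's full history to the next agent so it can use whatever it wants from it.

```python
from pydantic_ai import Agent

researcher = Agent('openai:gpt-4o', system_prompt='Research the topic.')
writer = Agent('openai:gpt-4o')

r1 = researcher.run_sync('Find three facts about agent delegation.')
# The writer receives the entire accumulated context of the first run,
# including the researcher's final response, so it "knows what went down".
r2 = writer.run_sync('Summarize the findings above.', message_history=r1.all_messages())
print(r2.data)
```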

@jacobweiss2305 commented

Love it, thank you for this!
