Merge branch 'main' into async_run
whimo committed Jun 3, 2024
2 parents 73356b2 + 05c3a32 commit 7a03448
Showing 13 changed files with 223 additions and 131 deletions.
22 changes: 14 additions & 8 deletions .github/workflows/build.yml
@@ -12,8 +12,8 @@ jobs:
build:
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12"]
os: [ubuntu-latest, macos-latest, windows-latest]
python-version: ["3.12"]
os: [ubuntu-latest]
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
@@ -40,12 +40,18 @@ jobs:
path: .venv
key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/poetry.lock') }}

- name: Install dependencies
if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
run: poetry install --no-interaction --no-root

- name: Install project
run: poetry install --no-interaction
run: poetry install --no-interaction --with dev --all-extras

- name: Run tests
- name: Run build
run: poetry build

- name: Install pandoc
working-directory: ./docs/source
run: poetry run python install_pandoc.py

- name: Run docs build
env:
TZ: UTC
working-directory: ./docs
run: poetry run make html
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 motleycrew

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
38 changes: 0 additions & 38 deletions docs/source/_autosummary/motleycrew.rst

This file was deleted.

Binary file added docs/source/images/crew_diagram.png
11 changes: 11 additions & 0 deletions docs/source/install_pandoc.py
@@ -0,0 +1,11 @@
import os
import shutil
from pypandoc.pandoc_download import download_pandoc

pandoc_location = os.path.abspath("../../.venv/_pandoc")

with open(os.environ["GITHUB_PATH"], "a") as path:
    path.write(str(pandoc_location) + "\n")

if not shutil.which("pandoc"):
    download_pandoc(targetfolder=pandoc_location)
124 changes: 124 additions & 0 deletions docs/source/key_concepts.rst
@@ -0,0 +1,124 @@
Key Concepts and API
====================

This is an overview of motleycrew's key concepts.

If you want to see them in action, see our `research agent example <examples/research_agent.html>`_.

For a basic introduction, you can check out the `quickstart <quickstart.html>`_.


Crew and knowledge graph
------------------------

The crew is a central concept in motleycrew. It is the orchestrator that knows what tasks should be done in which order,
and manages the execution of those tasks.

The crew has an underlying knowledge graph, in which it stores all information relevant to the execution of the tasks.
Besides storing the tasks themselves, the knowledge graph can act as a universal storage for any kind of context
that is relevant to the tasks. You can find more info on how to use the knowledge graph in the `tutorial <knowledge_graph.html>`_.

We currently use `Kùzu <https://kuzudb.com/>`_ as a knowledge graph backend because it's embeddable,
available under an MIT license, and is one of the LlamaIndex-supported KG backends -
please raise an issue on GitHub if you'd like us to support others.

The relationships between tasks are automatically stored in the KG backend; but the agents that are working
on the tasks can also read and write any other context they want to share.

.. code-block:: python

    from motleycrew import MotleyCrew

    crew = MotleyCrew()
    crew.graph_store
    # MotleyKuzuGraphStore(path=/path/to/kuzu_db)

If you want to persist the data or otherwise customize the graph store, you can pass a graph store instance to the crew.

.. code-block:: python

    import kuzu
    from motleycrew.storage import MotleyKuzuGraphStore

    database = kuzu.Database(database_path="kuzu_db")
    graph_store = MotleyKuzuGraphStore(database=database)
    crew = MotleyCrew(graph_store=graph_store)

Tasks, task units, and workers
------------------------------

In motleycrew, a **task** is a body of work that is carried out according to certain rules. The task provides the crew
with a description of what needs to be done in the form of **task units**, and who must do it - that's called a
**worker**. A worker can be an agent, a tool, or for that matter any Runnable (in the LangChain sense).

The worker receives a task unit as an input, processes it, and returns a result.

In a simple case, a task will have a single task unit, and becomes completed as soon as the unit is done.
For such cases, motleycrew provides a `SimpleTask` class, which basically contains an agent and a prompt.
Refer to the `blog with images <examples/blog_with_images.html>`_ example for a more elaborate illustration.

.. code-block:: python

    from motleycrew.tasks import SimpleTask

    crew = MotleyCrew()
    agent = ...
    task = SimpleTask(crew=crew, agent=agent, name="example task", description="Do something")
    crew.run()
    print(task.output)

This task is basically a prompt ("Do something") that is fed to the provided agent. The task will be completed as
soon as the agent finishes processing the only task unit.

For describing more complex tasks, you should subclass the `Task` class. It has two abstract
methods that you should implement: ``get_next_unit`` and ``get_worker``, as well as some optional methods
that you can override to customize the task's behavior.

#. ``get_next_unit`` should return the next task unit to be processed. If there are no units to do at the moment, it should return `None`.
#. ``get_worker`` should return the worker (typically an agent) that will process the task's units.
#. `optional` ``register_started_unit`` is called by the crew when a task unit is dispatched. By default, it just connects the unit to the task in the graph.
#. `optional` ``register_completed_unit`` is called by the crew when a task unit is completed. By default, it does nothing.
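
To make the contract above concrete, here is a minimal, self-contained sketch. It does not import motleycrew: ``ToyTask`` and ``SummarizeChunksTask`` are hypothetical stand-ins that only mirror the four method names listed above, and task units are modelled as plain dicts, which is an assumption of this sketch rather than the library's actual types.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional


class ToyTask(ABC):
    """Hypothetical stand-in for a task base class; not motleycrew's actual Task."""

    def __init__(self, name: str):
        self.name = name
        self.done = False

    @abstractmethod
    def get_next_unit(self) -> Optional[dict]:
        """Return the next unit of work, or None if nothing is ready right now."""

    @abstractmethod
    def get_worker(self) -> Callable[[dict], str]:
        """Return the callable that will process this task's units."""

    def register_started_unit(self, unit: dict) -> None:
        """Optional hook, called when a unit is dispatched. No-op in this toy."""

    def register_completed_unit(self, unit: dict) -> None:
        """Optional hook, called when a unit is finished. No-op by default."""


class SummarizeChunksTask(ToyTask):
    """Emits one unit per text chunk and marks itself done after the last one."""

    def __init__(self, name: str, chunks: list[str]):
        super().__init__(name)
        self.pending = list(chunks)
        self.results: list[str] = []

    def get_next_unit(self) -> Optional[dict]:
        if not self.pending:
            return None
        return {"text": self.pending.pop(0)}

    def get_worker(self) -> Callable[[dict], str]:
        # A trivial "worker": keep the first three words of the chunk.
        return lambda unit: " ".join(unit["text"].split()[:3])

    def register_completed_unit(self, unit: dict) -> None:
        self.results.append(unit["output"])
        if not self.pending:
            self.done = True


# Drive the task by hand, in the order a crew would call the methods.
task = SummarizeChunksTask("summarize", ["one two three four", "alpha beta"])
while (unit := task.get_next_unit()) is not None:
    task.register_started_unit(unit)
    unit["output"] = task.get_worker()(unit)
    task.register_completed_unit(unit)
```

In this toy, the task itself decides when it is done (here, once the last chunk has been processed), which matches the idea that completion is up to the task's implementation.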


Task hierarchy
--------------

Tasks can be set to depend on other tasks, forming a directed acyclic graph. This is done by either calling a
task's ``set_upstream`` method or by using the ``>>`` operator. The crew will then make sure that the upstream
tasks are completed before starting the dependent task, and pass the former's output to the latter.
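
As an illustration of how such an operator can work, the sketch below models only the dependency bookkeeping in plain Python. ``ToyTask``, its ``upstream`` list, and ``is_available`` are hypothetical names for this example, and returning the right-hand task from ``>>`` (to allow chaining) is an assumption of the sketch, not a statement about motleycrew's implementation.

```python
class ToyTask:
    """Hypothetical task node; only models dependency bookkeeping."""

    def __init__(self, name):
        self.name = name
        self.done = False
        self.upstream = []

    def set_upstream(self, other):
        """Declare that `other` must be completed before this task."""
        if other is self:
            raise ValueError("A task cannot depend on itself")
        self.upstream.append(other)
        return self

    def __rshift__(self, other):
        """`a >> b` makes `a` an upstream of `b` and returns `b` for chaining."""
        other.set_upstream(self)
        return other

    def is_available(self):
        """Not yet completed, and every upstream task is completed."""
        return not self.done and all(t.done for t in self.upstream)


first, second, third = ToyTask("first"), ToyTask("second"), ToyTask("third")
first >> second >> third

# Only the task with no upstream is available at the start.
available_before = [t.name for t in (first, second, third) if t.is_available()]

# Completing the first task unlocks the second, but not yet the third.
first.done = True
available_after = [t.name for t in (first, second, third) if t.is_available()]
```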

.. code-block:: python

    task1 = SimpleTask(crew=crew, agent=agent, name="first task", description="Do something")
    task2 = SimpleTask(crew=crew, agent=agent, name="second task", description="Do something else")
    task1 >> task2
    crew.run()

How the crew handles tasks
--------------------------

The crew queries the tasks for task units and dispatches them in a loop. The crew will keep running until either all
tasks are completed or available tasks stop providing task units.

A task is considered completed when its ``done`` attribute is set to ``True``. For example, in the case of `SimpleTask`,
this happens when its only task unit is completed and the crew calls the task's ``register_completed_unit`` method.
In case of a custom task, this behavior is up to the task's implementation.

Available tasks are defined as tasks that have not been completed and have no incomplete
upstream tasks. On each iteration, available tasks are queried for task units one by one,
and the crew will dispatch the task unit to the worker that the task provides.

When a task unit is dispatched, the crew adds it to the knowledge graph and calls the task's ``register_started_unit``
method. When the worker finishes processing the task unit, the crew calls the task's ``register_completed_unit`` method.
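
The loop described above can be sketched in a few lines of plain Python. This is a simplified, synchronous toy under stated assumptions: ``OneShotTask`` and ``run`` are hypothetical names, units are plain dicts, and the step where the crew adds the unit to the knowledge graph is left out.

```python
from typing import Callable, Optional


class OneShotTask:
    """Hypothetical single-unit task, loosely following the SimpleTask description."""

    def __init__(self, name: str, prompt: str, worker: Callable[[dict], str]):
        self.name = name
        self.prompt = prompt
        self.worker = worker
        self.upstream: list["OneShotTask"] = []
        self.started = False
        self.done = False
        self.output: Optional[str] = None

    def get_next_unit(self) -> Optional[dict]:
        if self.started:
            return None
        # Pass the outputs of upstream tasks along with the prompt.
        return {"prompt": self.prompt, "upstream": [t.output for t in self.upstream]}

    def get_worker(self) -> Callable[[dict], str]:
        return self.worker

    def register_started_unit(self, unit: dict) -> None:
        self.started = True

    def register_completed_unit(self, unit: dict) -> None:
        self.output = unit["output"]
        self.done = True


def run(tasks, log):
    """Dispatch units until all tasks are done or no available task yields a unit."""
    while not all(t.done for t in tasks):
        dispatched = False
        for task in tasks:
            # Available: not completed and no incomplete upstream tasks.
            if task.done or not all(u.done for u in task.upstream):
                continue
            unit = task.get_next_unit()
            if unit is None:
                continue
            log.append(f"start {task.name}")
            task.register_started_unit(unit)
            unit["output"] = task.get_worker()(unit)
            task.register_completed_unit(unit)
            log.append(f"done {task.name}")
            dispatched = True
        if not dispatched:
            break


def shout(unit: dict) -> str:
    """Toy worker: upper-case the prompt and append upstream outputs."""
    return " ".join([unit["prompt"].upper(), *unit["upstream"]])


log = []
a = OneShotTask("a", "hello", shout)
b = OneShotTask("b", "world", shout)
b.upstream.append(a)
run([b, a], log)  # b is listed first but has to wait for a
```

Because the loop only asks each task for its next unit and worker, the same structure could in principle be driven by a threaded or asyncio executor instead of the direct call used here.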

.. image:: images/crew_diagram.png
    :alt: Crew main loop
    :align: center

Now that you know the basics, we suggest you check out the `research agent example <examples/research_agent.html>`_
to see how it all works together.
65 changes: 0 additions & 65 deletions examples/Key Concepts and API.ipynb

This file was deleted.

20 changes: 20 additions & 0 deletions examples/Multi-step research agent.ipynb
@@ -20,6 +20,26 @@
"When we decide we've done this for long enough (currently just a constraint on the number of nodes), we then walk back up the graph, first answering the leaf questions, then using these answers (along with the context retrieved for their parent question) to answer the parent question, etc. "
]
},
{
"cell_type": "markdown",
"id": "98c1418b",
"metadata": {},
"source": [
"Technically speaking, the flow consists of two tasks, `QuestionTask` and `AnswerTask`. The `QuestionTask` starts with a user question, and embeds this into the graph as the first un-answered question. Its `get_next_unit()` method looks up all the as yet un-answered questions, and chooses the one that's most salient to the original question (so that question is the `TaskUnit` it returns). Its worker then retrieves the context (RAG-style) for that question, but instead of answering it, creates up to 3 further questions that would be most helpful to answer in order to answer the original question. We thus build up a tree of questions, where each non-leaf node has a retrieval context attached to it - all stored in the knowledge graph for easy retrieval. This goes on until we have enough questions (currently just a fixed number of iterations).\n",
"\n",
"The `AnswerTask` then rolls the tree back up. It ignores all the questions without a retrieved context; and the `TaskUnit` that its `get_next_unit()` returns is then any question that has no un-answered children. Its worker then proceeds to answer that question using its retrieved context and the answers from its children, if any. This goes on until we've worked our way back up to answering the original question."
]
},
{
"cell_type": "markdown",
"id": "6cbe252b",
"metadata": {},
"source": [
"This shows how the tasks can create `TaskUnit`s for themselves and for each other, which enables a whole new level of self-organization. \n",
"\n",
"The different `Task`s don't have to all form part of a connected DAG either. For example, two tasks could take turns creating `TaskUnit`s for one another - just one of many interaction patterns possible within the architecture."
]
},
{
"cell_type": "code",
"execution_count": 2,
26 changes: 7 additions & 19 deletions examples/Quickstart.ipynb
@@ -79,12 +79,11 @@
"source": [
"The functionality so far is convenient, allowing us to mix all the popular agents and tools, but otherwise fairly vanilla, little different from, for example, the CrewAI semantics. Fortunately, the above introduction just scratched the surface of the motleycrew `Task` API.\n",
"\n",
"Each crew is automatically given an embedded [knowledge graph backend](knowledge_graph.html). We currently use [kuzu](https://github.com/kuzudb/kuzu) because it's embeddable, available under an MIT license, and is one of the LlamaIndex-supported KG backends - please raise an issue on GitHub if you'd like us to support others.\n",
"The relationships between tasks are automatically stored in the KG backend; but the agents that are working on the tasks can also read and write any other context they want to share.\n",
"In motleycrew, a task is basically a set of rules describing how to perform actions. It provides a **worker** (e.g. an agent) and sets of input data called **task units**. This allows defining workflows of any complexity concisely using crew semantics. For a deeper dive, check out the page on [key concepts](key_concepts.html).\n",
"\n",
"A `Task` object must implement only two methods: `get_next_unit()` and `get_worker()`. The former returns a data object (`TaskUnit`) describing the next part of the task to be done (or `None` if there is nothing to be done for that particular `Task` at the moment), and the latter returns the worker (typically an agent) that this data object can be given to for execution. The crew keeps querying all available `Task`s for `TaskUnits` and dispatching them, until done.\n",
"The crew queries and dispatches available task units in a loop, managing task states using an embedded [knowledge graph](knowledge_graph.html).\n",
"\n",
"You can see how this dispatch method easily supports different execution backends, from synchronous to asyncio, threaded, etc.\n"
"This dispatch method easily supports different execution backends, from synchronous to asyncio, threaded, etc.\n"
]
},
{
@@ -94,29 +93,18 @@
"source": [
"### Example: Recursive question-answering in the research agent\n",
"\n",
"An example of the power of this approach is the [research agent](examples/research_agent.html). It consists of two tasks, `QuestionTask` and `AnswerTask`. The `QuestionTask` starts with a user question, and embeds this into the graph as the first un-answered question. Its `get_next_unit()` method looks up all the as yet un-answered questions, and chooses the one that's most salient to the original question (so that question is the `TaskUnit` it returns). Its worker then retrieves the context (RAG-style) for that question, but instead of answering it, creates up to 3 further questions that would be most helpful to answer in order to answer the original question. We thus build up a tree of questions, where each non-leaf node has a retrieval context attached to it - all stored in the knowledge graph for easy retrieval. This goes on until we have enough questions (currently just a fixed number of iterations).\n",
"\n",
"The `AnswerTask` then rolls the tree back up. It ignores all the questions without a retrieved context; and the `TaskUnit` that its `get_next_unit()` returns is then any question that has no un-answered children. Its worker then proceeds to answer that question using its retrieved context and the answers from its children, if any. This goes on until we've worked our way back up to answering the original question.\n",
"The motleycrew architecture described above makes it easy to generate task units on the fly, if needed. An example of the power of this approach is the [research agent](examples/research_agent.html), which dynamically generates new questions based on the context retrieved for previous questions. \n",
"This example also shows how workers can collaborate via the shared knowledge graph, storing all necessary data in a way that is natural to the task.\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "1762f89f-96c3-4e93-ba2f-4aa8accfb14a",
"id": "2cafa282-2111-4051-bf0f-7046048648bd",
"metadata": {},
"source": [
"This shows how the tasks can create `TaskUnit`s for themselves and for each other, which enables a whole new level of self-organization. \n",
"\n",
"The different `Task`s don't have to all form part of a connected DAG either. For example, two tasks could take turns creating `TaskUnit`s for one another - just one of many interaction patterns possible within the architecture."
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "220c0707-64f8-4415-b9b5-2b730672b5b7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
2 changes: 1 addition & 1 deletion motleycrew/agents/langchain/langchain.py
@@ -105,7 +105,7 @@ def from_function(
llm = init_llm(llm_framework=LLMFramework.LANGCHAIN)

if require_tools and not tools:
raise ValueError("You must provide at least one tool to the ReactMotleyAgent")
raise ValueError("You must provide at least one tool to the LangchainMotleyAgent")

def agent_factory(tools: dict[str, MotleyTool]):
langchain_tools = [t.to_langchain_tool() for t in tools.values()]
1 change: 1 addition & 0 deletions motleycrew/common/enums.py
@@ -6,6 +6,7 @@ class LLMFamily:
"""
OPENAI = "openai"
ANTHROPIC = "anthropic"


class LLMFramework: