diff --git a/docs/source/basic_api.nblink b/docs/source/basic_api.nblink deleted file mode 100644 index 68f65d22..00000000 --- a/docs/source/basic_api.nblink +++ /dev/null @@ -1,3 +0,0 @@ -{ - "path": "../../examples/Basic introduction.ipynb" -} \ No newline at end of file diff --git a/docs/source/examples.rst b/docs/source/examples.rst index c8a4eed3..a6d38e54 100644 --- a/docs/source/examples.rst +++ b/docs/source/examples.rst @@ -7,5 +7,6 @@ Examples examples/blog_with_images examples/research_agent - examples/output_handlers + examples/validating_agent_output + examples/advanced_output_handling autogen diff --git a/docs/source/examples/advanced_output_handling.nblink b/docs/source/examples/advanced_output_handling.nblink new file mode 100644 index 00000000..f1a54cbe --- /dev/null +++ b/docs/source/examples/advanced_output_handling.nblink @@ -0,0 +1,3 @@ +{ + "path": "../../../examples/Advanced output handling.ipynb" +} diff --git a/docs/source/examples/output_handlers.nblink b/docs/source/examples/output_handlers.nblink deleted file mode 100644 index a13fe3ff..00000000 --- a/docs/source/examples/output_handlers.nblink +++ /dev/null @@ -1,3 +0,0 @@ -{ - "path": "../../../examples/Output handlers for validating agent output.ipynb" -} diff --git a/docs/source/examples/validating_agent_output.nblink b/docs/source/examples/validating_agent_output.nblink new file mode 100644 index 00000000..22f04e42 --- /dev/null +++ b/docs/source/examples/validating_agent_output.nblink @@ -0,0 +1,3 @@ +{ + "path": "../../../examples/Validating agent output.ipynb" +} diff --git a/examples/Advanced output handling.ipynb b/examples/Advanced output handling.ipynb new file mode 100644 index 00000000..ca4bd544 --- /dev/null +++ b/examples/Advanced output handling.ipynb @@ -0,0 +1,566 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Advanced output handling\n", + "\n", + "\n", + "For grasping the basic idea of output handlers, check out this 
introduction: [validating agent output](validating_agent_output.html).\n", + "\n", + "The purpose of this example is to demonstrate two concepts:\n", + "\n", + "- Obtaining structured output from the agent\n", + "- Validating the output against the prompt that was given to the agent\n", + "\n", + "This should give a good understanding of output handler semantics and prepare you to build your own." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "from dotenv import load_dotenv\n", + "load_dotenv()\n", + "\n", + "from motleycrew import MotleyCrew\n", + "from motleycrew.agents.langchain.tool_calling_react import ReActToolCallingAgent\n", + "from motleycrew.tasks import SimpleTask\n", + "from motleycrew.common.exceptions import InvalidOutput\n", + "\n", + "from motleycrew.tools import PythonREPLTool\n", + "from motleycrew.tools import LLMTool\n", + "from motleycrew.agents import MotleyOutputHandler" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the [validating agent output](validating_agent_output.html) example, we used a simple tool to receive and check the output.\n", + "\n", + "Here we want to have access to the agent input, so we subclass a special `MotleyOutputHandler` class and implement its `handle_output` method. Inside it, we'll call an LLM critic that will assess the agent's output." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "class CoderOutputHandler(MotleyOutputHandler):\n", + " def handle_output(self, code: str, comment: str):\n", + " agent_prompt = self.agent_input[\"prompt\"]\n", + " critic = LLMTool(\n", + " name=\"critic\",\n", + " description=\"improve the agent output\",\n", + " prompt=(\n", + " \"An AI agent was given this prompt: \\n{agent_prompt}\\n\\n\"\n", + " \"Here's the code it generated: \\n```\\n{code}\\n```\\n\\n\"\n", + " \"And the comment it gave: \\n{comment}\\n\\n\"\n", + " \"The code and the comment must be sensible and easily understandable. \"\n", + " \"Give constructive advice on how to improve them unless they are already perfect. \"\n", + " \"In this case, just return a single \\\"OK\\\" without quotes.\"\n", + " )\n", + " )\n", + "\n", + " critic_message = critic.invoke({\"agent_prompt\": agent_prompt, \"code\": code, \"comment\": comment}).content\n", + " print(\"Critic response: \", critic_message)\n", + "\n", + " if critic_message.strip().lower().startswith(\"ok\"):\n", + " return code, comment\n", + " raise InvalidOutput(critic_message) # This will be returned to the agent to make it improve its output" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In case the critic is not satisfied, the `InvalidOutput` exception will be returned to the agent, as if a regular tool was called. If everything is OK, we just return the agent's output.\n", + "\n", + "The output schema is inferred automatically from the `handle_output` method signature. You can also specify it directly using the `_args_schema` property (see below)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024-06-20 22:44:33,253 - motleycrew - WARNING - No known Cypher type matching annotation typing.Optional[typing.Any], will use JSON string\n", + "2024-06-20 22:44:33,258 - motleycrew - WARNING - No known Cypher type matching annotation typing.List[str], will use JSON string\n" + ] + } + ], + "source": [ + "crew = MotleyCrew()\n", + "\n", + "coder = ReActToolCallingAgent(\n", + " name=\"coder\",\n", + " tools=[PythonREPLTool()],\n", + " output_handler=CoderOutputHandler(),\n", + " verbose=True,\n", + ")\n", + "\n", + "task = SimpleTask(\n", + " crew=crew,\n", + " name=\"Explain and implement the bubble sorting algorithm\",\n", + " description=\"Write a Python program that implements the bubble sorting algorithm \"\n", + " \"and test your implementation using the REPL tool. \\n\"\n", + " \"Return the code and a comment explaining how it works.\",\n", + " agent=coder,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024-06-20 22:44:33,289 - motleycrew - WARNING - No known Cypher type matching annotation typing.List[str], will use JSON string\n", + "WARNING:langchain_core.tracers.base:Parent run ac93a7c1-3864-4ffe-9a42-22823bd30134 not found for run a5178a15-aee9-4e14-a0f7-4d3dca6e310c. Treating as a root run.\n", + "WARNING:langchain_experimental.utilities.python:Python REPL can execute arbitrary code. 
Use with caution.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n", + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `python_repl` with `{'command': 'def bubble_sort(arr):\\n n = len(arr)\\n for i in range(n):\\n for j in range(0, n-i-1):\\n if arr[j] > arr[j+1]:\\n arr[j], arr[j+1] = arr[j+1], arr[j]\\n return arr\\n\\n# Test the bubble sort function\\nsample_array = [64, 34, 25, 12, 22, 11, 90]\\nsorted_array = bubble_sort(sample_array)\\nprint(sorted_array)'}`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m[11, 12, 22, 25, 34, 64, 90]\n", + "\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `output_handler` with `{'code': 'def bubble_sort(arr):\\n n = len(arr)\\n for i in range(n):\\n for j in range(0, n-i-1):\\n if arr[j] > arr[j+1]:\\n arr[j], arr[j+1] = arr[j+1], arr[j]\\n return arr\\n\\n# Test the bubble sort function\\nsample_array = [64, 34, 25, 12, 22, 11, 90]\\nsorted_array = bubble_sort(sample_array)\\nprint(sorted_array)', 'comment': 'The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array, and the inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. The test case demonstrates the function by sorting a sample array.'}`\n", + "\n", + "\n", + "\u001b[0mCritic response: The code and the comment are generally well-written and understandable. However, there are a few improvements that can be made for clarity and efficiency:\n", + "\n", + "1. **Optimization with a Flag**: The bubble sort algorithm can be optimized by adding a flag to detect if any swaps were made during an iteration. 
If no swaps were made, the array is already sorted, and the algorithm can terminate early.\n", + "\n", + "2. **Comment Enhancement**: The comment can be expanded to explain the optimization and the purpose of the flag.\n", + "\n", + "Here is the improved code and comment:\n", + "\n", + "```python\n", + "def bubble_sort(arr):\n", + " n = len(arr)\n", + " for i in range(n):\n", + " swapped = False\n", + " for j in range(0, n-i-1):\n", + " if arr[j] > arr[j+1]:\n", + " arr[j], arr[j+1] = arr[j+1], arr[j]\n", + " swapped = True\n", + " if not swapped:\n", + " break\n", + " return arr\n", + "\n", + "# Test the bubble sort function\n", + "sample_array = [64, 34, 25, 12, 22, 11, 90]\n", + "sorted_array = bubble_sort(sample_array)\n", + "print(sorted_array)\n", + "```\n", + "\n", + "Comment:\n", + "The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The test case demonstrates the function by sorting a sample array.\n", + "\u001b[33;1m\u001b[1;3mInvalidOutput: The code and the comment are generally well-written and understandable. However, there are a few improvements that can be made for clarity and efficiency:\n", + "\n", + "1. **Optimization with a Flag**: The bubble sort algorithm can be optimized by adding a flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early.\n", + "\n", + "2. 
**Comment Enhancement**: The comment can be expanded to explain the optimization and the purpose of the flag.\n", + "\n", + "Here is the improved code and comment:\n", + "\n", + "```python\n", + "def bubble_sort(arr):\n", + " n = len(arr)\n", + " for i in range(n):\n", + " swapped = False\n", + " for j in range(0, n-i-1):\n", + " if arr[j] > arr[j+1]:\n", + " arr[j], arr[j+1] = arr[j+1], arr[j]\n", + " swapped = True\n", + " if not swapped:\n", + " break\n", + " return arr\n", + "\n", + "# Test the bubble sort function\n", + "sample_array = [64, 34, 25, 12, 22, 11, 90]\n", + "sorted_array = bubble_sort(sample_array)\n", + "print(sorted_array)\n", + "```\n", + "\n", + "Comment:\n", + "The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The test case demonstrates the function by sorting a sample array.\u001b[0m\u001b[32;1m\u001b[1;3m\n", + "Invoking: `output_handler` with `{'code': 'def bubble_sort(arr):\\n n = len(arr)\\n for i in range(n):\\n swapped = False\\n for j in range(0, n-i-1):\\n if arr[j] > arr[j+1]:\\n arr[j], arr[j+1] = arr[j+1], arr[j]\\n swapped = True\\n if not swapped:\\n break\\n return arr\\n\\n# Test the bubble sort function\\nsample_array = [64, 34, 25, 12, 22, 11, 90]\\nsorted_array = bubble_sort(sample_array)\\nprint(sorted_array)', 'comment': 'The `bubble_sort` function sorts an array using the bubble sort algorithm. 
It works by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The test case demonstrates the function by sorting a sample array.'}`\n", + "\n", + "\n", + "\u001b[0mCritic response: The code and the comment are well-written and clear. However, there are a few minor improvements that can be made for better readability and efficiency:\n", + "\n", + "1. **Code Improvement:**\n", + " - Use more descriptive variable names.\n", + " - Add type hints for better clarity.\n", + " - Separate the test code from the function definition for better modularity.\n", + "\n", + "2. 
**Comment Improvement:**\n", + " - Add a brief explanation of the time complexity of the algorithm.\n", + " - Mention that the algorithm is not stable and not suitable for large datasets due to its O(n^2) time complexity.\n", + "\n", + "Here is the improved version:\n", + "\n", + "```python\n", + "from typing import List\n", + "\n", + "def bubble_sort(arr: List[int]) -> List[int]:\n", + " n = len(arr)\n", + " for i in range(n):\n", + " swapped = False\n", + " for j in range(0, n - i - 1):\n", + " if arr[j] > arr[j + 1]:\n", + " arr[j], arr[j + 1] = arr[j + 1], arr[j]\n", + " swapped = True\n", + " if not swapped:\n", + " break\n", + " return arr\n", + "\n", + "# Test the bubble sort function\n", + "if __name__ == \"__main__\":\n", + " sample_array = [64, 34, 25, 12, 22, 11, 90]\n", + " sorted_array = bubble_sort(sample_array)\n", + " print(sorted_array)\n", + "```\n", + "\n", + "**Comment:**\n", + "The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements, and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The time complexity of bubble sort is O(n^2) in the worst and average cases, making it inefficient for large datasets. The test case demonstrates the function by sorting a sample array.\n", + "\n", + "These improvements make the code more readable and the comment more informative.\n", + "\u001b[33;1m\u001b[1;3mInvalidOutput: The code and the comment are well-written and clear. 
However, there are a few minor improvements that can be made for better readability and efficiency:\n", + "\n", + "1. **Code Improvement:**\n", + " - Use more descriptive variable names.\n", + " - Add type hints for better clarity.\n", + " - Separate the test code from the function definition for better modularity.\n", + "\n", + "2. **Comment Improvement:**\n", + " - Add a brief explanation of the time complexity of the algorithm.\n", + " - Mention that the algorithm is not stable and not suitable for large datasets due to its O(n^2) time complexity.\n", + "\n", + "Here is the improved version:\n", + "\n", + "```python\n", + "from typing import List\n", + "\n", + "def bubble_sort(arr: List[int]) -> List[int]:\n", + " n = len(arr)\n", + " for i in range(n):\n", + " swapped = False\n", + " for j in range(0, n - i - 1):\n", + " if arr[j] > arr[j + 1]:\n", + " arr[j], arr[j + 1] = arr[j + 1], arr[j]\n", + " swapped = True\n", + " if not swapped:\n", + " break\n", + " return arr\n", + "\n", + "# Test the bubble sort function\n", + "if __name__ == \"__main__\":\n", + " sample_array = [64, 34, 25, 12, 22, 11, 90]\n", + " sorted_array = bubble_sort(sample_array)\n", + " print(sorted_array)\n", + "```\n", + "\n", + "**Comment:**\n", + "The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements, and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The time complexity of bubble sort is O(n^2) in the worst and average cases, making it inefficient for large datasets. 
The test case demonstrates the function by sorting a sample array.\n", + "\n", + "These improvements make the code more readable and the comment more informative.\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024-06-20 22:44:34,060 - motleycrew - WARNING - No known Cypher type matching annotation typing.Optional[typing.Any], will use JSON string\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `output_handler` with `{'code': 'from typing import List\\n\\ndef bubble_sort(arr: List[int]) -> List[int]:\\n n = len(arr)\\n for i in range(n):\\n swapped = False\\n for j in range(0, n - i - 1):\\n if arr[j] > arr[j + 1]:\\n arr[j], arr[j + 1] = arr[j + 1], arr[j]\\n swapped = True\\n if not swapped:\\n break\\n return arr\\n\\n# Test the bubble sort function\\nif __name__ == \"__main__\":\\n sample_array = [64, 34, 25, 12, 22, 11, 90]\\n sorted_array = bubble_sort(sample_array)\\n print(sorted_array)', 'comment': 'The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements, and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The time complexity of bubble sort is O(n^2) in the worst and average cases, making it inefficient for large datasets. 
The test case demonstrates the function by sorting a sample array.'}`\n", + "\n", + "\n", + "\u001b[0mCritic response: OK\n" + ] + } + ], + "source": [ + "crew.run()\n", + "code, comment = task.output" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "See, the initial code was optimized thanks to the critic!" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
from typing import List\n",
+       "\n",
+       "def bubble_sort(arr: List[int]) -> List[int]:\n",
+       "    n = len(arr)\n",
+       "    for i in range(n):\n",
+       "        swapped = False\n",
+       "        for j in range(0, n - i - 1):\n",
+       "            if arr[j] > arr[j + 1]:\n",
+       "                arr[j], arr[j + 1] = arr[j + 1], arr[j]\n",
+       "                swapped = True\n",
+       "        if not swapped:\n",
+       "            break\n",
+       "    return arr\n",
+       "\n",
+       "# Test the bubble sort function\n",
+       "if __name__ == "__main__":\n",
+       "    sample_array = [64, 34, 25, 12, 22, 11, 90]\n",
+       "    sorted_array = bubble_sort(sample_array)\n",
+       "    print(sorted_array)\n",
+       "
\n" + ], + "text/latex": [ + "\\begin{Verbatim}[commandchars=\\\\\\{\\}]\n", + "\\PY{k+kn}{from} \\PY{n+nn}{typing} \\PY{k+kn}{import} \\PY{n}{List}\n", + "\n", + "\\PY{k}{def} \\PY{n+nf}{bubble\\PYZus{}sort}\\PY{p}{(}\\PY{n}{arr}\\PY{p}{:} \\PY{n}{List}\\PY{p}{[}\\PY{n+nb}{int}\\PY{p}{]}\\PY{p}{)} \\PY{o}{\\PYZhy{}}\\PY{o}{\\PYZgt{}} \\PY{n}{List}\\PY{p}{[}\\PY{n+nb}{int}\\PY{p}{]}\\PY{p}{:}\n", + " \\PY{n}{n} \\PY{o}{=} \\PY{n+nb}{len}\\PY{p}{(}\\PY{n}{arr}\\PY{p}{)}\n", + " \\PY{k}{for} \\PY{n}{i} \\PY{o+ow}{in} \\PY{n+nb}{range}\\PY{p}{(}\\PY{n}{n}\\PY{p}{)}\\PY{p}{:}\n", + " \\PY{n}{swapped} \\PY{o}{=} \\PY{k+kc}{False}\n", + " \\PY{k}{for} \\PY{n}{j} \\PY{o+ow}{in} \\PY{n+nb}{range}\\PY{p}{(}\\PY{l+m+mi}{0}\\PY{p}{,} \\PY{n}{n} \\PY{o}{\\PYZhy{}} \\PY{n}{i} \\PY{o}{\\PYZhy{}} \\PY{l+m+mi}{1}\\PY{p}{)}\\PY{p}{:}\n", + " \\PY{k}{if} \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j}\\PY{p}{]} \\PY{o}{\\PYZgt{}} \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j} \\PY{o}{+} \\PY{l+m+mi}{1}\\PY{p}{]}\\PY{p}{:}\n", + " \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j}\\PY{p}{]}\\PY{p}{,} \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j} \\PY{o}{+} \\PY{l+m+mi}{1}\\PY{p}{]} \\PY{o}{=} \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j} \\PY{o}{+} \\PY{l+m+mi}{1}\\PY{p}{]}\\PY{p}{,} \\PY{n}{arr}\\PY{p}{[}\\PY{n}{j}\\PY{p}{]}\n", + " \\PY{n}{swapped} \\PY{o}{=} \\PY{k+kc}{True}\n", + " \\PY{k}{if} \\PY{o+ow}{not} \\PY{n}{swapped}\\PY{p}{:}\n", + " \\PY{k}{break}\n", + " \\PY{k}{return} \\PY{n}{arr}\n", + "\n", + "\\PY{c+c1}{\\PYZsh{} Test the bubble sort function}\n", + "\\PY{k}{if} \\PY{n+nv+vm}{\\PYZus{}\\PYZus{}name\\PYZus{}\\PYZus{}} \\PY{o}{==} \\PY{l+s+s2}{\\PYZdq{}}\\PY{l+s+s2}{\\PYZus{}\\PYZus{}main\\PYZus{}\\PYZus{}}\\PY{l+s+s2}{\\PYZdq{}}\\PY{p}{:}\n", + " \\PY{n}{sample\\PYZus{}array} \\PY{o}{=} \\PY{p}{[}\\PY{l+m+mi}{64}\\PY{p}{,} \\PY{l+m+mi}{34}\\PY{p}{,} \\PY{l+m+mi}{25}\\PY{p}{,} \\PY{l+m+mi}{12}\\PY{p}{,} \\PY{l+m+mi}{22}\\PY{p}{,} \\PY{l+m+mi}{11}\\PY{p}{,} \\PY{l+m+mi}{90}\\PY{p}{]}\n", + " \\PY{n}{sorted\\PYZus{}array} \\PY{o}{=} 
\\PY{n}{bubble\\PYZus{}sort}\\PY{p}{(}\\PY{n}{sample\\PYZus{}array}\\PY{p}{)}\n", + " \\PY{n+nb}{print}\\PY{p}{(}\\PY{n}{sorted\\PYZus{}array}\\PY{p}{)}\n", + "\\end{Verbatim}\n" + ], + "text/plain": [ + "from typing import List\n", + "\n", + "def bubble_sort(arr: List[int]) -> List[int]:\n", + " n = len(arr)\n", + " for i in range(n):\n", + " swapped = False\n", + " for j in range(0, n - i - 1):\n", + " if arr[j] > arr[j + 1]:\n", + " arr[j], arr[j + 1] = arr[j + 1], arr[j]\n", + " swapped = True\n", + " if not swapped:\n", + " break\n", + " return arr\n", + "\n", + "# Test the bubble sort function\n", + "if __name__ == \"__main__\":\n", + " sample_array = [64, 34, 25, 12, 22, 11, 90]\n", + " sorted_array = bubble_sort(sample_array)\n", + " print(sorted_array)" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from IPython.display import display, Markdown, Code\n", + "\n", + "display(Code(code, language=\"python\"))" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "The `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements, and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array. The inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added with a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The time complexity of bubble sort is O(n^2) in the worst and average cases, making it inefficient for large datasets. The test case demonstrates the function by sorting a sample array." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "display(Markdown(comment))" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "final_result = f\"{code}\\n\\n{comment}\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Customizing the output handler\n", + "\n", + "When subclassing `MotleyOutputHandler`, you can override the default name, description and args schema. You can also specify which exceptions will be caught and returned to the agent if raised inside the output handler." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.pydantic_v1 import BaseModel, Field\n", + "\n", + "class CustomOutputHandlerSchema(BaseModel):\n", + " code: str = Field(description=\"Python code demonstrating bubble sort\")\n", + " comment: str = Field(description=\"Comment explaining how the code works\")\n", + "\n", + "\n", + "class MyCustomOutputHandler(MotleyOutputHandler):\n", + " _name = \"Output handler\"\n", + " _description = \"Output handler. ONLY RETURN THE FINAL RESULT USING THIS TOOL!\"\n", + " _args_schema = CustomOutputHandlerSchema\n", + " _exceptions_to_handle = (InvalidOutput, ValueError) # ValueErrors will also be returned to the agent\n", + "\n", + " def handle_output(self, code: str, comment: str):\n", + " ..." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".venv", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.2" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/examples/Basic introduction.ipynb b/examples/Basic introduction.ipynb deleted file mode 100644 index c16da8f7..00000000 --- a/examples/Basic introduction.ipynb +++ /dev/null @@ -1,102 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "87b73640", - "metadata": {}, - "source": [ - "# Basic introduction" - ] - }, - { - "cell_type": "markdown", - "id": "5877df83-f5d8-447e-a200-ec249268210b", - "metadata": {}, - "source": [ - "This is an introduction to the simple version of our API, which allows you to mix and match tools and agents from different frameworks and pass agents as tools to other agents.\n", - "\n", - "After reading this, you might want to continue to the [introduction to the full API](?) 
to see how you can implement more flexible and powerful dispatch via dynamic knowledge graphs.\n", - "\n", - "Also worth reading is the [introduction to caching and tracing](?)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2596164c", - "metadata": {}, - "outputs": [], - "source": [ - "from pathlib import Path\n", - "import shutil\n", - "import os, sys\n", - "import platform\n", - "\n", - "import kuzu\n", - "from dotenv import load_dotenv\n", - "\n", - "# This assumes you have a .env file in the examples folder, containing your OpenAI key\n", - "load_dotenv()\n", - "\n", - "WORKING_DIR = Path(os.path.realpath(\".\"))\n", - "\n", - "try: \n", - " from motleycrew import MotleyCrew\n", - "except ImportError:\n", - " # if we are running this from source\n", - " motleycrew_location = os.path.realpath(WORKING_DIR / \"..\")\n", - " sys.path.append(motleycrew_location)\n", - "\n", - "if \"Dropbox\" in WORKING_DIR.parts and platform.system() == \"Windows\":\n", - " # On Windows, kuzu has file locking issues with Dropbox\n", - " DB_PATH = os.path.realpath(os.path.expanduser(\"~\") + \"/Documents/research_db\")\n", - "else:\n", - " DB_PATH = os.path.realpath(WORKING_DIR / \"research_db\")\n", - "\n", - "shutil.rmtree(DB_PATH)\n", - "\n", - "from motleycrew import MotleyCrew\n", - "from motleycrew.storage import MotleyKuzuGraphStore\n", - "from motleycrew.common.utils import configure_logging\n", - "from motleycrew.applications.research_agent.question_task_recipe import QuestionTaskRecipe\n", - "from motleycrew.applications.research_agent.answer_task_recipe import AnswerTaskRecipe\n", - "from motleycrew.tool.simple_retriever_tool import SimpleRetrieverTool\n", - "\n", - "configure_logging(verbose=True)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "03f5fb3d-5c75-441b-9f60-0c4a17974c79", - "metadata": {}, - "outputs": [], - "source": [ - "db = kuzu.Database(DB_PATH)\n", - "graph_store = MotleyKuzuGraphStore(db)\n", - "crew = 
MotleyCrew(graph_store=graph_store)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.5" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/examples/Multi-step research agent.ipynb b/examples/Multi-step research agent.ipynb index 4e982400..e572dc6d 100644 --- a/examples/Multi-step research agent.ipynb +++ b/examples/Multi-step research agent.ipynb @@ -622,7 +622,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.10" + "version": "3.12.2" } }, "nbformat": 4, diff --git a/examples/Quickstart.ipynb b/examples/Quickstart.ipynb index 57da008f..d97db356 100644 --- a/examples/Quickstart.ipynb +++ b/examples/Quickstart.ipynb @@ -32,7 +32,14 @@ "\n", "Motleycrew also supports **delegation**: you can simply give any agent as a tool to any other agent. \n", "\n", - "All the wrappers for tools and agents implement the Runnable interface, so you can use them as-is in LCEL and Langgraph code." + "All the wrappers for tools and agents implement the Runnable interface, so you can use them as-is in LCEL and Langgraph code.\n", + "\n", + "\n", + "### Output handlers\n", + "\n", + "An **output handler** is a special tool that the agent uses for submitting the final output instead of returning it raw. Besides defining a schema for the output, the output handler enables you to implement any validation logic inside it, including agent-based. 
If your agent returns invalid output, you can raise an exception that will be returned to the agent so that it can retry.\n", + "\n", + "See our usage examples with a [simple validator](examples/validating_agent_output.html) and an [advanced output handler with multiple fields](examples/advanced_output_handling.html).\n" ] }, { diff --git a/examples/Output handlers for validating agent output.ipynb b/examples/Validating agent output.ipynb similarity index 99% rename from examples/Output handlers for validating agent output.ipynb rename to examples/Validating agent output.ipynb index aa41342e..bcf23f68 100644 --- a/examples/Output handlers for validating agent output.ipynb +++ b/examples/Validating agent output.ipynb @@ -237,7 +237,9 @@ ], "source": [ "from IPython.display import display, Markdown\n", - "display(Markdown(task.output[\"checked_output\"]))" + "\n", + "final_result = task.output[\"checked_output\"]\n", + "display(Markdown(final_result))" ] } ], diff --git a/motleycrew/agents/output_handler.py b/motleycrew/agents/output_handler.py index 702459c3..3a1d5c03 100644 --- a/motleycrew/agents/output_handler.py +++ b/motleycrew/agents/output_handler.py @@ -1,6 +1,6 @@ from typing import Optional from abc import ABC, abstractmethod -from langchain_core.tools import StructuredTool, BaseTool +from langchain_core.tools import StructuredTool from langchain_core.pydantic_v1 import BaseModel from motleycrew.agents.abstract_parent import MotleyAgentAbstractParent diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/07720a3f1c589096bd9c165049cdd12711def9925eefe6d7163d44e2c052b2bf.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/07720a3f1c589096bd9c165049cdd12711def9925eefe6d7163d44e2c052b2bf.pkl new file mode 100644 index 00000000..aac74743 Binary files /dev/null and 
b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/07720a3f1c589096bd9c165049cdd12711def9925eefe6d7163d44e2c052b2bf.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/0e875e7397179704e9c0f59a201287a3d0c56d7b1a692ad20f775c55b97a94f6.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/0e875e7397179704e9c0f59a201287a3d0c56d7b1a692ad20f775c55b97a94f6.pkl new file mode 100644 index 00000000..84b065b9 Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/0e875e7397179704e9c0f59a201287a3d0c56d7b1a692ad20f775c55b97a94f6.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/142f514247583600c43a67e4328369554bc34f77f60fed42015d2bf2aa25bfa8.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/142f514247583600c43a67e4328369554bc34f77f60fed42015d2bf2aa25bfa8.pkl new file mode 100644 index 00000000..d033d90a Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/142f514247583600c43a67e4328369554bc34f77f60fed42015d2bf2aa25bfa8.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/194738103260685fcfe14b080193c65baceb5fd00bb449694913f6f0762b0291.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/194738103260685fcfe14b080193c65baceb5fd00bb449694913f6f0762b0291.pkl new file mode 100644 index 00000000..4d22d39b Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/194738103260685fcfe14b080193c65baceb5fd00bb449694913f6f0762b0291.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/38c587c596ba4a7320df077e92bb9a303734251dd6391f31a0b66c2dbeb0ce44.pkl 
b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/38c587c596ba4a7320df077e92bb9a303734251dd6391f31a0b66c2dbeb0ce44.pkl new file mode 100644 index 00000000..8dc4e201 Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/38c587c596ba4a7320df077e92bb9a303734251dd6391f31a0b66c2dbeb0ce44.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/49aefe482900e5529f22f21ffea3a217532d894d30ad60d5b902317f841507c7.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/49aefe482900e5529f22f21ffea3a217532d894d30ad60d5b902317f841507c7.pkl new file mode 100644 index 00000000..4dd4367d Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/49aefe482900e5529f22f21ffea3a217532d894d30ad60d5b902317f841507c7.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/a0cc27bc02e7df8d60981c2aacf3d49dbf534a6f999d02431db8e8de8145e52b.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/a0cc27bc02e7df8d60981c2aacf3d49dbf534a6f999d02431db8e8de8145e52b.pkl new file mode 100644 index 00000000..da42c0c3 Binary files /dev/null and b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/a0cc27bc02e7df8d60981c2aacf3d49dbf534a6f999d02431db8e8de8145e52b.pkl differ diff --git a/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/b3c179a00e76ca4c0fcf3b0d09bd2be9dc164849323b77cb39acc6f39e9ebae2.pkl b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/b3c179a00e76ca4c0fcf3b0d09bd2be9dc164849323b77cb39acc6f39e9ebae2.pkl new file mode 100644 index 00000000..9d533136 Binary files /dev/null and 
b/tests/itest_cache/advanced_output_handling_ipynb/api.openai.com/v1_chat_completions/b3c179a00e76ca4c0fcf3b0d09bd2be9dc164849323b77cb39acc6f39e9ebae2.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/0b184b8eb9485c32b27669282f0c49c9168710c777bde70af36f9719b78f6494.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/0b184b8eb9485c32b27669282f0c49c9168710c777bde70af36f9719b78f6494.pkl new file mode 100644 index 00000000..c8e23c25 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/0b184b8eb9485c32b27669282f0c49c9168710c777bde70af36f9719b78f6494.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/160405dad77d589a5c94a19b71e3ae1639b94a7574e1854346bf5040589365be.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/160405dad77d589a5c94a19b71e3ae1639b94a7574e1854346bf5040589365be.pkl new file mode 100644 index 00000000..f4a30601 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/160405dad77d589a5c94a19b71e3ae1639b94a7574e1854346bf5040589365be.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/490bebac9bd6714c35a1823de29cf7152ce16bd086b7b07bad52d87998206da6.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/490bebac9bd6714c35a1823de29cf7152ce16bd086b7b07bad52d87998206da6.pkl new file mode 100644 index 00000000..7e4537be Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/490bebac9bd6714c35a1823de29cf7152ce16bd086b7b07bad52d87998206da6.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/77954563ee361cf29831df15f0febc9796d48310e6898400e92bf3bc3fed102a.pkl 
b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/77954563ee361cf29831df15f0febc9796d48310e6898400e92bf3bc3fed102a.pkl new file mode 100644 index 00000000..2daf3fdb Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/77954563ee361cf29831df15f0febc9796d48310e6898400e92bf3bc3fed102a.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/7b152b7e9247ae01ac33a10c5eec4dbaaa1d33e9900cc3edd11bb677cb4dceb5.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/7b152b7e9247ae01ac33a10c5eec4dbaaa1d33e9900cc3edd11bb677cb4dceb5.pkl new file mode 100644 index 00000000..2208e6c2 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/7b152b7e9247ae01ac33a10c5eec4dbaaa1d33e9900cc3edd11bb677cb4dceb5.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/928d25f2f66e3149ca9acc9271017c664d1fe036ae6d33c1c82bad569c3ebdcf.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/928d25f2f66e3149ca9acc9271017c664d1fe036ae6d33c1c82bad569c3ebdcf.pkl new file mode 100644 index 00000000..c2ec64bb Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/928d25f2f66e3149ca9acc9271017c664d1fe036ae6d33c1c82bad569c3ebdcf.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/96e1448836b6a65774b502277d21fe4832ba1ee9bf7e755689720e15927812a9.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/96e1448836b6a65774b502277d21fe4832ba1ee9bf7e755689720e15927812a9.pkl new file mode 100644 index 00000000..e2b9c718 Binary files /dev/null and 
b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/96e1448836b6a65774b502277d21fe4832ba1ee9bf7e755689720e15927812a9.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/d78a0f44d55f371cf63687217262328b9cd4a68c8e8f8ae3bb296a66f9424475.pkl b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/d78a0f44d55f371cf63687217262328b9cd4a68c8e8f8ae3bb296a66f9424475.pkl new file mode 100644 index 00000000..3a94c1aa Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/api.openai.com/v1_chat_completions/d78a0f44d55f371cf63687217262328b9cd4a68c8e8f8ae3bb296a66f9424475.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/11083a1f92e1a22d74549199b7f1fbc656d7dea57dccb10e67d4857cb7dea29f.pkl b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/11083a1f92e1a22d74549199b7f1fbc656d7dea57dccb10e67d4857cb7dea29f.pkl new file mode 100644 index 00000000..37920e25 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/11083a1f92e1a22d74549199b7f1fbc656d7dea57dccb10e67d4857cb7dea29f.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/1d103b47344b8f5d969c41fb7e46c692907ccd6841b8f1e59f2a17178410a496.pkl b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/1d103b47344b8f5d969c41fb7e46c692907ccd6841b8f1e59f2a17178410a496.pkl new file mode 100644 index 00000000..6fc687ed Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/1d103b47344b8f5d969c41fb7e46c692907ccd6841b8f1e59f2a17178410a496.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/6ff222918b0d91ac3a3b5523aa8d295dbabff68bd1fa22e3bef8d8ce87d2e60f.pkl b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/6ff222918b0d91ac3a3b5523aa8d295dbabff68bd1fa22e3bef8d8ce87d2e60f.pkl new file mode 100644 
index 00000000..3d5b5534 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/6ff222918b0d91ac3a3b5523aa8d295dbabff68bd1fa22e3bef8d8ce87d2e60f.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/bf44f0300d6d9bb90b2f7c185a81db67ace63db61eb0da2b76e54417d1f8cc26.pkl b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/bf44f0300d6d9bb90b2f7c185a81db67ace63db61eb0da2b76e54417d1f8cc26.pkl new file mode 100644 index 00000000..5f368bf1 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/bf44f0300d6d9bb90b2f7c185a81db67ace63db61eb0da2b76e54417d1f8cc26.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/c0922709ed6277dfa4b469c1de290d1f1c8a41e45ca98b9495e8dbfa15344f9a.pkl b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/c0922709ed6277dfa4b469c1de290d1f1c8a41e45ca98b9495e8dbfa15344f9a.pkl new file mode 100644 index 00000000..b85ff971 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/duckduckgo.com/c0922709ed6277dfa4b469c1de290d1f1c8a41e45ca98b9495e8dbfa15344f9a.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/191cd93437a6a2800383c4c249c4cd30d273ba0fde667ca3ebd129dd2e8091b3.pkl b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/191cd93437a6a2800383c4c249c4cd30d273ba0fde667ca3ebd129dd2e8091b3.pkl new file mode 100644 index 00000000..af71172e Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/191cd93437a6a2800383c4c249c4cd30d273ba0fde667ca3ebd129dd2e8091b3.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/1e216e4a265eb5bbe7b6d197c88d57e7cdf38a93be78081e62bd4c55d6b1e54f.pkl 
b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/1e216e4a265eb5bbe7b6d197c88d57e7cdf38a93be78081e62bd4c55d6b1e54f.pkl new file mode 100644 index 00000000..2cf59898 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/1e216e4a265eb5bbe7b6d197c88d57e7cdf38a93be78081e62bd4c55d6b1e54f.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/60b44ba354bee0d3610964f2c15bfc627af0d4f4b2da1026321b3e89991b7cf7.pkl b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/60b44ba354bee0d3610964f2c15bfc627af0d4f4b2da1026321b3e89991b7cf7.pkl new file mode 100644 index 00000000..d7282bb0 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/60b44ba354bee0d3610964f2c15bfc627af0d4f4b2da1026321b3e89991b7cf7.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/c9ee5858fe29e182787d456dde61fb54c70693c6207774e9b17885e15b2e00bd.pkl b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/c9ee5858fe29e182787d456dde61fb54c70693c6207774e9b17885e15b2e00bd.pkl new file mode 100644 index 00000000..f9cdf3ac Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/c9ee5858fe29e182787d456dde61fb54c70693c6207774e9b17885e15b2e00bd.pkl differ diff --git a/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/f40c543dc685bd2c7b7a1cf2a7810d930418d2ce1cd86fa543c295cc90f1de19.pkl b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/f40c543dc685bd2c7b7a1cf2a7810d930418d2ce1cd86fa543c295cc90f1de19.pkl new file mode 100644 index 00000000..23c89ad9 Binary files /dev/null and b/tests/itest_cache/validating_agent_output_ipynb/links.duckduckgo.com/d.js/f40c543dc685bd2c7b7a1cf2a7810d930418d2ce1cd86fa543c295cc90f1de19.pkl differ diff --git 
a/tests/itest_golden_data/advanced_output_handling_ipynb.json b/tests/itest_golden_data/advanced_output_handling_ipynb.json
new file mode 100644
index 00000000..5a9b75f6
--- /dev/null
+++ b/tests/itest_golden_data/advanced_output_handling_ipynb.json
@@ -0,0 +1 @@
+"def bubble_sort(arr):\n n = len(arr)\n for i in range(n):\n swapped = False\n for j in range(0, n-i-1):\n if arr[j] > arr[j+1]:\n arr[j], arr[j+1] = arr[j+1], arr[j]\n swapped = True\n if not swapped:\n break\n return arr\n\n# Test the bubble sort function\nsample_array = [64, 34, 25, 12, 22, 11, 90]\nsorted_array = bubble_sort(sample_array)\nprint(sorted_array)\n\nThe `bubble_sort` function sorts an array using the bubble sort algorithm. It works by repeatedly stepping through the list, comparing adjacent elements and swapping them if they are in the wrong order. This process is repeated until the list is sorted. The outer loop runs `n` times, where `n` is the length of the array, and the inner loop runs `n-i-1` times to avoid re-checking the already sorted elements. An optimization is added by using a `swapped` flag to detect if any swaps were made during an iteration. If no swaps were made, the array is already sorted, and the algorithm can terminate early. The test case demonstrates the function by sorting a sample array."
\ No newline at end of file
diff --git a/tests/itest_golden_data/validating_agent_output_ipynb.json b/tests/itest_golden_data/validating_agent_output_ipynb.json
new file mode 100644
index 00000000..19031866
--- /dev/null
+++ b/tests/itest_golden_data/validating_agent_output_ipynb.json
@@ -0,0 +1 @@
+"## Comprehensive Analysis of AI Advancements in 2024\n\n### Key Trends in AI\n\n1. **Generative AI Adoption**:\n - Generative AI tools like ChatGPT have reached mass adoption, significantly impacting various industries. In 2024, organizations are not only adopting generative AI but also deriving substantial business value from it.
This trend is expected to continue, with more people and businesses utilizing generative AI models for diverse applications.\n\n2. **Multimodal AI**:\n - Multimodal AI, which processes multiple input types such as text, images, and sound, is gaining traction. This approach mimics the human ability to process diverse sensory information and is expected to lead to more sophisticated AI applications.\n\n3. **Smaller and Open Source Models**:\n - There is a shift towards smaller language models and open-source advancements. These models are becoming more accessible, allowing for customized local models and data pipelines. This trend is democratizing AI, making it more available to smaller organizations and individual developers.\n\n4. **Model Optimization**:\n - Model optimization is becoming more accessible, addressing issues like GPU shortages and cloud costs. This trend is crucial for making AI more efficient and cost-effective.\n\n5. **Transfer Learning**:\n - Transfer learning, which uses preexisting models as a starting point for new tasks, is reducing the time and data needed to develop accurate models. This trend is expected to create new job opportunities focused on developing and maintaining these models.\n\n### Breakthrough Technologies\n\n1. **Advanced Generative AI Models**:\n - State-of-the-art generative AI models are becoming more useful for non-tech individuals. This democratization of AI technology is expected to lead to widespread experimentation and innovation.\n\n2. **Claude by Anthropic**:\n - Anthropic, founded by former OpenAI executives, has made significant strides with its large language model, Claude. This model is expected to compete with other leading AI technologies and drive further advancements in the field.\n\n3. **AI in Business Operations**:\n - AI is driving significant reductions in operational expenses, with businesses expecting a 37% reduction in costs in 2024. 
This is due to AI's ability to lower error rates and improve efficiency.\n\n### Potential Industry Impacts\n\n1. **Economic Transformation**:\n - 2024 is expected to be a pivotal year for AI, with generative AI triggering a metamorphosis across the global economic landscape. Businesses are waking up to the broad potential of AI, leading to transformative changes in various sectors.\n\n2. **Job Market Evolution**:\n - The rise of AI technologies is creating new job opportunities, particularly in developing, maintaining, and optimizing AI models. However, it may also lead to job displacement in certain sectors, necessitating a focus on reskilling and upskilling the workforce.\n\n3. **Environmental and Political Impacts**:\n - The adoption of AI is expected to have significant environmental and political impacts. For instance, AI can contribute to more efficient resource management and decision-making processes, but it also raises concerns about data privacy and ethical considerations.\n\n4. **Industry-Specific Applications**:\n - Different industries are adopting AI at varying rates. For example, sectors like healthcare, finance, and manufacturing are seeing substantial AI integration, leading to improved outcomes and efficiencies.\n\n### AI Applications in Medicine\n\n1. **Multimodal AI in Healthcare**:\n - AI is being used to assimilate information from diverse sources such as medical images, textual clinical reports, and voice data. This multimodal approach is driving new understandings about health and healthcare, leading to more accurate diagnoses and personalized treatment plans.\n\n2. **Improving Health Outcomes**:\n - AI technologies are transforming care delivery by improving health outcomes, patient safety, and the affordability and accessibility of high-quality care. AI is being used to predict patient outcomes, optimize treatment plans, and monitor patient health in real-time.\n\n3. 
**Complementary Tool in Healthcare**:\n - AI is increasingly seen as a complementary tool rather than a replacement in healthcare. It assists healthcare professionals in making more informed decisions, thereby enhancing the quality of care provided to patients.\n\n4. **Responsible Data Usage**:\n - The responsible usage of patient data is crucial in AI applications in medicine. Ensuring data privacy and security while leveraging AI for healthcare advancements is a key focus area.\n\n### Conclusion\n\nThe advancements in AI in 2024 are marked by significant trends, breakthrough technologies, and transformative industry impacts. Generative AI, multimodal AI, and model optimization are key trends driving the field forward. Breakthrough technologies like advanced generative models and Claude by Anthropic are setting new benchmarks. The industry impacts are profound, with economic transformation, job market evolution, and sector-specific applications leading the way. In healthcare, AI is revolutionizing care delivery, improving health outcomes, and enhancing patient safety. As AI continues to evolve, it is crucial for stakeholders to stay informed and adapt to these changes to harness the full potential of this revolutionary technology."
\ No newline at end of file
diff --git a/tests/run_integration_tests.py b/tests/run_integration_tests.py
index 10aa17f7..853ed073 100644
--- a/tests/run_integration_tests.py
+++ b/tests/run_integration_tests.py
@@ -40,6 +40,8 @@
 "blog_with_images_ipynb": "examples/Blog with Images.ipynb",
 "multi_step_research_agent_ipynb": "examples/Multi-step research agent.ipynb",
 "math_via_python_code_with_a_single_agent_ipynb": "examples/Math via python code with a single agent.ipynb",
+ "validating_agent_output_ipynb": "examples/Validating agent output.ipynb",
+ "advanced_output_handling_ipynb": "examples/Advanced output handling.ipynb",
 # "using_autogen_with_motleycrew_ipynb": "examples/Using AutoGen with motleycrew.ipynb"
 }