[WIP] TextGrad Vision #41

Merged · 16 commits · Jul 7, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -160,3 +160,4 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
logs/
18 changes: 10 additions & 8 deletions README.md
@@ -103,17 +103,19 @@ We have many more examples around how TextGrad can optimize all kinds of variabl

### Tutorials

We have prepared a couple of tutorials to get you started with TextGrad.
You can run them directly in Google Colab by clicking on the links below.
We have prepared a couple of tutorials to get you started with TextGrad. We recommend beginners follow them in the order listed below. You can run them directly in Google Colab by clicking the links below (you will need an OpenAI/Anthropic API key to run the LLMs).

<div align="center">

| Example | Colab Link |
|-------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Introduction to TextGrad Primitives | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Primitives.ipynb) |
| Optimizing a Code Snippet and Define a New Loss | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/textgrad/blob/main/examples/notebooks/Tutorial-Test-Time-Loss-for-Code.ipynb) |
| Prompt Optimization | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Prompt-Optimization.ipynb) |
| Solution Optimization | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Tutorial-Solution-Optimization.ipynb) |
| Tutorial | Difficulty | Colab Link |
|----------------------------------------------------|-----------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1. Introduction to TextGrad Primitives | ![](https://img.shields.io/badge/Level-Beginner-green.svg) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Tutorial-Primitives.ipynb) |
| 2. Solution Optimization | ![](https://img.shields.io/badge/Level-Beginner-green.svg) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Tutorial-Solution-Optimization.ipynb) |
| 3. Optimizing a Code Snippet and Define a New Loss | ![](https://img.shields.io/badge/Level-Beginner-green.svg) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/textgrad/blob/main/examples/notebooks/Tutorial-Test-Time-Loss-for-Code.ipynb) |
| 4. Prompt Optimization | ![](https://img.shields.io/badge/Level-Intermediate-yellow.svg) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Tutorial-Prompt-Optimization.ipynb) |
| 5. MultiModal Optimization | ![](https://img.shields.io/badge/Level-Beginner-green.svg) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Tutorial-MultiModal.ipynb) |

</div>

6 changes: 3 additions & 3 deletions examples/notebooks/Local-Model-With-LMStudio.ipynb
@@ -182,7 +182,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "textgrad",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -196,9 +196,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.9"
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}
265 changes: 265 additions & 0 deletions examples/notebooks/Tutorial-MultiModal.ipynb

Large diffs are not rendered by default.

@@ -64,7 +64,10 @@
"cell_type": "markdown",
"id": "8887fbed36c7daf2",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Introduction: Variable\n",
@@ -89,7 +92,10 @@
"end_time": "2024-06-11T15:43:17.669096228Z",
"start_time": "2024-06-11T15:43:17.665325560Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -105,7 +111,10 @@
"end_time": "2024-06-11T15:43:18.184004948Z",
"start_time": "2024-06-11T15:43:18.178187640Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -127,7 +136,10 @@
"cell_type": "markdown",
"id": "63f6a6921a1cce6a",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Introduction: Engine\n",
@@ -144,7 +156,10 @@
"end_time": "2024-06-11T15:44:32.606319032Z",
"start_time": "2024-06-11T15:44:32.561460448Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -155,7 +170,10 @@
"cell_type": "markdown",
"id": "33c7d6eaa115cd6a",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"This object behaves like you would expect an LLM to behave: You can sample generation from the engine using the `generate` method. "
@@ -170,7 +188,10 @@
"end_time": "2024-06-11T17:29:41.108552705Z",
"start_time": "2024-06-11T17:29:40.294256814Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -192,7 +213,10 @@
"cell_type": "markdown",
"id": "b627edc07c0d3737",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Introduction: Loss\n",
@@ -209,7 +233,10 @@
"end_time": "2024-06-11T15:44:32.894722136Z",
"start_time": "2024-06-11T15:44:32.890708561Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -221,15 +248,21 @@
"cell_type": "markdown",
"id": "ff137c99e0659dcc",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": []
},
{
"cell_type": "markdown",
"id": "6f05ec2bf907b3ba",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Introduction: Optimizer\n",
@@ -248,7 +281,10 @@
"end_time": "2024-06-11T15:44:33.741130951Z",
"start_time": "2024-06-11T15:44:33.734977769Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -259,7 +295,10 @@
"cell_type": "markdown",
"id": "d26883eb74ce0d01",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Putting it all together\n",
@@ -276,7 +315,10 @@
"end_time": "2024-06-11T15:44:41.730132530Z",
"start_time": "2024-06-11T15:44:34.997777872Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -294,7 +336,10 @@
"end_time": "2024-06-11T15:44:41.738985151Z",
"start_time": "2024-06-11T15:44:41.731989729Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [
{
@@ -316,7 +361,10 @@
"cell_type": "markdown",
"id": "6a8aab93b80fb82c",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"While here it is not going to be useful, we can also do multiple optimization steps in a loop! Do not forget to reset the gradients after each step!"
@@ -330,7 +378,10 @@
"ExecuteTime": {
"start_time": "2024-06-11T15:44:30.989940227Z"
},
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": [
@@ -342,7 +393,10 @@
"execution_count": null,
"id": "a3a84aad4cd58737",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"outputs": [],
"source": []
4 changes: 3 additions & 1 deletion requirements.txt
@@ -6,4 +6,6 @@ platformdirs>=3.11.0
datasets>=2.14.6
diskcache>=5.6.3
graphviz>=0.20.3
gdown>=5.2.0
gdown>=5.2.0
pillow
httpx
4 changes: 2 additions & 2 deletions setup.py
@@ -8,9 +8,9 @@

setup(
name="textgrad",
version="0.1.3",
version="0.1.4",
description="",
python_requires=">=3.8",
python_requires=">=3.9",
classifiers=[
"Development Status :: 2 - Pre-Alpha",
"Intended Audience :: Developers",
71 changes: 71 additions & 0 deletions tests/test_basics.py
@@ -1,5 +1,6 @@
import os
import pytest
from typing import Union, List
import logging


@@ -18,6 +19,26 @@ def generate(self, prompt, system_prompt=None, **kwargs):
def __call__(self, prompt, system_prompt=None):
return self.generate(prompt)

class DummyMultimodalEngine(EngineLM):

    def __init__(self, is_multimodal=False):
        self.is_multimodal = is_multimodal
        self.model_string = "gpt-4o"  # fake

    def generate(self, content: Union[str, List[Union[str, bytes]]], system_prompt: str = None, **kwargs):
        if isinstance(content, str):
            return "Hello Text"

        elif isinstance(content, list):
            has_multimodal_input = any(isinstance(item, bytes) for item in content)
            if has_multimodal_input and not self.is_multimodal:
                raise NotImplementedError("Multimodal generation is only supported for Claude-3 and beyond.")

            return "Hello Text from Image"

    def __call__(self, prompt, system_prompt=None):
        return self.generate(prompt)

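The content-type dispatch in `DummyMultimodalEngine` above is the core idea being tested: a plain string is text, a `bytes` item inside a list is an image, and a text-only engine must refuse image input. A minimal standalone sketch of that logic, with no textgrad dependency (`route_content` is a hypothetical name used only for illustration):

```python
from typing import List, Union

def route_content(content: Union[str, List[Union[str, bytes]]],
                  is_multimodal: bool) -> str:
    """Mirror the dummy engine's dispatch: str -> text, bytes in a list -> image."""
    if isinstance(content, str):
        return "Hello Text"
    has_image = any(isinstance(item, bytes) for item in content)
    if has_image and not is_multimodal:
        # A text-only engine must refuse image inputs rather than silently drop them.
        raise NotImplementedError("This engine does not support multimodal input.")
    return "Hello Text from Image"
```

The tests below exercise exactly these branches through `MultimodalLLMCall` and `LLMCall`.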
# Idempotent engine that returns the prompt as is
class IdempotentEngine(EngineLM):
def generate(self, prompt, system_prompt=None, **kwargs):
@@ -124,3 +145,53 @@ def test_formattedllmcall():
assert inputs["question"] in output.predecessors
assert inputs["prediction"] in output.predecessors
assert output.get_role_description() == "test response"


def test_multimodal():
    from textgrad.autograd import MultimodalLLMCall, LLMCall
    from textgrad import Variable
    import httpx

    image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
    image_data = httpx.get(image_url).content

    os.environ['OPENAI_API_KEY'] = "fake_key"
    engine = DummyMultimodalEngine(is_multimodal=True)

    image_variable = Variable(image_data,
                              role_description="image to answer a question about", requires_grad=False)

    text = Variable("Hello", role_description="A variable")
    question_variable = Variable("What do you see in this image?", role_description="question", requires_grad=False)
    response = MultimodalLLMCall(engine=engine)([image_variable, question_variable])

    assert response.value == "Hello Text from Image"

    response = LLMCall(engine=engine)(text)

    assert response.value == "Hello Text"

    # LLMCall cannot handle image inputs
    with pytest.raises(AttributeError):
        response = LLMCall(engine=engine)([text, image_variable])

    # Variables cannot really hold ints; this only checks that the
    # input-type assertion fires.
    with pytest.raises(AssertionError):
        response = MultimodalLLMCall(engine=engine)([Variable(4, role_description="tst"),
                                                     Variable(5, role_description="tst")])

def test_multimodal_from_url():
    from textgrad import Variable
    import httpx

    image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
    image_data = httpx.get(image_url).content

    image_variable = Variable(image_path=image_url,
                              role_description="image to answer a question about", requires_grad=False)

    image_variable_2 = Variable(image_data,
                                role_description="image to answer a question about", requires_grad=False)

    assert image_variable_2.value == image_variable.value
1 change: 1 addition & 0 deletions textgrad/autograd/__init__.py
@@ -1,4 +1,5 @@
from .functional import sum, aggregate
from .llm_ops import LLMCall, FormattedLLMCall, LLMCall_with_in_context_examples
from .multimodal_ops import MultimodalLLMCall, OrderedFieldsMultimodalLLMCall
from .function import Module
from .string_based_ops import StringBasedFunction