Add Google AI integration (#113)

silvanocerza authored Jan 10, 2024
1 parent 4f017ad commit 79481a4

Showing 2 changed files with 215 additions and 0 deletions.

215 changes: 215 additions & 0 deletions integrations/google-ai.md
@@ -0,0 +1,215 @@
---
layout: integration
name: Google AI
description: Use Google AI Models with Haystack
authors:
- name: deepset
socials:
github: deepset-ai
twitter: deepset_ai
linkedin: deepset-ai
pypi: https://pypi.org/project/google-ai-haystack/
repo: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/google_ai
type: Model Provider
report_issue: https://github.com/deepset-ai/haystack-core-integrations/issues
logo: /logos/googleai.png
version: Haystack 2.0
toc: true
---

### Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Usage](#usage)
- [Multimodality with `gemini-pro-vision`](#multimodality-with-gemini-pro-vision)
- [Function calling](#function-calling)
- [Code generation](#code-generation)

## Overview

[Google AI](https://ai.google.dev/) is a machine learning (ML) platform that lets you train and deploy ML models and AI applications, and customize large language models (LLMs) for use in your AI-powered applications. This integration enables the use of Google generative models via the MakerSuite REST API.

Haystack supports all the available [multimodal Gemini models](https://ai.google.dev/models/gemini) for tasks such as **text generation**, **function calling**, **visual question answering**, **code generation**, and **image captioning**.

## Installation

Install the Google AI integration:

```bash
pip install google-ai-haystack
```

## Usage

Once installed, you will have access to various Haystack Generators:

- [`GoogleAIGeminiGenerator`](https://docs.haystack.deepset.ai/v2.0/docs/googleaigeminigenerator): Use this component with the Gemini models '**gemini-pro**', '**gemini-pro-vision**', and '**gemini-ultra**' for text generation and multimodal prompts.
- [`GoogleAIGeminiChatGenerator`](https://docs.haystack.deepset.ai/v2.0/docs/googleaigeminichatgenerator): Use this component with the Gemini models '**gemini-pro**', '**gemini-pro-vision**', and '**gemini-ultra**' for text generation, multimodal prompts, and function calling in a chat completion setting.

To use Google Gemini models, you need an API key. You can either pass it as an init argument or set the `GOOGLE_API_KEY` environment variable. If neither is set, you won't be able to use the generators.

To get an API key, visit [Google MakerSuite](https://makersuite.google.com).
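
As a minimal sketch of the two options described above (this assumes you already created a key in MakerSuite):

```python
import os

from google_ai_haystack.generators.gemini import GoogleAIGeminiGenerator

# Option 1: rely on the GOOGLE_API_KEY environment variable;
# the generator picks it up when no key is passed explicitly
gemini_generator = GoogleAIGeminiGenerator(model="gemini-pro")

# Option 2: read (or paste) the key and pass it as an init argument
api_key = os.environ["GOOGLE_API_KEY"]
gemini_generator = GoogleAIGeminiGenerator(model="gemini-pro", api_key=api_key)
```

The snippets below assume that `api_key` holds your key, as in the second option.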

**Text Generation with `gemini-pro`**

To use a Gemini model for text generation, initialize a `GoogleAIGeminiGenerator` with `"gemini-pro"` and your `api_key`:

```python
from google_ai_haystack.generators.gemini import GoogleAIGeminiGenerator

gemini_generator = GoogleAIGeminiGenerator(model="gemini-pro", api_key=api_key)
result = gemini_generator.run(parts=["What is assemblage in art?"])
print(result["answers"][0])
```

Output:

```shell
Assemblage in art refers to the creation of a three-dimensional artwork by combining various found objects...
```

### Multimodality with `gemini-pro-vision`

To use the `gemini-pro-vision` model for visual question answering, initialize a `GoogleAIGeminiGenerator` with `"gemini-pro-vision"` and your `api_key`. Then, run it with the images as well as the prompt:

```python
import requests
from haystack.dataclasses.byte_stream import ByteStream

from google_ai_haystack.generators.gemini import GoogleAIGeminiGenerator

BASE_URL = (
    "https://raw.githubusercontent.com/deepset-ai/haystack-core-integrations"
    "/main/integrations/google_ai/example_assets"
)

URLS = [
    f"{BASE_URL}/robot1.jpg",
    f"{BASE_URL}/robot2.jpg",
    f"{BASE_URL}/robot3.jpg",
    f"{BASE_URL}/robot4.jpg",
]
images = [
    ByteStream(data=requests.get(url).content, mime_type="image/jpeg")
    for url in URLS
]

gemini_generator = GoogleAIGeminiGenerator(model="gemini-pro-vision", api_key=api_key)
result = gemini_generator.run(parts=["What can you tell me about these robots?", *images])
for answer in result["answers"]:
    print(answer)
```

Output:

```shell
The first image is of C-3PO and R2-D2 from the Star Wars franchise...
The second image is of Maria from the 1927 film Metropolis...
The third image is of Gort from the 1951 film The Day the Earth Stood Still...
The fourth image is of Marvin from the 1977 film The Hitchhiker's Guide to the Galaxy...
```
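
Images don't have to be fetched over HTTP: a `ByteStream` can be built from a local file in the same way. A minimal sketch, assuming an image called `robot1.jpg` sits next to your script:

```python
from pathlib import Path

from haystack.dataclasses.byte_stream import ByteStream

# Hypothetical local file; adjust the path to an image on your machine
image = ByteStream(data=Path("robot1.jpg").read_bytes(), mime_type="image/jpeg")
```
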
### Function calling

When chatting with Gemini, we can also use function calls:
```python
from google.ai.generativelanguage import FunctionDeclaration, Tool
from haystack.dataclasses import ChatMessage

from google_ai_haystack.generators.chat.gemini import GoogleAIGeminiChatGenerator


# Define a function that always returns some nice weather
def get_current_weather(location: str, unit: str = "celsius"):
    return {"weather": "sunny", "temperature": 21.8, "unit": unit}


# Declare the function's arguments so Gemini knows how it should be called
get_current_weather_func = FunctionDeclaration(
    name="get_current_weather",
    description="Get the current weather in a given location",
    parameters={
        "type_": "OBJECT",
        "properties": {
            "location": {"type_": "STRING", "description": "The city and state, e.g. San Francisco, CA"},
            "unit": {
                "type_": "STRING",
                "enum": [
                    "celsius",
                    "fahrenheit",
                ],
            },
        },
        "required": ["location"],
    },
)

tool = Tool(function_declarations=[get_current_weather_func])
gemini_chat = GoogleAIGeminiChatGenerator(
    model="gemini-pro", api_key=api_key, tools=[tool]
)

messages = [
    ChatMessage.from_user(content="What is the temperature in celsius in Berlin?")
]
res = gemini_chat.run(messages=messages)

# The first reply contains the function-call arguments as a dict,
# so we can call our Python function with them directly
weather = get_current_weather(**res["replies"][0].content)

# Send the function result back to the model to get the final answer
messages += res["replies"] + [
    ChatMessage.from_function(content=weather, name="get_current_weather")
]
res = gemini_chat.run(messages=messages)
print(res["replies"][0].content)
```

Output:

```shell
In Berlin, the weather is sunny with a temperature of 21.8 degrees Celsius.
```
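
The same chat generator also works without any tools for plain chat. A minimal sketch reusing the calls shown above:

```python
from haystack.dataclasses import ChatMessage

from google_ai_haystack.generators.chat.gemini import GoogleAIGeminiChatGenerator

gemini_chat = GoogleAIGeminiChatGenerator(model="gemini-pro", api_key=api_key)

messages = [ChatMessage.from_user(content="Tell me a joke about robots.")]
res = gemini_chat.run(messages=messages)
print(res["replies"][0].content)
```
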
### Code generation

Gemini can also easily generate code. Here's an example:

```python
from google_ai_haystack.generators.gemini import GoogleAIGeminiGenerator

gemini_generator = GoogleAIGeminiGenerator(model="gemini-pro", api_key=api_key)
result = gemini_generator.run(parts=["Write code for calculating Fibonacci numbers in JavaScript"])
print(result["answers"][0])
```

Output:

```javascript
// Recursive approach
function fibonacciRecursive(n) {
  if (n <= 1) {
    return n;
  } else {
    return fibonacciRecursive(n - 1) + fibonacciRecursive(n - 2);
  }
}

// Iterative approach
function fibonacciIterative(n) {
  if (n <= 1) {
    return n;
  }

  let fibSequence = [0, 1];
  while (fibSequence.length < n + 1) {
    let nextNumber =
      fibSequence[fibSequence.length - 1] + fibSequence[fibSequence.length - 2];
    fibSequence.push(nextNumber);
  }

  return fibSequence[n];
}

// Usage
console.log(fibonacciRecursive(7)); // Output: 13
console.log(fibonacciIterative(7)); // Output: 13
```
Binary file added logos/googleai.png
