LLM-409 - Removed embeddings arithmetics from documentation. #17

Merged
merged 1 commit into from
Aug 31, 2023
1 change: 1 addition & 0 deletions .gitignore
@@ -5,6 +5,7 @@ build/
!**/src/test/**/build/

### IntelliJ IDEA ###
.idea/workspace.xml
.idea/modules.xml
.idea/jarRepositories.xml
.idea/compiler.xml
149 changes: 0 additions & 149 deletions .idea/workspace.xml

This file was deleted.

79 changes: 0 additions & 79 deletions docs-site/docs/03_components/02_embeddings_space.md
@@ -96,85 +96,6 @@ List<Embedding> closestNeighbors = embeddingsSpace.mostSimilarEmbeddings(embeddi
// closestNeighbors will contain the embeddings for "Hello, eLLMental!" and "Hello, world!"
```

## `calculateRelationshipVector`

Computes a relationship vector for the provided text pairs (to be used with the `translateEmbedding` method).

- **Parameters**:
  - `textPairs`: Array of text pairs.

For instance, with the following list of text pairs:

| Text 1 | Text 2 |
|--------|--------|
| Man | Woman |
| Boy | Girl |
| King | Queen |
| Prince | Princess |
| Father | Mother |

The relationship vector for this group represents a translation in the embeddings space: given a word like those in the left column, it points to the location where the corresponding right-column word would likely be found. See the documentation for [`translateEmbedding`](#translateEmbedding) for more details.

The `RelationshipVector` class is defined as follows:

```java
public class RelationshipVector {
public final String label;
public final float[] vector;
}
```

And it can be calculated like this:

```java
String[][] textPairs = {{"Man", "Woman"}, {"Boy", "Girl"}, {"King", "Queen"}, {"Prince", "Princess"}, {"Father", "Mother"}};
RelationshipVector relationshipVector = embeddingsSpace.calculateRelationshipVector(textPairs);
```
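
The documentation does not say how the relationship vector is computed. One common approach, assumed here purely for illustration, is to average the element-wise differences between the embeddings of each pair. The class `RelationshipVectorSketch` and the method `meanDifference` are hypothetical names and are not part of the eLLMental API; the tiny two-dimensional vectors stand in for real embeddings.

```java
import java.util.Arrays;

// Hypothetical sketch, not the eLLMental implementation.
public class RelationshipVectorSketch {

    /**
     * Averages (target - source) over all pairs.
     * Each entry of embeddingPairs is {sourceEmbedding, targetEmbedding}.
     */
    public static float[] meanDifference(float[][][] embeddingPairs) {
        if (embeddingPairs.length == 0) {
            throw new IllegalArgumentException("At least one pair is required");
        }
        int dims = embeddingPairs[0][0].length;
        float[] result = new float[dims];
        for (float[][] pair : embeddingPairs) {
            float[] source = pair[0];
            float[] target = pair[1];
            if (source.length != dims || target.length != dims) {
                throw new IllegalArgumentException("All embeddings must have the same length");
            }
            // Accumulate the difference for this pair
            for (int i = 0; i < dims; i++) {
                result[i] += target[i] - source[i];
            }
        }
        // Divide by the number of pairs to get the mean
        for (int i = 0; i < dims; i++) {
            result[i] /= embeddingPairs.length;
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed toy embeddings: both pairs differ only in the second dimension
        float[][][] pairs = {
            {{1.0f, 0.0f}, {1.0f, 1.0f}},
            {{2.0f, 0.5f}, {2.0f, 1.5f}}
        };
        System.out.println(Arrays.toString(meanDifference(pairs))); // prints [0.0, 1.0]
    }
}
```

Averaging over several pairs is meant to cancel out what is specific to each pair and keep only the direction they share.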

## `storeNamedRelationshipVector`

Stores a relationship vector in the embeddings store and assigns it a label for later use.

- **Parameters**:
  - `label`: The label to assign to the relationship vector.
  - `relationshipVector`: The relationship vector to store.

```java
// First we calculate a relationship vector
RelationshipVector relationshipVector = embeddingsSpace.calculateRelationshipVector(textPairs);

// Then we store it in the embeddings store for future use
embeddingsSpace.storeNamedRelationshipVector("feminize", relationshipVector);
```

## `translateEmbedding`

Shifts the embedding of a reference text by a relationship vector, giving the approximate location of a text that stands in that relationship to the reference.

- **Parameters**:
  - `referenceText`: The reference text whose embedding is shifted.
  - `vector`: The relationship vector that determines the translation.

This is useful if you want to search for embeddings that are similar to a given one, but in a different context. For instance, let's say we have the word "Bull" and a relationship vector calculated with the `calculateRelationshipVector` method as follows:

```java
String[][] textPairs = {{"Man", "Woman"}, {"Boy", "Girl"}, {"King", "Queen"}, {"Prince", "Princess"}, {"Father", "Mother"}};
RelationshipVector relationshipVector = embeddingsSpace.calculateRelationshipVector(textPairs);
```

We can use the relationship vector to shift the embedding of "Bull" toward the location where "Cow" would likely be found. Notice that embeddings cannot be reversed, so we can't really know whether the resulting embedding represents a cow, but it gives us a good approximation that can be used to refine search results later.

```java
// This will create an estimated embedding of the word "Cow"
Embedding likelyACowEmbedding = embeddingsSpace.translateEmbedding("Bull", relationshipVector);

// We use it as any other embedding to find stored texts that are similar to "Cow"
List<Embedding> similarToCowEmbeddings = embeddingsSpace.mostSimilarEmbeddings(likelyACowEmbedding, 5);
```
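
As a rough illustration of the idea, the translation can be thought of as adding the relationship vector to the reference embedding and then ranking stored embeddings by similarity to the result. The sketch below assumes plain vector addition and cosine similarity; `TranslateEmbeddingSketch`, `translate`, and `cosineSimilarity` are hypothetical names, and the two-dimensional vectors are made-up values, not real embeddings or eLLMental internals.

```java
// Hypothetical sketch, not the eLLMental implementation.
public class TranslateEmbeddingSketch {

    /** Shifts the reference embedding by adding the relationship vector element-wise. */
    public static float[] translate(float[] reference, float[] relationship) {
        if (reference.length != relationship.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        float[] shifted = new float[reference.length];
        for (int i = 0; i < reference.length; i++) {
            shifted[i] = reference[i] + relationship[i];
        }
        return shifted;
    }

    /** Cosine similarity between two vectors of equal length. */
    public static double cosineSimilarity(float[] a, float[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] bull = {1.0f, 0.0f};          // assumed toy embedding for "Bull"
        float[] relationship = {0.0f, 1.0f};  // assumed toy relationship vector
        float[] cow = {0.9f, 1.1f};           // assumed toy stored embedding for "Cow"

        float[] likelyACow = translate(bull, relationship);

        // The shifted embedding should be closer to "Cow" than the original "Bull" embedding
        System.out.println(cosineSimilarity(likelyACow, cow) > cosineSimilarity(bull, cow)); // prints true
    }
}
```

The shifted vector is only an estimate, which is why it is used as a query for a similarity search rather than treated as the embedding of a specific word.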

## `get`

Retrieves an embedding from the embeddings store using its ID.