Commit

docs(quickstart): update quickstart (#88)
micpst authored Aug 29, 2024
1 parent 139ab9b commit 9a1e9d8
Showing 12 changed files with 36 additions and 34 deletions.
2 changes: 1 addition & 1 deletion docs/concepts/similarity_indexes.md
@@ -11,4 +11,4 @@ A similarity index consists of two main components:

The concept of "similarity" is deliberately broad, as it varies depending on the store's implementation to best fit the use case. Db-ally not only supports custom store implementations but also includes various built-in implementations, from simple case-insensitive text matches to more complex embedding-based semantic similarities.

See the [Quickstart Part 2: Semantic Similarity](../quickstart/quickstart2.md) for an example of using a similarity index with a semantic similarity store.
See the [Quickstart Part 2: Semantic Similarity](../quickstart/semantic-similarity.md) for an example of using a similarity index with a semantic similarity store.
2 changes: 1 addition & 1 deletion docs/how-to/update_similarity_indexes.md
@@ -2,7 +2,7 @@

The Similarity Index is a feature provided by db-ally that takes user input and maps it to the closest matching value in the data source using a chosen similarity metric. This feature is handy when the user input does not exactly match the data source, such as when the user asks to "list all employees in the IT department," while the database categorizes this group as the "computer department." To learn more about Similarity Indexes, refer to the [Concept: Similarity Indexes](../concepts/similarity_indexes.md) page.

While Similarity Indexes can be used directly, they are usually used with [Views](../concepts/views.md), annotating arguments to filter methods. This technique lets db-ally automatically match user-provided arguments to the most similar value in the data source. You can see an example of using similarity indexes with views on the [Quickstart Part 2: Semantic Similarity](../quickstart/quickstart2.md) page.
While Similarity Indexes can be used directly, they are usually used with [Views](../concepts/views.md), annotating arguments to filter methods. This technique lets db-ally automatically match user-provided arguments to the most similar value in the data source. You can see an example of using similarity indexes with views on the [Quickstart Part 2: Semantic Similarity](../quickstart/semantic-similarity.md) page.

Similarity Indexes are designed to index all possible values (e.g., on disk or in a different data store). Consequently, when the data source undergoes changes, the Similarity Index must update to reflect these alterations. This guide will explain how to update Similarity Indexes in your code.

4 changes: 2 additions & 2 deletions docs/how-to/use_custom_similarity_fetcher.md
@@ -55,7 +55,7 @@ In this example, we used the FaissStore, which utilizes the `faiss` library for

## Using the Similarity Index

You can use the index with a custom fetcher [the same way](../quickstart/quickstart2.md) as you would with a built-in fetcher. The similarity index will map user input to the closest matching value from your data source, allowing you to deliver more precise responses to user queries. Remember to frequently update the similarity index with new values from your data source to maintain its relevance. You can accomplish this by calling the `update` method on the similarity index.
You can use the index with a custom fetcher [the same way](../quickstart/semantic-similarity.md) as you would with a built-in fetcher. The similarity index will map user input to the closest matching value from your data source, allowing you to deliver more precise responses to user queries. Remember to frequently update the similarity index with new values from your data source to maintain its relevance. You can accomplish this by calling the `update` method on the similarity index.

```python
await breeds_similarity.update()
@@ -72,4 +72,4 @@ print(await breeds_similarity.similar("bagle"))

This will return the most similar dog breed to "bagle" based on the data retrieved from the dog.ceo API - in this case, "beagle".

In general, instead of directly calling the similarity index, you would usually use it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/quickstart2.md).
In general, instead of directly calling the similarity index, you would usually use it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/semantic-similarity.md).
6 changes: 3 additions & 3 deletions docs/how-to/use_custom_similarity_store.md
@@ -56,11 +56,11 @@ country_similarity = SimilarityIndex(
)
```

In this example, we used the sample `DogBreedsFetcher` fetcher detailed in the [custom fetcher guide](./use_custom_similarity_fetcher.md) and the `PickleStore` to store the values in a Python pickle file. You can use a different fetcher depending on your needs, for example [the Sqlalchemy one described in the Quickstart guide](../quickstart/quickstart2.md)).
In this example, we used the sample `DogBreedsFetcher` fetcher detailed in the [custom fetcher guide](./use_custom_similarity_fetcher.md) and the `PickleStore` to store the values in a Python pickle file. You can use a different fetcher depending on your needs, for example [the Sqlalchemy one described in the Quickstart guide](../quickstart/semantic-similarity.md)).

## Using the Similarity Index

You can use an index with a custom store [the same way](../quickstart/quickstart2.md) you would use one with a built-in store. The similarity index will map user input to the closest matching value from your data source, enabling you to deliver more accurate responses. It's important to regularly update the similarity index with new values from your data source to keep it current. Do this by invoking the `update` method on the similarity index.
You can use an index with a custom store [the same way](../quickstart/semantic-similarity.md) you would use one with a built-in store. The similarity index will map user input to the closest matching value from your data source, enabling you to deliver more accurate responses. It's important to regularly update the similarity index with new values from your data source to keep it current. Do this by invoking the `update` method on the similarity index.

```python
await country_similarity.update()
@@ -77,4 +77,4 @@ print(await country_similarity.similar("bagle"))

This will return the closest matching dog breed to "bagle" - in this case, "beagle".

Typically, instead of directly invoking the similarity index, you would employ it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/quickstart2.md).
Typically, instead of directly invoking the similarity index, you would employ it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/semantic-similarity.md).
6 changes: 3 additions & 3 deletions docs/quickstart/index.md
@@ -97,7 +97,7 @@ class CandidateView(SqlAlchemyBaseView):
By setting up these filters, you enable the LLM to fetch candidates while optionally applying filters based on experience, country, and eligibility for a senior data scientist position.

!!! note
The `from_country` filter defined above supports only exact matches, which is not always ideal. Thankfully, db-ally comes with a solution for this problem - Similarity Indexes, which can be used to find the most similar value from the ones available. Refer to [Quickstart Part 2: Semantic Similarity](./quickstart2.md) for an example of using semantic similarity when filtering candidates by country.
The `from_country` filter defined above supports only exact matches, which is not always ideal. Thankfully, db-ally comes with a solution for this problem - Similarity Indexes, which can be used to find the most similar value from the ones available. Refer to [Quickstart Part 2: Semantic Similarity](./semantic-similarity.md) for an example of using semantic similarity when filtering candidates by country.

## OpenAI Access Configuration

@@ -170,8 +170,8 @@ Retrieved 1 candidates:

## Full Example

Access the full example here: [quickstart_code.py](quickstart_code.py)
Access the full example on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/intro.py){:target="_blank"}.

## Next Steps

Explore [Quickstart Part 2: Semantic Similarity](./quickstart2.md) to expand on the example and learn about using semantic similarity.
Explore [Quickstart Part 2: Semantic Similarity](./semantic-similarity.md) to expand on the example and learn about using semantic similarity.
@@ -1,6 +1,6 @@
# Quickstart: Multiple Views

This guide continues from [Semantic Similarity](./quickstart2.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code here: [quickstart2_code.py](quickstart2_code.py).
This guide continues from [Semantic Similarity](./semantic-similarity.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/semantic_similarity.py){:target="_blank"}.

The guide illustrates how to use multiple views to handle queries requiring different types of data. `CandidateView` and `JobView` are used as examples.

@@ -124,7 +124,7 @@ Julia Nowak - Adobe XD;Sketch;Figma
Anna Kowalska - AWS;Azure;Google Cloud
```

That wraps it up! You can find the full example code here: [quickstart3_code.py](quickstart3_code.py).
That wraps it up! You can find the full example code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/multiple-views.py){:target="_blank"}.

## Next Steps
Visit the [Tutorial](../tutorials.md) for a more comprehensive guide on how to use db-ally.
@@ -1,6 +1,6 @@
# Quickstart: Semantic Similarity

This guide is a continuation of the [Intro](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code here: [quickstart_code.py](quickstart_code.py).
This guide is a continuation of the [Intro](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/intro.py){:target="_blank"}.

This guide will demonstrate how to use semantic similarity to handle queries in which the filter values are similar to those in the database, without requiring an exact match. We will use filtering by country as an example.

@@ -146,8 +146,8 @@ Retrieved 1 candidates:

That's it! You can apply similar techniques to any other filter that takes a string value.

To see the full example, you can find the code here: [quickstart2_code.py](quickstart2_code.py).
To see the full example, you can find the code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/semantic_similarity.py){:target="_blank"}.

## Next Steps

Explore [Quickstart Part 3: Multiple Views](./quickstart3.md) to learn how to run queries with multiple views and display the results based on the view that was used to fetch the data.
Explore [Quickstart Part 3: Multiple Views](./multiple-views.md) to learn how to run queries with multiple views and display the results based on the view that was used to fetch the data.
2 changes: 1 addition & 1 deletion docs/reference/similarity/index.md
@@ -10,6 +10,6 @@ Explore [Similarity Stores](./similarity_store/index.md) and [Similarity Fetcher
* [How-To: Use Similarity Indexes with Data from Custom Sources](../../how-to/use_custom_similarity_fetcher.md)
* [How-To: Store Similarity Index in a Custom Store](../../how-to/use_custom_similarity_store.md)
* [How-To: Update Similarity Indexes](../../how-to/update_similarity_indexes.md)
* [Quickstart: Semantic Similarity](../../quickstart/quickstart2.md)
* [Quickstart: Semantic Similarity](../../quickstart/semantic-similarity.md)

::: dbally.similarity.SimilarityIndex
8 changes: 4 additions & 4 deletions docs/quickstart/quickstart_code.py → examples/intro.py
@@ -1,16 +1,16 @@
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring
import dbally
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code

import asyncio

import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base

from dbally import decorators, SqlAlchemyBaseView
import dbally
from dbally import SqlAlchemyBaseView, decorators
from dbally.audit.event_handlers.cli_event_handler import CLIEventHandler
from dbally.llms.litellm import LiteLLM


engine = create_engine("sqlite:///examples/recruiting/data/candidates.db")

Base = automap_base()
@@ -1,19 +1,20 @@
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring
import os
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code

import asyncio
from typing_extensions import Annotated
import os

import pandas as pd
import sqlalchemy
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
import pandas as pd
from typing_extensions import Annotated

import dbally
from dbally import decorators, SqlAlchemyBaseView, DataFrameBaseView, ExecutionResult
from dbally import DataFrameBaseView, ExecutionResult, SqlAlchemyBaseView, decorators
from dbally.audit import CLIEventHandler
from dbally.similarity import SimpleSqlAlchemyFetcher, FaissStore, SimilarityIndex
from dbally.embeddings.litellm import LiteLLMEmbeddingClient
from dbally.llms.litellm import LiteLLM
from dbally.similarity import FaissStore, SimilarityIndex, SimpleSqlAlchemyFetcher

engine = create_engine("sqlite:///examples/recruiting/data/candidates.db")

@@ -1,19 +1,20 @@
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring
import os
# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code

import asyncio
from typing_extensions import Annotated
import os

from dotenv import load_dotenv
import sqlalchemy
from dotenv import load_dotenv
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from typing_extensions import Annotated

import dbally
from dbally import decorators, SqlAlchemyBaseView
from dbally import SqlAlchemyBaseView, decorators
from dbally.audit.event_handlers.cli_event_handler import CLIEventHandler
from dbally.similarity import SimpleSqlAlchemyFetcher, FaissStore, SimilarityIndex
from dbally.embeddings.litellm import LiteLLMEmbeddingClient
from dbally.llms.litellm import LiteLLM
from dbally.similarity import FaissStore, SimilarityIndex, SimpleSqlAlchemyFetcher

load_dotenv()
engine = create_engine("sqlite:///examples/recruiting/data/candidates.db")
4 changes: 2 additions & 2 deletions mkdocs.yml
@@ -8,8 +8,8 @@ nav:
- db-ally docs: index.md
- Quickstart:
- quickstart/index.md
- quickstart/quickstart2.md
- quickstart/quickstart3.md
- quickstart/semantic-similarity.md
- quickstart/multiple-views.md
- Concepts:
- concepts/views.md
- concepts/structured_views.md