diff --git a/docs/concepts/similarity_indexes.md b/docs/concepts/similarity_indexes.md index 1c0ede7e..70966c61 100644 --- a/docs/concepts/similarity_indexes.md +++ b/docs/concepts/similarity_indexes.md @@ -11,4 +11,4 @@ A similarity index consists of two main components: The concept of "similarity" is deliberately broad, as it varies depending on the store's implementation to best fit the use case. Db-ally not only supports custom store implementations but also includes various built-in implementations, from simple case-insensitive text matches to more complex embedding-based semantic similarities. -See the [Quickstart Part 2: Semantic Similarity](../quickstart/quickstart2.md) for an example of using a similarity index with a semantic similarity store. +See the [Quickstart Part 2: Semantic Similarity](../quickstart/semantic-similarity.md) for an example of using a similarity index with a semantic similarity store. diff --git a/docs/how-to/update_similarity_indexes.md b/docs/how-to/update_similarity_indexes.md index 763bc1a7..75aaf322 100644 --- a/docs/how-to/update_similarity_indexes.md +++ b/docs/how-to/update_similarity_indexes.md @@ -2,7 +2,7 @@ The Similarity Index is a feature provided by db-ally that takes user input and maps it to the closest matching value in the data source using a chosen similarity metric. This feature is handy when the user input does not exactly match the data source, such as when the user asks to "list all employees in the IT department," while the database categorizes this group as the "computer department." To learn more about Similarity Indexes, refer to the [Concept: Similarity Indexes](../concepts/similarity_indexes.md) page. -While Similarity Indexes can be used directly, they are usually used with [Views](../concepts/views.md), annotating arguments to filter methods. This technique lets db-ally automatically match user-provided arguments to the most similar value in the data source. 
You can see an example of using similarity indexes with views on the [Quickstart Part 2: Semantic Similarity](../quickstart/quickstart2.md) page. +While Similarity Indexes can be used directly, they are usually used with [Views](../concepts/views.md), annotating arguments to filter methods. This technique lets db-ally automatically match user-provided arguments to the most similar value in the data source. You can see an example of using similarity indexes with views on the [Quickstart Part 2: Semantic Similarity](../quickstart/semantic-similarity.md) page. Similarity Indexes are designed to index all possible values (e.g., on disk or in a different data store). Consequently, when the data source undergoes changes, the Similarity Index must update to reflect these alterations. This guide will explain how to update Similarity Indexes in your code. diff --git a/docs/how-to/use_custom_similarity_fetcher.md b/docs/how-to/use_custom_similarity_fetcher.md index b79b6554..bcdbf0dd 100644 --- a/docs/how-to/use_custom_similarity_fetcher.md +++ b/docs/how-to/use_custom_similarity_fetcher.md @@ -55,7 +55,7 @@ In this example, we used the FaissStore, which utilizes the `faiss` library for ## Using the Similarity Index -You can use the index with a custom fetcher [the same way](../quickstart/quickstart2.md) as you would with a built-in fetcher. The similarity index will map user input to the closest matching value from your data source, allowing you to deliver more precise responses to user queries. Remember to frequently update the similarity index with new values from your data source to maintain its relevance. You can accomplish this by calling the `update` method on the similarity index. +You can use the index with a custom fetcher [the same way](../quickstart/semantic-similarity.md) as you would with a built-in fetcher. 
The similarity index will map user input to the closest matching value from your data source, allowing you to deliver more precise responses to user queries. Remember to frequently update the similarity index with new values from your data source to maintain its relevance. You can accomplish this by calling the `update` method on the similarity index. ```python await breeds_similarity.update() @@ -72,4 +72,4 @@ print(await breeds_similarity.similar("bagle")) This will return the most similar dog breed to "bagle" based on the data retrieved from the dog.ceo API - in this case, "beagle". -In general, instead of directly calling the similarity index, you would usually use it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/quickstart2.md). \ No newline at end of file +In general, instead of directly calling the similarity index, you would usually use it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/semantic-similarity.md). \ No newline at end of file diff --git a/docs/how-to/use_custom_similarity_store.md b/docs/how-to/use_custom_similarity_store.md index d98b0219..7af70cb1 100644 --- a/docs/how-to/use_custom_similarity_store.md +++ b/docs/how-to/use_custom_similarity_store.md @@ -56,11 +56,11 @@ country_similarity = SimilarityIndex( ) ``` -In this example, we used the sample `DogBreedsFetcher` fetcher detailed in the [custom fetcher guide](./use_custom_similarity_fetcher.md) and the `PickleStore` to store the values in a Python pickle file. You can use a different fetcher depending on your needs, for example [the Sqlalchemy one described in the Quickstart guide](../quickstart/quickstart2.md)). +In this example, we used the sample `DogBreedsFetcher` fetcher detailed in the [custom fetcher guide](./use_custom_similarity_fetcher.md) and the `PickleStore` to store the values in a Python pickle file. 
You can use a different fetcher depending on your needs, for example [the Sqlalchemy one described in the Quickstart guide](../quickstart/semantic-similarity.md). ## Using the Similarity Index -You can use an index with a custom store [the same way](../quickstart/quickstart2.md) you would use one with a built-in store. The similarity index will map user input to the closest matching value from your data source, enabling you to deliver more accurate responses. It's important to regularly update the similarity index with new values from your data source to keep it current. Do this by invoking the `update` method on the similarity index. +You can use an index with a custom store [the same way](../quickstart/semantic-similarity.md) you would use one with a built-in store. The similarity index will map user input to the closest matching value from your data source, enabling you to deliver more accurate responses. It's important to regularly update the similarity index with new values from your data source to keep it current. Do this by invoking the `update` method on the similarity index. ```python await country_similarity.update() @@ -77,4 +77,4 @@ print(await country_similarity.similar("bagle")) This will return the closest matching dog breed to "bagle" - in this case, "beagle". -Typically, instead of directly invoking the similarity index, you would employ it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/quickstart2.md). \ No newline at end of file +Typically, instead of directly invoking the similarity index, you would employ it to annotate arguments to views, as demonstrated in the [Quickstart guide](../quickstart/semantic-similarity.md). 
\ No newline at end of file diff --git a/docs/quickstart/index.md b/docs/quickstart/index.md index 764f351b..bd7bcb68 100644 --- a/docs/quickstart/index.md +++ b/docs/quickstart/index.md @@ -97,7 +97,7 @@ class CandidateView(SqlAlchemyBaseView): By setting up these filters, you enable the LLM to fetch candidates while optionally applying filters based on experience, country, and eligibility for a senior data scientist position. !!! note - The `from_country` filter defined above supports only exact matches, which is not always ideal. Thankfully, db-ally comes with a solution for this problem - Similarity Indexes, which can be used to find the most similar value from the ones available. Refer to [Quickstart Part 2: Semantic Similarity](./quickstart2.md) for an example of using semantic similarity when filtering candidates by country. + The `from_country` filter defined above supports only exact matches, which is not always ideal. Thankfully, db-ally comes with a solution for this problem - Similarity Indexes, which can be used to find the most similar value from the ones available. Refer to [Quickstart Part 2: Semantic Similarity](./semantic-similarity.md) for an example of using semantic similarity when filtering candidates by country. ## OpenAI Access Configuration @@ -170,8 +170,8 @@ Retrieved 1 candidates: ## Full Example -Access the full example here: [quickstart_code.py](quickstart_code.py) +Access the full example on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/intro.py){:target="_blank"}. ## Next Steps -Explore [Quickstart Part 2: Semantic Similarity](./quickstart2.md) to expand on the example and learn about using semantic similarity. \ No newline at end of file +Explore [Quickstart Part 2: Semantic Similarity](./semantic-similarity.md) to expand on the example and learn about using semantic similarity. 
\ No newline at end of file diff --git a/docs/quickstart/quickstart3.md b/docs/quickstart/multiple-views.md similarity index 91% rename from docs/quickstart/quickstart3.md rename to docs/quickstart/multiple-views.md index 8a2789d2..b1a860f0 100644 --- a/docs/quickstart/quickstart3.md +++ b/docs/quickstart/multiple-views.md @@ -1,6 +1,6 @@ # Quickstart: Multiple Views -This guide continues from [Semantic Similarity](./quickstart2.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code here: [quickstart2_code.py](quickstart2_code.py). +This guide continues from the [Semantic Similarity](./semantic-similarity.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 2 code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/semantic_similarity.py){:target="_blank"}. The guide illustrates how to use multiple views to handle queries requiring different types of data. `CandidateView` and `JobView` are used as examples. @@ -124,7 +124,7 @@ Julia Nowak - Adobe XD;Sketch;Figma Anna Kowalska - AWS;Azure;Google Cloud ``` -That wraps it up! You can find the full example code here: [quickstart3_code.py](quickstart3_code.py). +That wraps it up! You can find the full example code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/multiple_views.py){:target="_blank"}. ## Next Steps Visit the [Tutorial](../tutorials.md) for a more comprehensive guide on how to use db-ally. 
\ No newline at end of file diff --git a/docs/quickstart/quickstart2.md b/docs/quickstart/semantic-similarity.md similarity index 93% rename from docs/quickstart/quickstart2.md rename to docs/quickstart/semantic-similarity.md index d3bf968a..099c818f 100644 --- a/docs/quickstart/quickstart2.md +++ b/docs/quickstart/semantic-similarity.md @@ -1,6 +1,6 @@ # Quickstart: Semantic Similarity -This guide is a continuation of the [Intro](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code here: [quickstart_code.py](quickstart_code.py). +This guide is a continuation of the [Intro](./index.md) guide. It assumes that you have already set up the views and the collection. If not, please refer to the complete Part 1 code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/intro.py){:target="_blank"}. This guide will demonstrate how to use semantic similarity to handle queries in which the filter values are similar to those in the database, without requiring an exact match. We will use filtering by country as an example. @@ -146,8 +146,8 @@ Retrieved 1 candidates: That's it! You can apply similar techniques to any other filter that takes a string value. -To see the full example, you can find the code here: [quickstart2_code.py](quickstart2_code.py). +To see the full example, you can find the code on [GitHub](https://github.com/deepsense-ai/db-ally/blob/main/examples/semantic_similarity.py){:target="_blank"}. ## Next Steps -Explore [Quickstart Part 3: Multiple Views](./quickstart3.md) to learn how to run queries with multiple views and display the results based on the view that was used to fetch the data. +Explore [Quickstart Part 3: Multiple Views](./multiple-views.md) to learn how to run queries with multiple views and display the results based on the view that was used to fetch the data. 
diff --git a/docs/reference/similarity/index.md b/docs/reference/similarity/index.md index aca1f119..3f0ae26f 100644 --- a/docs/reference/similarity/index.md +++ b/docs/reference/similarity/index.md @@ -10,6 +10,6 @@ Explore [Similarity Stores](./similarity_store/index.md) and [Similarity Fetcher * [How-To: Use Similarity Indexes with Data from Custom Sources](../../how-to/use_custom_similarity_fetcher.md) * [How-To: Store Similarity Index in a Custom Store](../../how-to/use_custom_similarity_store.md) * [How-To: Update Similarity Indexes](../../how-to/update_similarity_indexes.md) - * [Quickstart: Semantic Similarity](../../quickstart/quickstart2.md) + * [Quickstart: Semantic Similarity](../../quickstart/semantic-similarity.md) ::: dbally.similarity.SimilarityIndex diff --git a/docs/quickstart/quickstart_code.py b/examples/intro.py similarity index 96% rename from docs/quickstart/quickstart_code.py rename to examples/intro.py index ef73cad0..7bce4e46 100644 --- a/docs/quickstart/quickstart_code.py +++ b/examples/intro.py @@ -1,16 +1,16 @@ -# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring -import dbally +# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code + import asyncio import sqlalchemy from sqlalchemy import create_engine from sqlalchemy.ext.automap import automap_base -from dbally import decorators, SqlAlchemyBaseView +import dbally +from dbally import SqlAlchemyBaseView, decorators from dbally.audit.event_handlers.cli_event_handler import CLIEventHandler from dbally.llms.litellm import LiteLLM - engine = create_engine("sqlite:///examples/recruiting/data/candidates.db") Base = automap_base() diff --git a/docs/quickstart/quickstart3_code.py b/examples/multiple_views.py similarity index 95% rename from docs/quickstart/quickstart3_code.py rename to examples/multiple_views.py index 0ad8b1a7..0644de73 100644 --- a/docs/quickstart/quickstart3_code.py +++ b/examples/multiple_views.py 
@@ -1,19 +1,20 @@ -# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring -import os +# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code + import asyncio -from typing_extensions import Annotated +import os +import pandas as pd import sqlalchemy from sqlalchemy import create_engine from sqlalchemy.ext.automap import automap_base -import pandas as pd +from typing_extensions import Annotated import dbally -from dbally import decorators, SqlAlchemyBaseView, DataFrameBaseView, ExecutionResult +from dbally import DataFrameBaseView, ExecutionResult, SqlAlchemyBaseView, decorators from dbally.audit import CLIEventHandler -from dbally.similarity import SimpleSqlAlchemyFetcher, FaissStore, SimilarityIndex from dbally.embeddings.litellm import LiteLLMEmbeddingClient from dbally.llms.litellm import LiteLLM +from dbally.similarity import FaissStore, SimilarityIndex, SimpleSqlAlchemyFetcher engine = create_engine("sqlite:///examples/recruiting/data/candidates.db") diff --git a/docs/quickstart/quickstart2_code.py b/examples/semantic_similarity.py similarity index 94% rename from docs/quickstart/quickstart2_code.py rename to examples/semantic_similarity.py index d1504cd8..b4a03b66 100644 --- a/docs/quickstart/quickstart2_code.py +++ b/examples/semantic_similarity.py @@ -1,19 +1,20 @@ -# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring -import os +# pylint: disable=missing-return-doc, missing-param-doc, missing-function-docstring, duplicate-code + import asyncio -from typing_extensions import Annotated +import os -from dotenv import load_dotenv import sqlalchemy +from dotenv import load_dotenv from sqlalchemy import create_engine from sqlalchemy.ext.automap import automap_base +from typing_extensions import Annotated import dbally -from dbally import decorators, SqlAlchemyBaseView +from dbally import SqlAlchemyBaseView, decorators from 
dbally.audit.event_handlers.cli_event_handler import CLIEventHandler -from dbally.similarity import SimpleSqlAlchemyFetcher, FaissStore, SimilarityIndex from dbally.embeddings.litellm import LiteLLMEmbeddingClient from dbally.llms.litellm import LiteLLM +from dbally.similarity import FaissStore, SimilarityIndex, SimpleSqlAlchemyFetcher load_dotenv() engine = create_engine("sqlite:///examples/recruiting/data/candidates.db") diff --git a/mkdocs.yml b/mkdocs.yml index 096954a7..99eccb2d 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -8,8 +8,8 @@ nav: - db-ally docs: index.md - Quickstart: - quickstart/index.md - - quickstart/quickstart2.md - - quickstart/quickstart3.md + - quickstart/semantic-similarity.md + - quickstart/multiple-views.md - Concepts: - concepts/views.md - concepts/structured_views.md