docs(views): add documentation of freeform views #21

Merged
merged 3 commits into from
Apr 25, 2024

ludwiktrammer
Contributor

No description provided.

Trivy scanning results.

@ludwiktrammer ludwiktrammer marked this pull request as draft April 23, 2024 12:01
github-actions bot commented Apr 23, 2024

Code Coverage Summary

Filename                                                         Stmts    Miss  Cover    Missing
-------------------------------------------------------------  -------  ------  -------  -------------------------------------------------
dbally/_main.py                                                     12       0  100.00%
dbally/collection.py                                                92       2  97.83%   216, 244
dbally/assistants/base.py                                           23       0  100.00%
dbally/assistants/openai.py                                         59       2  96.61%   59-76
dbally/audit/event_span.py                                           8       2  75.00%   12, 22
dbally/audit/event_tracker.py                                       34      11  67.65%   35, 48, 59, 69, 83-92
dbally/audit/event_handlers/base.py                                 15       0  100.00%
dbally/audit/event_handlers/cli_event_handler.py                    42      27  35.71%   9-12, 38-39, 42-46, 56-58, 70-78, 90-93, 104-107
dbally/audit/event_handlers/langsmith_event_handler.py              25      21  16.00%   6-92
dbally/data_models/audit.py                                         22       0  100.00%
dbally/data_models/execution_result.py                              14       0  100.00%
dbally/data_models/llm_options.py                                   13       0  100.00%
dbally/data_models/prompts/common_validation_utils.py               15       0  100.00%
dbally/data_models/prompts/iql_prompt_template.py                   13       1  92.31%   39
dbally/data_models/prompts/nl_responder_prompt_template.py           8       0  100.00%
dbally/data_models/prompts/prompt_template.py                       28       2  92.86%   27, 35
dbally/data_models/prompts/query_explainer_prompt_template.py        8       0  100.00%
dbally/data_models/prompts/view_selector_prompt_template.py         12       2  83.33%   33-34
dbally/embedding_client/base.py                                      5       0  100.00%
dbally/embedding_client/openai.py                                   19      14  26.32%   20-30, 42-52
dbally/iql/_exceptions.py                                           24       1  95.83%   33
dbally/iql/_processor.py                                            78       7  91.03%   15, 43, 60, 66, 72, 81, 87
dbally/iql/_query.py                                                12       1  91.67%   7
dbally/iql/_type_validators.py                                      39       2  94.87%   24, 28
dbally/iql/syntax.py                                                36       9  75.00%   6-9, 27, 36, 60, 63-66
dbally/iql_generator/iql_generator.py                               32       0  100.00%
dbally/llm_client/base.py                                           21      10  52.38%   24-25, 50-70
dbally/llm_client/openai_client.py                                  21      21  0.00%    1-62
dbally/nl_responder/nl_responder.py                                 31       5  83.87%   75, 82-90
dbally/nl_responder/token_counters.py                               23      10  56.52%   24-25, 53-63
dbally/prompts/prompt_builder.py                                    20       3  85.00%   6, 26-27
dbally/similarity/chroma_store.py                                   35       0  100.00%
dbally/similarity/detector.py                                       73       6  91.78%   21, 200-204
dbally/similarity/faiss_store.py                                    36      33  8.33%    5-94
dbally/similarity/fetcher.py                                         5       0  100.00%
dbally/similarity/index.py                                          18       0  100.00%
dbally/similarity/sqlalchemy_base.py                                40      21  47.50%   17, 35-37, 48-50, 59, 68-71, 81-87, 105-108
dbally/similarity/store.py                                           7       0  100.00%
dbally/utils/errors.py                                               2       0  100.00%
dbally/view_selection/base.py                                        6       0  100.00%
dbally/view_selection/llm_view_selector.py                          20       0  100.00%
dbally/view_selection/random_view_selector.py                        9       9  0.00%    1-27
dbally/views/base.py                                                 7       0  100.00%
dbally/views/decorators.py                                           6       0  100.00%
dbally/views/exposed_functions.py                                   33       1  96.97%   24
dbally/views/methods_base.py                                        34       2  94.12%   75, 83
dbally/views/pandas_base.py                                         33       1  96.97%   64
dbally/views/sqlalchemy_base.py                                     37       7  81.08%   48, 63-65, 83-87
dbally/views/structured.py                                          34       1  97.06%   30
dbally/views/freeform/text2sql/_autodiscovery.py                   113      18  84.07%   107-108, 272, 275, 288-300, 304, 319-321, 333-334
dbally/views/freeform/text2sql/_config.py                           20       0  100.00%
dbally/views/freeform/text2sql/_errors.py                            6       3  50.00%   10-12
dbally/views/freeform/text2sql/_view.py                             48       6  87.50%   71, 75-78, 81
dbally_cli/main.py                                                   5       5  0.00%    1-11
dbally_cli/similarity.py                                            21       0  100.00%
TOTAL                                                             1452     266  81.68%

Diff against main

Filename      Stmts    Miss  Cover
----------  -------  ------  --------
TOTAL             0       0  +100.00%

Results for commit: 2660222

Minimum allowed coverage is 60%

♻️ This comment has been updated with latest results

# Concept: Freeform Views

Freeform views are a type of [view](views.md) that provides a way for developers using db-ally to define what they need from the LLM without requiring a fixed response structure. This flexibility is beneficial when the data structure is unknown beforehand or when potential queries are too diverse to be covered by a structured view. Though freeform views offer more flexibility than structured views, they are less predictable, efficient, and secure, and may be more challenging to integrate with other systems. For these reasons, we recommend using [structured views](./structured_views.md) when possible.

Contributor

Maybe add a sentence or two about the strategy of:

  1. Starting with FreeFormView
  2. Collecting statistics about most common questions and failure cases
  3. Incorporating StructuredViews to overcome them
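
A minimal sketch of step 2, assuming a hypothetical in-memory `query_log` of (question, succeeded) pairs. It does not use db-ally's own event tracking, and the log format is invented purely for illustration:

```python
from collections import Counter

# Hypothetical log of questions sent to a freeform view,
# paired with whether the answer turned out to be usable.
query_log = [
    ("candidates from France", True),
    ("candidates from France", True),
    ("senior data scientists", False),
    ("candidates from France", True),
    ("senior data scientists", False),
    ("average salary by country", True),
]

# How often each question was asked, and how often each one failed.
question_counts = Counter(question for question, _ in query_log)
failure_counts = Counter(question for question, ok in query_log if not ok)

most_common_question = question_counts.most_common(1)[0]
most_common_failure = failure_counts.most_common(1)[0]

print("Most common question:", most_common_question)
print("Most common failure:", most_common_failure)
```

Questions that are both frequent and failure-prone would be the natural first candidates for a dedicated structured view.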


Unlike structured views, which define a response format and a set of operations the LLM may use in response to natural language queries, freeform views only have one task - to respond directly to natural language queries with data from the datasource. They accomplish this by implementing the [`ask`][dbally.views.base.BaseView] method. This method takes a natural language query as input and returns a response. The method also has access to the LLM model (via the `llm_client` attribute), which is typically used to retrieve the correct data from the source (for example, by generating a source-specific query string). To learn more about implementing freeform views, refer to the [How to: Custom Freeform Views](../how-to/custom_freeform_views.md) guide.

Contributor

"freeform views only have one task" -> Isn't the task of StructuredView the same?

I mean, we can also run a freeform view in dry mode and directly obtain the SQL or API call.

The main difference is the IQL generation done by the StructuredView, so I'd emphasise that.
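
For illustration, a freeform view's `ask` method, including the dry mode mentioned above, could be sketched roughly as follows. This is not db-ally's actual interface: `SketchFreeformView`, `FakeLLMClient`, its `generate` method, and the `dry_run` parameter are hypothetical names, and the LLM is replaced by a canned SQL string.

```python
import sqlite3
from typing import Any, Dict


class FakeLLMClient:
    """Hypothetical stand-in for an LLM client; always returns a canned query."""

    def generate(self, prompt: str) -> str:
        return "SELECT name FROM candidates WHERE country = 'France'"


class SketchFreeformView:
    """Hypothetical freeform view: no filters, no IQL, just text in and data out."""

    def __init__(self, connection: sqlite3.Connection, llm_client: FakeLLMClient):
        self.connection = connection
        self.llm_client = llm_client

    def ask(self, query: str, dry_run: bool = False) -> Dict[str, Any]:
        # The LLM writes the source-specific query string directly.
        sql = self.llm_client.generate(f"Write a SQL query answering: {query}")
        if dry_run:
            # Dry mode: return the generated SQL without touching the data source.
            return {"sql": sql, "rows": []}
        rows = self.connection.execute(sql).fetchall()
        return {"sql": sql, "rows": rows}


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE candidates (name TEXT, country TEXT)")
connection.executemany(
    "INSERT INTO candidates VALUES (?, ?)",
    [("Alice", "France"), ("Bob", "Poland")],
)

view = SketchFreeformView(connection, FakeLLMClient())
result = view.ask("Find me French candidates")
dry_result = view.ask("Find me French candidates", dry_run=True)
print(result["rows"])
print(dry_result["sql"])
```

Running LLM-written SQL directly against the database, as this sketch does, is the reason the documentation describes freeform views as less predictable and less secure than structured ones.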


For instance, an LLM might generate an IQL query like this when asked "Find me French candidates suitable for a senior data scientist position":

```
from_country('France') AND senior_data_scientist_position()
```

The capabilities made available to the AI model via IQL differ between projects. Developers control these by defining [Views](views.md). db-ally automatically exposes special methods defined in views, known as "filters", via IQL. For instance, the expression above suggests that the specific project contains a view that includes the `from_country` and `senior_data_scientist_position` methods (and possibly others that the LLM did not choose to use for this particular question). Additionally, the LLM can use Boolean operators (`and`,`or`, `not`) to combine individual filters into more complex expressions.
The capabilities made available to the AI model via IQL differ between projects. Developers control these by defining special [Views](structured_views.md). db-ally automatically exposes special methods defined in structured views, known as "filters", via IQL. For instance, the expression above suggests that the specific project contains a view that includes the `from_country` and `senior_data_scientist_position` methods (and possibly others that the LLM did not choose to use for this particular question). Additionally, the LLM can use Boolean operators (`and`,`or`, `not`) to combine individual filters into more complex expressions.

Contributor

The word "special" somehow implied "hard" when I was reading it; maybe just use "structured" to familiarize users with this term.

Contributor

I know it will create a repetition, but sometimes repetitions are not that bad.
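
As a rough illustration of the paragraph under discussion, filters combined with Boolean operators can be modelled as composable predicates. The sketch below does not use db-ally or parse IQL; `CandidateFilters`, `and_`, `or_`, `not_`, and the seniority rule (five or more years of experience) are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Candidate:
    name: str
    country: str
    position: str
    years_of_experience: int


Predicate = Callable[[Candidate], bool]


class CandidateFilters:
    """Hypothetical stand-in for the filter methods of a structured view."""

    def from_country(self, country: str) -> Predicate:
        return lambda c: c.country == country

    def senior_data_scientist_position(self) -> Predicate:
        # Assumed seniority rule, chosen only for this example.
        return lambda c: c.position == "Data Scientist" and c.years_of_experience >= 5


# Boolean operators that combine individual filters into larger expressions.
def and_(left: Predicate, right: Predicate) -> Predicate:
    return lambda c: left(c) and right(c)


def or_(left: Predicate, right: Predicate) -> Predicate:
    return lambda c: left(c) or right(c)


def not_(inner: Predicate) -> Predicate:
    return lambda c: not inner(c)


candidates = [
    Candidate("Alice", "France", "Data Scientist", 7),
    Candidate("Bruno", "France", "Data Scientist", 2),
    Candidate("Celina", "Poland", "Data Scientist", 9),
]

filters = CandidateFilters()

# Counterpart of: from_country('France') AND senior_data_scientist_position()
predicate = and_(filters.from_country("France"), filters.senior_data_scientist_position())
matches = [c.name for c in candidates if predicate(c)]

# Counterpart of: NOT from_country('France')
outside_france = [c.name for c in candidates if not_(filters.from_country("France"))(c)]

print(matches)
print(outside_france)
```

The point of the abstraction is that the LLM only chooses filter names and arguments; how each filter is turned into a query against the data source stays under the developer's control.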

@@ -0,0 +1,39 @@
# Concept: Structured Views

Structured views are a type of [view](../concepts/views.md), which provide a way for developers using db-ally to define what they need from the LLM, including:

Contributor

Maybe change "are a type of view" into something more like: "Implements the [View] interface, providing a more controlled approach to using db-ally, including"?

Structured views are a type of [view](../concepts/views.md), which provide a way for developers using db-ally to define what they need from the LLM, including:

* The desired data structure, such as the specific fields to include from the data source.
* A set of operations the LLM may employ in response to natural language queries (currently only “filters” are supported, with more to come)

Contributor

I'd love to see a link to filters, but we do not have a page for them, so maybe we should create one? Maybe a header in the IQL concept page?


Given different natural language queries, a db-ally view will produce different responses while maintaining a consistent data structure. This consistency offers a reliable interface for integration - the code consuming responses from a particular structured view knows what data structure to expect and can utilize this knowledge when displaying or processing the data. This feature of db-ally makes it stand out in terms of reliability and stability compared to standard text-to-SQL approaches.

Each structured view can contain one or more “filters”, which the LLM may decide to choose and apply to the extracted data so that it meets the criteria specified in the natural language query. Given such a query, LLM chooses which filters to use, provides arguments to the filters, and connects the filters with Boolean operators. The LLM expresses these filter combinations using a special language called [IQL](iql.md), in which the defined view filters provide a layer of abstraction between the LLM and the raw syntax used to query the data source (e.g., SQL).

Contributor

Here filters are clearly explained, but I think there should be a separate header in another section explaining just filters.

```python
return Candidate.country == country
```

In addition to structured views, db-ally also provides [freeform views](freeform_views.md), which are more flexible and can be used to create views that do not require a fixed data structure. Freeform views come in handy when the data structure is not predefined or when the scope of potential queries is too vast to be addressed by a structured view. Conversely, structured views are more predictable, efficient, secure, and easier to integrate with other systems. Therefore, we recommend using structured views where possible. To read about the advantages and disadvantages of both kinds of views, refer to [Concept: Views](views.md).

Contributor

Maybe put this Freeform vs. Structured View comparison into a separate section or page so that one can link to it?

@@ -1,37 +1,19 @@
# Concept: Views

Contributor

Maybe a section on use cases and scenarios for structured vs. unstructured views should go here.

@ds-sebastianchwilczynski ds-sebastianchwilczynski marked this pull request as ready for review April 25, 2024 09:19
Member

@mhordynski mhordynski left a comment

lgtm

let's merge and work on it further in a separate PR

@mhordynski mhordynski merged commit 668b2ce into main Apr 25, 2024
3 checks passed
@mhordynski mhordynski deleted the lt/doc_freeform_views branch June 7, 2024 20:19