Docs: advanced queries dsl (#5435)
# Description
<!-- Please include a summary of the changes and the related issue.
Please also include relevant motivation and context. List any
dependencies that are required for this change. -->

Closes #<issue_number>

**Type of change**
<!-- Please delete options that are not relevant. Remember to title the
PR according to the type of change -->

- Bug fix (non-breaking change which fixes an issue)
- New feature (non-breaking change which adds functionality)
- Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**
<!-- Please add some reference about how your feature has been tested.
-->

**Checklist**
<!-- Please go over the list and make sure you've taken everything into
account -->

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Francisco Aranda <[email protected]>
3 people authored Aug 30, 2024
1 parent e28ef13 commit 7495136
Showing 3 changed files with 29 additions and 4 deletions.
6 changes: 6 additions & 0 deletions argilla/docs/how_to_guides/annotate.md
@@ -136,6 +136,12 @@ The UI offers various features designed for data exploration and understanding.

From the **control panel** at the top of the left pane, you can search by keyword across the entire dataset. If your records have more than one field, you can specify whether the search runs on “All” fields or on a specific one. Matched results are highlighted in color.

!!! note
    If you introduce more than one keyword, the search will return results where **all** keywords have a match.

!!! tip
    For more advanced searches, take a look at the [advanced queries DSL](query.md#advanced-queries).

### Order by record semantic similarity

You can retrieve records based on their similarity to another record if vectors have been added to the dataset.
23 changes: 20 additions & 3 deletions argilla/docs/how_to_guides/query.md
@@ -35,7 +35,7 @@ You can search for records in your dataset by **querying** or **filtering**. The

To search for records with terms, you can use the `Dataset.records` attribute with a query string. The query retrieves records whose text fields contain the search terms. You can search for a single term or for several terms; in the latter case, all of them must appear in a record for it to be retrieved.

=== "Single search term"
=== "Single term search"

```python
import argilla as rg
@@ -49,7 +49,7 @@ To search for records with terms, you can use the `Dataset.records` attribute wi
queried_records = dataset.records(query=query).to_list(flatten=True)
```

=== "Multiple search term"
=== "Multiple terms search"

```python
import argilla as rg
@@ -63,6 +63,23 @@ To search for records with terms, you can use the `Dataset.records` attribute wi
queried_records = dataset.records(query=query).to_list(flatten=True)
```

### Advanced queries

If you need more complex searches, you can use [Elasticsearch's simple query string syntax](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html#simple-query-string-syntax). Here is a summary of the different available operators:

| operator | description | example |
| ------------ | --------------------------- | --------------------------------------------------------------------- |
|`+` or `space`| **AND**: search both terms | `argilla + distilabel` or `argilla distilabel` <br> returns records that include the terms "argilla" and "distilabel"|
|`|` | **OR**: search either term | `argilla | distilabel` <br> returns records that include the term "argilla" or "distilabel"|
|`-` | **Negation**: exclude a term| `argilla -distilabel` <br> returns records that contain the term "argilla" and don't have the term "distilabel"|
|`*` | **Prefix**: search a prefix | `arg*` <br> returns records with any words starting with "arg-"|
|`"` | **Phrase**: search a phrase | `"argilla and distilabel"` <br> returns records that contain the phrase "argilla and distilabel"|
|`(` and `)` | **Precedence**: group terms | `(argilla | distilabel) rules` <br> returns records that contain "rules" and either "argilla" or "distilabel"|
|`~N` | **Edit distance**: search a term or phrase within an edit distance of `N`| `argilla~1` <br> returns records that contain a term within an edit distance of 1 from "argilla", e.g. "argila"|

!!! tip
    To use one of these characters literally, escape it with a preceding backslash `\`, e.g. `"1 \+ 2"` would match records where the phrase "1 + 2" is found.
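
When the search text comes from user input, the escaping described in the tip can be automated. The following is a minimal sketch, not part of Argilla: `escape_query_text` and `as_phrase` are hypothetical helper names, and the set of reserved characters is assumed from the operator table above plus the backslash itself. It escapes every reserved character, which is more conservative than strictly necessary (for example, a hyphen inside a word is escaped too).

```python
# Characters with special meaning in the simple query string syntax.
# Assumption: this set is taken from the operator table, plus the backslash.
RESERVED_CHARS = set('+|-*"()~\\')


def escape_query_text(text: str) -> str:
    """Prefix every reserved character in `text` with a backslash."""
    return "".join(f"\\{char}" if char in RESERVED_CHARS else char for char in text)


def as_phrase(text: str) -> str:
    """Escape `text` and wrap it in double quotes so it is searched as a phrase."""
    return f'"{escape_query_text(text)}"'


print(escape_query_text("1 + 2"))  # 1 \+ 2
print(as_phrase("1 + 2"))          # "1 \+ 2"
```

The resulting string can then be used as the query string in the `dataset.records(query=...)` calls shown earlier.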

## Filter by conditions

You can use the `Filter` class to define the conditions and pass them to the `Dataset.records` attribute to fetch records based on the conditions. Conditions include "==", ">=", "<=", or "in". Conditions can be combined with dot notation to filter records based on metadata, suggestions, or responses. You can use a single condition or multiple conditions to filter records.
@@ -72,7 +89,7 @@ You can use the `Filter` class to define the conditions and pass them to the `Da
| `==` | The `field` value is equal to the `value` |
| `>=` | The `field` value is greater than or equal to the `value` |
| `<=` | The `field` value is less than or equal to the `value` |
| `in` | TThe `field` value is included in a list of values |
| `in` | The `field` value is included in a list of values |
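
To make the semantics of the four operators concrete, here is a small pure-Python sketch that evaluates conditions against plain dictionaries. It is an illustration only, not Argilla's implementation: the `(field, operator, value)` tuple shape, the record layout, the helper names `resolve` and `matches`, and the assumption that multiple conditions must all hold are choices made for this example.

```python
import operator
from typing import Any

# Maps each operator from the table above to a plain Python comparison.
OPERATORS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    "in": lambda field_value, values: field_value in values,
}


def resolve(record: dict, dotted_field: str) -> Any:
    """Follow a dot-notation path such as "metadata.split" through nested dicts."""
    value = record
    for key in dotted_field.split("."):
        value = value[key]
    return value


def matches(record: dict, conditions: list) -> bool:
    """Return True when the record satisfies every (field, operator, value) condition."""
    return all(OPERATORS[op](resolve(record, field), value) for field, op, value in conditions)


# Hypothetical records, used only to exercise the helpers.
records = [
    {"id": 1, "metadata": {"split": "train", "length": 120}},
    {"id": 2, "metadata": {"split": "test", "length": 40}},
    {"id": 3, "metadata": {"split": "train", "length": 30}},
]
conditions = [
    ("metadata.split", "in", ["train", "validation"]),
    ("metadata.length", ">=", 100),
]
selected = [record["id"] for record in records if matches(record, conditions)]
print(selected)  # [1]
```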

=== "Single condition"

4 changes: 3 additions & 1 deletion argilla/mkdocs.yml
@@ -121,6 +121,7 @@ plugins:
- docs/scripts/gen_changelog.py
- docs/scripts/gen_popular_issues.py
# - docs/scripts/gen_ref_pages.py
enabled: !ENV [CI, false] # enables the plugin only during continuous integration (CI), disabled on local build
- literate-nav:
nav_file: SUMMARY.md
- section-index
@@ -148,7 +149,8 @@ plugins:
# Signature
separate_signature: false
show_signature_annotations: false
- social
- social:
enabled: !ENV [CI, false] # enables the plugin only during continuous integration (CI), disabled on local build
- mknotebooks
- material-plausible
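
The `!ENV [CI, false]` entries added above read the `CI` environment variable and fall back to `false` when it is unset. The sketch below approximates that lookup in plain Python to show the intended effect. It is not MkDocs code: `resolve_env_flag` is a hypothetical helper, and the set of strings treated as true is an assumption made for this example.

```python
import os


def resolve_env_flag(name: str, default: bool, environ=None) -> bool:
    """Approximate `!ENV [name, default]` for a boolean option."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        # Variable not set (typical local build): use the default.
        return default
    # Assumption: only these spellings count as true.
    return raw.strip().lower() in {"true", "yes", "on"}


# Local build: CI is not set, so the plugin stays disabled.
print(resolve_env_flag("CI", False, environ={}))              # False
# CI build: many CI services export CI=true, so the plugin is enabled.
print(resolve_env_flag("CI", False, environ={"CI": "true"}))  # True
```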
