Merge branch 'main' into develop

argilla-io · Dec 21, 2023 · 3157945 · 3157945
2 parents e3f0992 + b1cb46c
commit 3157945
Showing 43 changed files with 8,910 additions and 5,393 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -16,16 +16,22 @@ These are the section headers that we use:
 
 ## [Unreleased]()
 
+## [1.21.0](https://github.com/argilla-io/argilla/compare/v1.20.0...v1.21.0)
+
 ### Added
 
+- Added new draft queue for annotation view ([#4334](https://github.com/argilla-io/argilla/pull/4334))
+- Added annotation metrics module for the `FeedbackDataset` (`argilla.client.feedback.metrics`). ([#4175](https://github.com/argilla-io/argilla/pull/4175)).
 - Added strategy to handle and translate errors from the server for `401` HTTP status code ([#4362](https://github.com/argilla-io/argilla/pull/4362))
 - Added integration for `textdescriptives` using `TextDescriptivesExtractor` to configure `metadata_properties` in `FeedbackDataset` and `FeedbackRecord`. ([#4400](https://github.com/argilla-io/argilla/pull/4400)). Contributed by @m-newhauser
 - Added `POST /api/v1/me/responses/bulk` endpoint to create responses in bulk for current user. ([#4380](https://github.com/argilla-io/argilla/pull/4380))
 - Added list support for term metadata properties. (Closes [#4359](https://github.com/argilla-io/argilla/issues/4359))
 - Added new CLI task to reindex datasets and records into the search engine. ([#4404](https://github.com/argilla-io/argilla/pull/4404))
+- Added `httpx_extra_kwargs` argument to `rg.init` and `Argilla` to allow passing extra arguments to `httpx.Client` used by `Argilla`. ([#4440](https://github.com/argilla-io/argilla/pull/4441))
 
 ### Changed
 
+- More productive and simpler shortcuts system ([#4215](https://github.com/argilla-io/argilla/pull/4215))
 - Move `ArgillaSingleton`, `init` and `active_client` to a new module `singleton`. ([#4347](https://github.com/argilla-io/argilla/pull/4347))
 - Updated `argilla.load` functions to also work with `FeedbackDataset`s. ([#4347](https://github.com/argilla-io/argilla/pull/4347))
 - [breaking] Updated `argilla.delete` functions to also work with `FeedbackDataset`s. It now raises an error if the dataset does not exist. ([#4347](https://github.com/argilla-io/argilla/pull/4347))
@@ -36,6 +42,10 @@ These are the section headers that we use:
 - Fixed error in `TextClassificationSettings.from_dict` method in which the `label_schema` created was a list of `dict` instead of a list of `str`. ([#4347](https://github.com/argilla-io/argilla/pull/4347))
 - Fixed total records on pagination component ([#4424](https://github.com/argilla-io/argilla/pull/4424))
 
+### Removed
+
+- Removed `draft` auto save for annotation view ([#4334](https://github.com/argilla-io/argilla/pull/4334))
+
 ## [1.20.0](https://github.com/argilla-io/argilla/compare/v1.19.0...v1.20.0)
 
 ### Added

diff --git a/docs/_source/_common/snippets/start_page.md b/docs/_source/_common/snippets/start_page.md
@@ -1,36 +1,88 @@
-::::{tab-set}
+<div class="start-page__intro" markdown="1">
 
-:::{tab-item} Feedback datasets
+# Welcome to
 
-```python
-# install datasets library with pip install datasets
-import argilla as rg
-from datasets import load_dataset
+## Argilla is a platform to build high-quality AI datasets
+
+If you need support, join the [Argilla Slack community](https://join.slack.com/t/rubrixworkspace/shared_invite/zt-whigkyjn-a3IUJLD7gDbTZ0rKlvcJ5g).
+
+</div>
+
+<div class="start-page__content" markdown="1">
+
+Get started by publishing your first dataset.
 
-# load an Argilla Feedback Dataset from the Hugging Face Hub
-# look for other datasets at https://huggingface.co/datasets?other=argilla
-dataset = rg.FeedbackDataset.from_huggingface("argilla/oasst_response_quality", split="train")
+### 1. Open an IDE, Jupyter or Colab
 
-# push the dataset to Argilla
-dataset.push_to_argilla("oasst_response_quality")
+If you're a Colab user, you can directly use our [introductory tutorial](https://colab.research.google.com/github/argilla-io/argilla/blob/develop/docs/_source/getting_started/quickstart_workflow_feedback.ipynb).
+
+### 2. Install the SDK with pip
+
+To work with Argilla datasets, you need to use the Argilla SDK. You can install the SDK with pip as follows:
+
+```sh
+pip install argilla -U
 ```
-:::
 
-:::{tab-item} Other datasets
+### 3. Connect to your Argilla server
+
+Get your `ARGILLA_API_URL`:
+
+- If you are using Docker, it is the URL shown in your browser (by default `http://localhost:6900`)
+- If you are using HF Spaces, it should be constructed as follows: `https://[your-owner-name]-[your_space_name].hf.space`
+
+Get your `ARGILLA_API_KEY`: open ["My settings"](/user-settings) and copy the API key.
+
+Make sure to replace `ARGILLA_API_URL` and `ARGILLA_API_KEY` in the code below. If you are using a private HF Space, you need to specify your `HF_TOKEN` which can be found [here](https://huggingface.co/settings/tokens).
 
 ```python
-# install datasets library with pip install datasets
 import argilla as rg
-from datasets import load_dataset
 
-# load dataset from the hub
-dataset = load_dataset("argilla/gutenberg_spacy-ner", split="train")
+rg.init(
+    api_url="ARGILLA_API_URL",
+    api_key="ARGILLA_API_KEY",
+    # extra_headers={"Authorization": "Bearer HF_TOKEN"}  # only for private HF Spaces; replace HF_TOKEN
+)
+```
+
+### 4. Create your first dataset
+
+Specify a workspace where the dataset will be created. Check your workspaces in ["My settings"](/user_settings). To create a new workspace, check the [docs](https://docs.argilla.io/en/latest/getting_started/installation/configurations/workspace_management.html).
+
+Create a dataset with two labels ("sadness" and "joy"). Don't forget to replace "<your-workspace>". Here, we are using a task template; check the docs to [create a fully custom dataset](https://docs.argilla.io/en/latest/practical_guides/create_update_dataset/create_dataset.html).
+
+```python
+dataset = rg.FeedbackDataset.for_text_classification(
+    labels=["sadness", "joy"],
+    multi_label=False,
+    use_markdown=True,
+    guidelines=None,
+    metadata_properties=None,
+    vectors_settings=None,
+)
+dataset.push_to_argilla(name="my-first-dataset", workspace="<your-workspace>")
+```
+
+### 5. Add records
 
-# read in dataset, assuming its a dataset for token classification
-dataset_rg = rg.read_datasets(dataset, task="TokenClassification")
+Create a list with the records you want to add. Ensure that you match the fields with the ones specified in the previous step.
 
-# log the dataset
-rg.log(dataset_rg, "gutenberg_spacy-ner")
+You can also use `pandas` or `load_dataset` to [read an existing dataset and create records from it](https://docs.argilla.io/en/latest/practical_guides/create_update_dataset/records.html#add-records).
+
+```python
+records = [
+    rg.FeedbackRecord(
+        fields={
+            "text": "I am so happy today",
+        },
+    ),
+    rg.FeedbackRecord(
+        fields={
+            "text": "I feel sad today",
+        },
+    )
+]
+dataset.add_records(records)
 ```
-:::
-::::
+
+</div>
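
Note on step 3 of the new start page: the HF Spaces URL pattern and the optional `Authorization` header can be sketched as two small helpers. This is a minimal illustration that only applies the pattern quoted in the snippet (`https://[your-owner-name]-[your_space_name].hf.space`). The names `build_space_url` and `build_auth_headers` are hypothetical and are not part of the Argilla SDK. Whether owner or Space names need any normalization is not covered by the source, so none is applied here.

```python
from typing import Dict, Optional


def build_space_url(owner: str, space_name: str) -> str:
    """Build the API URL from the pattern quoted in the start page.

    Hypothetical helper: it only joins the two names with a hyphen and
    appends the `.hf.space` domain, exactly as the pattern reads.
    """
    if not owner or not space_name:
        raise ValueError("owner and space_name must be non-empty")
    return f"https://{owner}-{space_name}.hf.space"


def build_auth_headers(hf_token: Optional[str] = None) -> Dict[str, str]:
    """Return the extra headers for a private HF Space, or an empty dict.

    Hypothetical helper: mirrors the commented-out `extra_headers` line
    in the start page snippet.
    """
    if hf_token is None:
        return {}
    return {"Authorization": f"Bearer {hf_token}"}


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_space_url("my-org", "my-space"))
    print(build_auth_headers("hf_xxx"))
```

The two results would then be passed as `api_url` and `extra_headers` to `rg.init`, alongside `api_key`, as in the start page snippet.
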
diff --git a/docs/_source/_common/tabs/unfication_strategies.md b/docs/_source/_common/tabs/unfication_strategies.md
@@ -9,7 +9,7 @@ dataset = FeedbackDataset.from_huggingface(
     repo_id="argilla/stackoverflow_feedback_demo"
 )
 strategy = LabelQuestionStrategy("majority") # "disagreement", "majority_weighted (WIP)"
-dataset.unify_responses(
+dataset.compute_unified_responses(
     question=dataset.question_by_name("title_question_fit"),
     strategy=strategy,
 )
@@ -28,7 +28,7 @@ dataset = FeedbackDataset.from_huggingface(
     repo_id="argilla/stackoverflow_feedback_demo"
 )
 strategy = MultiLabelQuestionStrategy("majority") # "disagreement", "majority_weighted (WIP)"
-dataset.unify_responses(
+dataset.compute_unified_responses(
     question=dataset.question_by_name("tags"),
     strategy=strategy,
 )
@@ -46,7 +46,7 @@ dataset = FeedbackDataset.from_huggingface(
     repo_id="argilla/stackoverflow_feedback_demo"
 )
 strategy = RankingQuestionStrategy("majority") # "mean", "max", "min"
-dataset.unify_responses(
+dataset.compute_unified_responses(
     question=dataset.question_by_name("relevance_ranking"),
     strategy=strategy,
 )
@@ -64,7 +64,7 @@ dataset = FeedbackDataset.from_huggingface(
     repo_id="argilla/stackoverflow_feedback_demo"
 )
 strategy = RatingQuestionStrategy("majority") # "mean", "max", "min"
-dataset.unify_responses(
+dataset.compute_unified_responses(
     question=dataset.question_by_name("answer_quality"),
     strategy=strategy,
 )
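
Note on the `"majority"` strategy used in the hunks above: conceptually, a majority strategy collapses several annotator responses for one record into the most frequent value. The sketch below shows that idea in plain Python. It is not Argilla's implementation, `majority_vote` is a hypothetical name, and the tie-breaking rule (first value seen wins) is an assumption that may differ from what the library does.

```python
from collections import Counter
from typing import Hashable, List, Optional


def majority_vote(responses: List[Hashable]) -> Optional[Hashable]:
    """Return the most frequent response value, or None if there are none.

    Assumption: ties are resolved in favor of the value encountered first,
    which follows from Counter.most_common ordering equal counts by first
    occurrence.
    """
    if not responses:
        return None
    counts = Counter(responses)
    # most_common(1) yields a single (value, count) pair.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    print(majority_vote(["yes", "no", "yes"]))
```

For example, `majority_vote(["yes", "no", "yes"])` returns `"yes"`, and an empty list returns `None`.
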

diff --git a/...ource/_static/tutorials/add-text-descriptives-as-metadata/text-descriptives.PNG b/...ource/_static/tutorials/add-text-descriptives-as-metadata/text-descriptives.PNG