feat: create responses and suggestions with spans #4623

Merged

52 commits
d44be46
chore: Add new span enum type
frascuchon Feb 29, 2024
39c9c6e
feat: define new SpanQuestion and SpanLabelOption classes
frascuchon Feb 29, 2024
28a7ffa
feat: Add 'remote' version for span question
frascuchon Feb 29, 2024
febd300
refactor: Using allowed types from questions modules
frascuchon Feb 29, 2024
3b010bb
chore: Expose new span question classes through feedback module
frascuchon Feb 29, 2024
9759153
refactor: simplify adding questions
frascuchon Feb 29, 2024
5819d87
chore: Expose span question classes from rg
frascuchon Feb 29, 2024
d0ffa74
tests: Adding unint tests
frascuchon Feb 29, 2024
9776e2d
tests: Adding basic integration tests
frascuchon Feb 29, 2024
8cafa01
Using feature branch from argilla server
frascuchon Feb 29, 2024
86f568d
update CHANGELOG
frascuchon Feb 29, 2024
490dec6
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 29, 2024
56f360b
fix import naming
frascuchon Feb 29, 2024
5b638e7
Merge branch 'feat/create-span-question-from-sdk' of github.com:argil…
frascuchon Feb 29, 2024
f337160
Adding a new test
frascuchon Feb 29, 2024
5317900
fix: Adding label description
frascuchon Mar 1, 2024
35f6392
update tests
frascuchon Mar 1, 2024
ceded58
update tests
frascuchon Mar 1, 2024
424a5c8
Merge branch 'feat/create-span-question-from-sdk' into feat/suggest-r…
frascuchon Mar 1, 2024
11a0068
feat: Suggestions schema module including new SpanSuggestion schema
frascuchon Mar 1, 2024
f4bea43
Remove suggestions schemas from records module
frascuchon Mar 1, 2024
53af61b
update suggestion schemas imports
frascuchon Mar 1, 2024
2760348
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 1, 2024
4edf802
change to_server_payload logic
frascuchon Mar 1, 2024
433f27b
Merge branch 'feat/suggest-records-with-spans' of github.com:argilla-…
frascuchon Mar 1, 2024
540979f
feat: Define responses and responses values modules
frascuchon Mar 4, 2024
0899176
refactor: Align response and suggestion value schemas
frascuchon Mar 4, 2024
be02eb2
refactor: Remove response schemas from records module
frascuchon Mar 4, 2024
22f6947
feat: Adding fields attribute for span question
frascuchon Mar 4, 2024
be13a7d
Review imports
frascuchon Mar 4, 2024
84dd077
refactor: Relax value API model constraints API model
frascuchon Mar 4, 2024
63a369f
tests: Update tests
frascuchon Mar 4, 2024
a1c13e9
Merge branch 'feat/span-questions-support' into feat/create-responses…
frascuchon Mar 4, 2024
5831642
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 4, 2024
b39cbb3
change feature branch for argilla server
frascuchon Mar 5, 2024
2e87977
Merge branch 'bugfix/format-suggestions-for-ranking-values' into feat…
frascuchon Mar 5, 2024
5f14b28
add field to span question definition
frascuchon Mar 5, 2024
ca0b649
Merge branch 'bugfix/format-suggestions-for-ranking-values' into feat…
frascuchon Mar 6, 2024
816a839
Merge branch 'feat/span-questions-support' into feat/create-responses…
frascuchon Mar 6, 2024
6a2f636
Merge branch 'bugfix/hf-dataset-remove-rank-list' into feat/create-re…
frascuchon Mar 6, 2024
fd664f4
Merge branch 'feat/span-questions-support' into feat/create-responses…
frascuchon Mar 6, 2024
7f43202
Merge branch 'feat/span-questions-support' into feat/create-responses…
frascuchon Mar 6, 2024
d0b1351
fix: Using schema instances instead of dict for suggestions
frascuchon Mar 6, 2024
a3def08
using argilla-server feature branch
frascuchon Mar 6, 2024
ae329fc
refactor: creating suggestions and responses (#4627)
frascuchon Mar 6, 2024
5b65e11
chore. Update CHANGELOG
frascuchon Mar 6, 2024
af035dc
adding more tests
frascuchon Mar 6, 2024
bb7dba4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2024
b6497ae
feat: Accept a dict for labels
frascuchon Mar 7, 2024
9439624
feat: export import hf dataset with spans (#4636)
frascuchon Mar 7, 2024
c86b20d
fix: add manual validation for min_items
frascuchon Mar 8, 2024
6136414
update tests
frascuchon Mar 8, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -19,6 +19,7 @@ These are the section headers that we use:
### Added

- Added support for span questions in the Python SDK. ([#4617](https://github.com/argilla-io/argilla/pull/4617))
- Added support for spans values in suggestions and responses. ([#4623](https://github.com/argilla-io/argilla/pull/4623))

### Fixed

50 changes: 45 additions & 5 deletions src/argilla/client/feedback/integrations/huggingface/dataset.py
@@ -11,11 +11,11 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import json
import logging
import tempfile
import warnings
from copy import copy
from typing import TYPE_CHECKING, Any, Optional, Type, Union

from packaging.version import parse as parse_version
@@ -50,7 +50,7 @@ def _huggingface_format(dataset: Union["FeedbackDataset", "RemoteFeedbackDataset"
questions, and metadata_properties formatted as `datasets.Features`.

Examples:
>>> from argilla.client.feedback.integrations.dataset import HuggingFaceDatasetMixin
>>> from argilla.client.feedback.integrations.huggingface import HuggingFaceDatasetMixin
>>> dataset = FeedbackDataset(...) or RemoteFeedbackDataset(...)
>>> huggingface_dataset = HuggingFaceDatasetMixin._huggingface_format(dataset)
"""
@@ -71,17 +71,38 @@ def _huggingface_format(dataset: Union["FeedbackDataset", "RemoteFeedbackDataset"
for question in dataset.questions:
if question.type in [QuestionTypes.text, QuestionTypes.label_selection]:
value = Value(dtype="string", id="question")
suggestion_value = copy(value)
elif question.type == QuestionTypes.rating:
value = Value(dtype="int32", id="question")
suggestion_value = copy(value)
elif question.type == QuestionTypes.ranking:
value = Sequence({"rank": Value(dtype="uint8"), "value": Value(dtype="string")}, id="question")
suggestion_value = copy(value)
elif question.type in QuestionTypes.multi_label_selection:
value = Sequence(Value(dtype="string"), id="question")
suggestion_value = copy(value)
elif question.type in QuestionTypes.span:
value = Sequence(
{
"start": Value(dtype="int32"),
"end": Value(dtype="int32"),
"label": Value(dtype="string"),
},
id="question",
)
suggestion_value = Sequence(
{
"start": Value(dtype="int32"),
"end": Value(dtype="int32"),
"label": Value(dtype="string"),
"score": Value(dtype="float32"),
}
)
else:
raise ValueError(
f"Question {question.name} is of type `{question.type}`,"
" for the moment only the following question types are supported:"
f" `{'`, `'.join([arg.value for arg in QuestionTypes])}`."
f" `{'`, `'.join(QuestionTypes.values())}`."
)

hf_features[question.name] = [
@@ -94,8 +115,8 @@ def _huggingface_format(dataset: Union["FeedbackDataset", "RemoteFeedbackDataset"
if question.name not in hf_dataset:
hf_dataset[question.name] = []

value.id = "suggestion"
hf_features[f"{question.name}-suggestion"] = value
suggestion_value.id = "suggestion"
hf_features[f"{question.name}-suggestion"] = suggestion_value
if f"{question.name}-suggestion" not in hf_dataset:
hf_dataset[f"{question.name}-suggestion"] = []

@@ -138,6 +159,15 @@ def _huggingface_format(dataset: Union["FeedbackDataset", "RemoteFeedbackDataset"
}
if question.type == QuestionTypes.ranking:
value = [r.dict() for r in response.values[question.name].value]
elif question.type == QuestionTypes.span:
value = [
{
"start": span.start,
"end": span.end,
"label": span.label,
}
for span in response.values[question.name].value
]
else:
value = response.values[question.name].value
formatted_response["value"] = value
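The span branch above flattens each span value into a plain dict so it matches the declared `datasets` feature. A minimal standalone sketch of that conversion (the `Span` dataclass here is a hypothetical stand-in for the SDK's span value schema, not the actual class):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    # Stand-in for the SDK's span value schema: character offsets plus a label.
    start: int
    end: int
    label: str


def spans_to_rows(spans: List[Span]) -> List[dict]:
    # Mirrors the list comprehension in the diff: one plain dict per span,
    # matching the Sequence({"start", "end", "label"}) feature declared earlier.
    return [{"start": s.start, "end": s.end, "label": s.label} for s in spans]


rows = spans_to_rows([Span(0, 5, "person"), Span(10, 13, "org")])
print(rows)
# → [{'start': 0, 'end': 5, 'label': 'person'}, {'start': 10, 'end': 13, 'label': 'org'}]
```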
@@ -421,6 +451,11 @@ def from_huggingface(
if value is not None:
if question.type == QuestionTypes.ranking:
value = [{"rank": r, "value": v} for r, v in zip(value["rank"], value["value"])]
elif question.type == QuestionTypes.span:
value = [
{"start": s, "end": e, "label": l}
for s, e, l in zip(value["start"], value["end"], value["label"])

[Review comment] In terms of human-readability, I would also like to add the extracted text, something like value["text"][value["start"]:value["end"]], but perhaps this is difficult to map back into the correct format when calling from_huggingface?

[Author reply] This change is not hard to add, if it makes sense.

]
responses[user_id or "user_without_id"]["values"].update({question.name: {"value": value}})

[Review comment] Perhaps we should add else statements here to be sure it raises errors or has defined behaviour when we add new question types?

[Author reply] I'm using the current approach to support span value mappings. I wouldn't add a cross-cutting solution in the current version of the SDK; this is something we can add in the new SDK implementation.
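The `zip` in the span branch above undoes the columnar layout: `datasets` stores a `Sequence` of struct features as parallel `start`/`end`/`label` lists, so on load each span question must be stitched back into one dict per span. A standalone sketch of that reconstruction:

```python
from typing import Dict, List


def columnar_to_spans(value: Dict[str, list]) -> List[dict]:
    # Hugging Face datasets returns {"start": [...], "end": [...], "label": [...]};
    # rebuild one dict per span, as from_huggingface does in the diff above.
    return [
        {"start": s, "end": e, "label": l}
        for s, e, l in zip(value["start"], value["end"], value["label"])
    ]


spans = columnar_to_spans({"start": [0, 10], "end": [5, 13], "label": ["person", "org"]})
print(spans)
# → [{'start': 0, 'end': 5, 'label': 'person'}, {'start': 10, 'end': 13, 'label': 'org'}]
```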


# First if-condition is here for backwards compatibility
@@ -431,6 +466,11 @@
value = hfds[index][f"{question.name}-suggestion"]
if question.type == QuestionTypes.ranking:
value = [{"rank": r, "value": v} for r, v in zip(value["rank"], value["value"])]
elif question.type == QuestionTypes.span:
value = [
{"start": s, "end": e, "label": l}
for s, e, l in zip(value["start"], value["end"], value["label"])


]



suggestion = {"question_name": question.name, "value": value}
if hfds[index][f"{question.name}-suggestion-metadata"] is not None:
2 changes: 1 addition & 1 deletion src/argilla/client/feedback/metrics/utils.py
@@ -191,7 +191,7 @@ def get_unified_responses_and_suggestions(
unified_responses = [
tuple(ranking_schema.rank for ranking_schema in response) for response in unified_responses
]
suggestions = [tuple(s["rank"] for s in suggestion) for suggestion in suggestions]
suggestions = [tuple(s.rank for s in suggestion) for suggestion in suggestions]

return unified_responses, suggestions
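The one-line change above follows from suggestions now carrying schema instances rather than raw dicts (see the "Using schema instances instead of dict for suggestions" commit), so ranks are read with attribute access. A standalone sketch with a hypothetical `RankingValue` stand-in for the SDK's ranking schema:

```python
from dataclasses import dataclass


@dataclass
class RankingValue:
    # Stand-in for the SDK's ranking value schema.
    value: str
    rank: int


# Each suggestion holds a list of ranking values; extract one rank tuple per
# suggestion, as the updated metrics helper does with `s.rank`.
suggestions = [[RankingValue("a", 1), RankingValue("b", 2)]]
ranks = [tuple(s.rank for s in suggestion) for suggestion in suggestions]
print(ranks)
# → [(1, 2)]
```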

17 changes: 8 additions & 9 deletions src/argilla/client/feedback/schemas/__init__.py
@@ -37,14 +37,10 @@
SpanQuestion,
TextQuestion,
)
from argilla.client.feedback.schemas.records import (
FeedbackRecord,
RankingValueSchema,
ResponseSchema,
SortBy,
SuggestionSchema,
ValueSchema,
)
from argilla.client.feedback.schemas.records import FeedbackRecord, SortBy
from argilla.client.feedback.schemas.response_values import RankingValueSchema, ResponseValue, SpanValueSchema
from argilla.client.feedback.schemas.responses import ResponseSchema, ResponseStatus, ValueSchema
from argilla.client.feedback.schemas.suggestions import SuggestionSchema
from argilla.client.feedback.schemas.vector_settings import VectorSettings

__all__ = [
@@ -67,10 +63,13 @@
"SpanQuestion",
"SpanLabelOption",
"FeedbackRecord",
"RankingValueSchema",
"ResponseSchema",
"ResponseValue",
"ResponseStatus",
"SuggestionSchema",
"ValueSchema",
"RankingValueSchema",
"SpanValueSchema",
"SortOrder",
"SortBy",
"RecordSortField",
31 changes: 29 additions & 2 deletions src/argilla/client/feedback/schemas/questions.py
@@ -17,6 +17,9 @@
from typing import Any, Dict, List, Literal, Optional, Union

from argilla.client.feedback.schemas.enums import QuestionTypes
from argilla.client.feedback.schemas.response_values import parse_value_response_for_question
from argilla.client.feedback.schemas.responses import ResponseValue, ValueSchema
from argilla.client.feedback.schemas.suggestions import SuggestionSchema
from argilla.client.feedback.schemas.utils import LabelMappingMixin
from argilla.client.feedback.schemas.validators import title_must_have_value
from argilla.pydantic_v1 import BaseModel, Extra, Field, conint, conlist, root_validator, validator
@@ -77,6 +80,16 @@ def to_server_payload(self) -> Dict[str, Any]:
"settings": self.server_settings,
}

def suggestion(self, value: ResponseValue, **kwargs) -> SuggestionSchema:
"""Method that will be used to create a `SuggestionSchema` from the question and a suggested value."""
value = parse_value_response_for_question(self, value)
return SuggestionSchema(question_name=self.name, value=value, **kwargs)

def response(self, value: ResponseValue) -> Dict[str, ValueSchema]:
"""Method that will be used to create a response from the question and a value."""
value = parse_value_response_for_question(self, value)
return {self.name: ValueSchema(value=value)}
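The two helpers above let a question build its own suggestion and response payloads after the value has been validated for that question type. A simplified standalone mimic of the pattern (not the actual SDK classes; `_parse_value` stands in for `parse_value_response_for_question` and just passes values through):

```python
class Question:
    def __init__(self, name: str):
        self.name = name

    def _parse_value(self, value):
        # Placeholder for parse_value_response_for_question, which in the SDK
        # validates and normalizes the value against the question type.
        return value

    def suggestion(self, value, **kwargs) -> dict:
        # Extra kwargs (agent, score, ...) flow into the suggestion payload.
        return {"question_name": self.name, "value": self._parse_value(value), **kwargs}

    def response(self, value) -> dict:
        # Responses are keyed by question name, wrapping the value.
        return {self.name: {"value": self._parse_value(value)}}


q = Question("entities")
print(q.suggestion([{"start": 0, "end": 5, "label": "person"}], agent="ner-model"))
```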


class TextQuestion(QuestionSchema):
"""Schema for the `FeedbackDataset` text questions, which are the ones that will
@@ -334,21 +347,35 @@ class SpanQuestion(QuestionSchema):

Examples:
>>> from argilla.client.feedback.schemas.questions import SpanQuestion
>>> SpanQuestion(name="span_question", title="Span Question", labels=["person", "org"])
>>> SpanQuestion(name="span_question", field="prompt", title="Span Question", labels=["person", "org"])
"""

type: Literal[QuestionTypes.span] = Field(QuestionTypes.span, allow_mutation=False, const=True)

labels: conlist(Union[str, SpanLabelOption], min_items=1, unique_items=True)
field: str = Field(..., description="The field in the input that the user will be asked to annotate.")
labels: Union[Dict[str, str], conlist(Union[str, SpanLabelOption], min_items=1, unique_items=True)]

@validator("labels", pre=True)
def parse_labels_dict(cls, labels) -> List[SpanLabelOption]:
if isinstance(labels, dict):
return [SpanLabelOption(value=label, text=text) for label, text in labels.items()]
return labels

@validator("labels", always=True)
def normalize_labels(cls, v: List[Union[str, SpanLabelOption]]) -> List[SpanLabelOption]:
return [SpanLabelOption(value=label, text=label) if isinstance(label, str) else label for label in v]

@validator("labels")

[Review comment] Wasn't there a max for the labels too, which we defined on the server side? Perhaps we can use it here as well?

[Author reply, @frascuchon, Mar 11, 2024] Yes, this is not hard to implement.

Anyway, this is not the current behaviour for single- and multi-label settings: in both cases there is a min validation but not a max one. Also, if we plan to support a configurable value for this, adding a hard validation here may introduce workflow problems, so I will leave it as is.

def labels_must_be_valid(cls, labels: List[SpanLabelOption]) -> List[SpanLabelOption]:
# This validator is needed since the conlist constraint does not work.
assert len(labels) > 0, "At least one label must be provided"
return labels
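Together, the validators above accept labels as a dict (value to display text), a list of strings, or a list of `SpanLabelOption`s, and normalize everything to options, with a manual non-empty check because the `conlist` constraint does not fire for the dict branch. A standalone sketch of that normalization (the dataclass is a stand-in for the SDK's `SpanLabelOption` model):

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class SpanLabelOption:
    value: str
    text: str


def normalize_labels(
    labels: Union[Dict[str, str], List[Union[str, "SpanLabelOption"]]]
) -> List[SpanLabelOption]:
    # Dict form: keys are label values, values are display texts.
    if isinstance(labels, dict):
        labels = [SpanLabelOption(value=v, text=t) for v, t in labels.items()]
    # Bare strings become options whose display text equals the value.
    labels = [SpanLabelOption(value=l, text=l) if isinstance(l, str) else l for l in labels]
    # Manual min-items check, mirroring labels_must_be_valid in the diff.
    if not labels:
        raise ValueError("At least one label must be provided")
    return labels


print(normalize_labels({"PER": "Person", "ORG": "Organization"}))
```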

@property
def server_settings(self) -> Dict[str, Any]:
return {
"type": self.type,
"field": self.field,
"options": [label.dict() for label in self.labels],
}

128 changes: 10 additions & 118 deletions src/argilla/client/feedback/schemas/records.py
@@ -13,130 +13,21 @@
# limitations under the License.

import warnings
from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional, Tuple, Union
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Union
from uuid import UUID

from argilla.client.feedback.schemas.enums import RecordSortField, ResponseStatus, SortOrder
from argilla.pydantic_v1 import BaseModel, Extra, Field, PrivateAttr, StrictInt, StrictStr, conint, validator
from argilla.client.feedback.schemas.enums import RecordSortField, SortOrder

# Support backward compatibility for import of RankingValueSchema from records module
from argilla.client.feedback.schemas.response_values import RankingValueSchema # noqa
from argilla.client.feedback.schemas.responses import ResponseSchema, ValueSchema # noqa
from argilla.client.feedback.schemas.suggestions import SuggestionSchema
from argilla.pydantic_v1 import BaseModel, Extra, Field, PrivateAttr, validator

if TYPE_CHECKING:
from argilla.client.feedback.unification import UnifiedValueSchema


class RankingValueSchema(BaseModel):
"""Schema for the `RankingQuestion` response value for a `RankingQuestion`. Note that
we may have more than one record in the same rank.

Args:
value: The value of the record.
rank: The rank of the record.
"""

value: StrictStr
rank: Optional[conint(ge=1)] = None


class ValueSchema(BaseModel):
"""Schema for any `FeedbackRecord` response value.

Args:
value: The value of the record.
"""

value: Union[StrictStr, StrictInt, List[str], List[RankingValueSchema]]


class ResponseSchema(BaseModel):
"""Schema for the `FeedbackRecord` response.

Args:
user_id: ID of the user that provided the response. Defaults to None, and is
automatically fulfilled internally once the question is pushed to Argilla.
values: Values of the response, should match the questions in the record.
status: Status of the response. Defaults to `submitted`.

Examples:
>>> from argilla.client.feedback.schemas.records import ResponseSchema
>>> ResponseSchema(
... values={
... "question_1": {"value": "answer_1"},
... "question_2": {"value": "answer_2"},
... }
... )
"""

user_id: Optional[UUID] = None
values: Union[Dict[str, ValueSchema], None]
status: ResponseStatus = ResponseStatus.submitted

class Config:
extra = Extra.forbid
validate_assignment = True

@validator("user_id", always=True)
def user_id_must_have_value(cls, v):
if not v:
warnings.warn(
"`user_id` not provided, so it will be set to `None`. Which is not an"
" issue, unless you're planning to log the response in Argilla, as"
" it will be automatically set to the active `user_id`.",
)
return v

def to_server_payload(self) -> Dict[str, Any]:
"""Method that will be used to create the payload that will be sent to Argilla
to create a `ResponseSchema` for a `FeedbackRecord`."""
return {
# UUID is not json serializable!!!
"user_id": self.user_id,
"values": {question_name: value.dict() for question_name, value in self.values.items()}
if self.values is not None
else None,
"status": self.status.value if hasattr(self.status, "value") else self.status,
}


class SuggestionSchema(BaseModel):
"""Schema for the suggestions for the questions related to the record.

Args:
question_name: name of the question in the `FeedbackDataset`.
type: type of the question. Defaults to None. Possible values are `model` or `human`.
score: score of the suggestion. Defaults to None.
value: value of the suggestion, which should match the type of the question.
agent: agent that generated the suggestion. Defaults to None.

Examples:
>>> from argilla.client.feedback.schemas.records import SuggestionSchema
>>> SuggestionSchema(
... question_name="question-1",
... type="model",
... score=0.9,
... value="This is the first suggestion",
... agent="agent-1",
... )
"""

question_name: str
type: Optional[Literal["model", "human"]] = None
score: Optional[float] = None
value: Any
agent: Optional[str] = None

class Config:
extra = Extra.forbid
validate_assignment = True

def to_server_payload(self, question_name_to_id: Dict[str, UUID]) -> Dict[str, Any]:
"""Method that will be used to create the payload that will be sent to Argilla
to create a `SuggestionSchema` for a `FeedbackRecord`."""
# We can do this because there is no default values for the fields
payload = self.dict(exclude_unset=True, include={"type", "score", "value", "agent"})
payload["question_id"] = str(question_name_to_id[self.question_name])

return payload


class FeedbackRecord(BaseModel):
"""Schema for the records of a `FeedbackDataset`.

@@ -159,7 +50,7 @@ class FeedbackRecord(BaseModel):
Defaults to None.

Examples:
>>> from argilla.client.feedback.schemas.records import FeedbackRecord, ResponseSchema, SuggestionSchema, ValueSchema
>>> from argilla.feedback import FeedbackRecord, ResponseSchema, SuggestionSchema, ValueSchema
>>> FeedbackRecord(
... fields={"text": "This is the first record", "label": "positive"},
... metadata={"first": True, "nested": {"more": "stuff"}},
@@ -181,6 +72,7 @@
... value="This is the first suggestion",
... agent="agent-1",
... ),
... ],
... external_id="entry-1",
... )
