Refactoring of the local metrics #94

liwii · 2024-03-02T16:33:43Z

Refactored (mainly) local metrics code. The changes include:

For reference_free metrics that use Detoxify / AutoModelForSequenceClassification, created base classes and moved shared logic, such as the tqdm progress bar, into them. For these metrics I also added an argument local_overflow_strategy, which lets you handle inputs longer than the model's input window in three different ways (truncate, nullify, raise).
For semantic similarity (both local and OpenAI), made a base class and put most of the logic into that class.
For non-English factual consistency, long inputs always caused an error, so I avoided it by adding the truncation=True argument. (I would like to add overflow handling to the factual consistency metrics too, but that is big enough to be developed in another PR.)
Moved the loading logic of many models into model_manager. check_model_availability became flaky as a result, so I disabled these checks for now.
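
For readers skimming the description, the three overflow strategies could behave roughly as in the sketch below. This is a minimal illustration, not the code in this PR: the class names `BaseSingleScorer` and `LengthScorer`, the `overflow_strategy` and `max_input_length` parameters, and the whitespace "tokenizer" are all hypothetical stand-ins, and the real scorers wrap Hugging Face / Detoxify models instead. It also assumes that "nullify" means returning `None` for the overflowing input.

```python
from typing import List, Optional

VALID_STRATEGIES = ("truncate", "nullify", "raise")


class BaseSingleScorer:
    """Hypothetical base class that shares the overflow handling."""

    def __init__(self, max_input_length: int,
                 overflow_strategy: str = "truncate") -> None:
        if overflow_strategy not in VALID_STRATEGIES:
            raise ValueError(
                f"overflow_strategy must be one of {VALID_STRATEGIES}")
        self.max_input_length = max_input_length
        self.overflow_strategy = overflow_strategy

    def _tokenize(self, text: str) -> List[str]:
        # Stand-in for a real model tokenizer (assumption).
        return text.split()

    def _score_tokens(self, tokens: List[str]) -> float:
        # Subclasses implement the model-specific scoring.
        raise NotImplementedError

    def score(self, inputs: List[str]) -> List[Optional[float]]:
        scores: List[Optional[float]] = []
        for text in inputs:
            tokens = self._tokenize(text)
            if len(tokens) > self.max_input_length:
                if self.overflow_strategy == "raise":
                    raise ValueError("Input exceeds the model input window")
                if self.overflow_strategy == "nullify":
                    # Assumed meaning: no score for this input.
                    scores.append(None)
                    continue
                # "truncate": keep only the part that fits the window.
                tokens = tokens[:self.max_input_length]
            scores.append(self._score_tokens(tokens))
        return scores


class LengthScorer(BaseSingleScorer):
    """Toy subclass: the 'score' is just the number of tokens seen."""

    def _score_tokens(self, tokens: List[str]) -> float:
        return float(len(tokens))
```

With a window of three tokens, `LengthScorer(3, "truncate").score(["a b", "a b c d e"])` scores the second input on its first three tokens only, while `"nullify"` yields `None` for it and `"raise"` raises a `ValueError`.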

liwii · 2024-03-06T14:28:40Z

@yosukehigashi
The PR became much bigger than I thought, but I would be happy if you could take a look when you have time!!

yosukehigashi

Wow what a beautiful refactor 🤩 Left a bunch of comments but generally LGTM!

Let's do a sanity check with LangCheckChat as well since this is a pretty big change. Could you test out English and Japanese, and I'll do Chinese and German?

src/langcheck/metrics/scorer/_base.py

yosukehigashi · 2024-03-08T11:11:31Z

src/langcheck/metrics/scorer/hf_models.py

+    '''Scorer using Hugging Face's AutoModelForSequenceClassification.
+    '''
+
+    def __init__(self,


Can you add a docstring here? E.g. we should explain the allowed overflow strategies and what they do

Added a simple docstring there

src/langcheck/metrics/scorer/hf_models.py

src/langcheck/metrics/de/_translation.py

src/langcheck/metrics/de/reference_based_text_quality.py

src/langcheck/metrics/en/reference_based_text_quality.py

src/langcheck/metrics/en/reference_free_text_quality.py

src/langcheck/metrics/ja/reference_free_text_quality.py

liwii · 2024-03-09T07:05:12Z

Thanks for the review! Let me resolve the comments first, then we can do the sanity-check together as you say!

liwii · 2024-03-18T04:26:59Z

@yosukehigashi
Does the change look good to you? If so, we can start testing the new version.

yosukehigashi · 2024-03-18T04:31:56Z

> @yosukehigashi Does the change look good to you? If so, we can start testing the new version.

yeah looks fine to me! can you start testing English and Japanese then? I need to make some code changes to test German / Chinese

liwii · 2024-03-18T06:35:54Z

Tried some Japanese & English examples with references; there were no errors, at least.

yosukehigashi

Chinese and German work fine too. LGTM!

liwii added 18 commits March 1, 2024 12:12

Add the base single scorer

5a9cd5b

Add a class for HF AutoModelForSequenceClassification

eaf537e

Calculate the fluency score with the base class

01e3337

Merge branch 'main' into cleanup-local-metrics

2496288

Suppress some warnings

ef1b722

Add overflow storategies

082d7a8

Refactor japanese reference free metrics

3feaad9

Remove unused imports

dd288cb

update en sentiment

c4cee71

update de sentiment

05af2e2

Update comments

c7274ef

Use scorer for detoxify models

e9adea6

Rename the module

5625add

Add scorer for semantic similarity

7851a06

Skip the model availability checks for now

7260cd5

Use scorer to compute semantic similarity

84a28c1

Change the default strategy to truncate

aaec7e0

Add truncation to translation pipelines

3f57f9c

liwii changed the title ~~[WIP] Overflow handling of the local metrics~~ Overflow handling of the local metrics Mar 6, 2024

liwii changed the title ~~Overflow handling of the local metrics~~ Refactoring of the local metrics Mar 6, 2024

liwii added 2 commits March 6, 2024 14:18

Remove unused imports

52aeb2a

Avoid flake8 errors

185d0be

liwii marked this pull request as ready for review March 6, 2024 14:27

liwii requested a review from yosukehigashi March 6, 2024 14:28

yosukehigashi reviewed Mar 8, 2024

liwii added 3 commits March 11, 2024 12:34

Edit comments [no ci]

91cc9b8

Rename variables

ef432c7

Update docstrings [no ci]

3017c5e

liwii added 7 commits March 11, 2024 15:24

Move comments about the selected models to the config yaml file [no ci]

b3a851b

Add descriptions of class weights [no ci]

7b3c20b

Add the removed validation back [no ci]

208526d

Add run_check_model_availability

d651db9

Merge branch 'main' into cleanup-local-metrics

a9babbf

Merge branch 'main' into cleanup-local-metrics

e6d8aa0

Format

ecce4df

yosukehigashi approved these changes Mar 18, 2024

liwii merged commit 7b8d73f into main Mar 18, 2024
14 checks passed

liwii deleted the cleanup-local-metrics branch March 18, 2024 09:01
