diff --git a/data/xml/2024.acl.xml b/data/xml/2024.acl.xml
index 064bde9f9b..74ab2f5c3b 100644
--- a/data/xml/2024.acl.xml
+++ b/data/xml/2024.acl.xml
@@ -12754,9 +12754,11 @@
 BrianThompsonAmazon
 488-500
 We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset covering 11 language pairs in the biomedical domain. We use this dataset to investigate whether machine translation (MT) metrics which are fine-tuned on human-generated MT quality judgements are robust to domain shifts between training and inference. We find that fine-tuned metrics exhibit a substantial performance drop in the unseen domain scenario relative to both metrics that rely on the surface form and pre-trained metrics that are not fine-tuned on MT quality judgments.
-2024.acl-short.45
+2024.acl-short.45
 zouhar-etal-2024-fine
 10.18653/v1/2024.acl-short.45
+
+Adds the missing legend and axis labels on Figure 2.
 &lt;fixed-case&gt;I&lt;/fixed-case&gt;ndic&lt;fixed-case&gt;IRS&lt;/fixed-case&gt;uite: Multilingual Dataset and Neural Information Models for &lt;fixed-case&gt;I&lt;/fixed-case&gt;ndian Languages