diff --git a/data/xml/2024.acl.xml b/data/xml/2024.acl.xml
index 064bde9f9b..74ab2f5c3b 100644
--- a/data/xml/2024.acl.xml
+++ b/data/xml/2024.acl.xml
@@ -12754,9 +12754,11 @@
Brian Thompson, Amazon
488-500
We introduce a new, extensive multidimensional quality metrics (MQM) annotated dataset covering 11 language pairs in the biomedical domain. We use this dataset to investigate whether machine translation (MT) metrics which are fine-tuned on human-generated MT quality judgements are robust to domain shifts between training and inference. We find that fine-tuned metrics exhibit a substantial performance drop in the unseen domain scenario relative to both metrics that rely on the surface form and pre-trained metrics that are not fine-tuned on MT quality judgments.
- 2024.acl-short.45
+ 2024.acl-short.45
zouhar-etal-2024-fine
10.18653/v1/2024.acl-short.45
+
+ Adds the missing legend and axis labels on Figure 2.
IndicIRSuite: Multilingual Dataset and Neural Information Models for Indian Languages