This repository contains the code for the experiments performed in the paper "On Measuring Context Utilization in Document-Level MT Systems".
The script run.sh
includes commands to:
- train a context-aware bilingual transformer model using the concatenation or multi-encoder setup
- translate test data with options for the amount and type (correct/random) of context to use
- compute BLEU and COMET scores for the translations
- compute contrastive accuracy on the ContraPro data
The script attribute.sh
includes commands to compute attribution scores for antecedent, current, and context tokens on the ContraPro and SCAT data. The demo.yaml
file can be used to configure the options (multi-encoder model, checkpoint directory, data directory).
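As a rough illustration, a configuration along these lines could cover the three options mentioned above. The key names and values here are assumptions for the sketch, not the actual contents of demo.yaml; consult the file itself for the real keys.

```yaml
# Hypothetical layout of demo.yaml -- actual key names may differ.
multi_encoder: true            # whether to use the multi-encoder model (vs. concatenation)
checkpoint_dir: checkpoints/   # directory containing the trained model checkpoint
data_dir: data/                # directory containing the evaluation data
```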
This code is adapted from the following repositories:
- stopes
- fairseq