Code and issue tracker for WMT19 source-based DA campaigns
- 20221202: This repository has now been archived
- 20190527: Accounts are live now, please start your annotation work
- 20190520: WMT19.appraise.cf goes live, accounts shared with teams
- 20190519: clarified annotation requirements, updated timeline
- 20190517: created tasks and associated accounts
- WMT19 will feature source-based direct assessment (DA)
- Evaluation will be based on document-level annotation
- Source-based DA will focus on non-English target languages
- Please use the GitHub issue tracker to report any problems
- 5/09: eval plan online on GitHub
- ~~5/15--5/16~~ 5/20: research teams request accounts
- 5/20: accounts shared with research teams
- ~~5/17~~ 5/27: annotation starts
- ~~5/27~~ 6/10: annotation ends
Annotations will be collected in Appraise, implementing document-level, source-based direct assessment. For every language pair, there will be a pre-generated number of annotation tasks ("HITs"). We also generate anonymised accounts, each pre-assigned to exactly two annotation tasks.
Based on previous WMT evaluation campaigns, the average annotation time for one task is 30 minutes, so every annotation account maps to one hour of work. One task involves approximately 100 judgements.
Each research team is expected to contribute eight hours of annotation work per primary system submission. You can allocate those work hours either
- in this campaign (the instructions you are looking at right now)
- or in our reference-based evaluation campaign on Turkle, run by Matt Post and Mathias Mueller: https://github.com/bricksdont/WMT19RefDA
This translates to 16 completed tasks in total, on Turkle or Appraise, per primary system.
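The effort math above can be sketched in a few lines of Python. This is an illustrative helper, not part of the campaign tooling; the function names are made up, and the constants (30 minutes per task, 8 hours per primary system, 2 tasks per account) come from the figures stated in this document.

```python
# Illustrative only -- constants taken from the campaign description:
# one task takes ~30 minutes, each primary system submission requires
# 8 hours of annotation work, and each account carries exactly 2 tasks.
MINUTES_PER_TASK = 30
HOURS_PER_PRIMARY_SYSTEM = 8
TASKS_PER_ACCOUNT = 2

def required_tasks(num_primary_systems: int) -> int:
    """Total tasks a team must complete, across Turkle and Appraise."""
    total_minutes = num_primary_systems * HOURS_PER_PRIMARY_SYSTEM * 60
    return total_minutes // MINUTES_PER_TASK

def required_accounts(num_primary_systems: int) -> int:
    """Accounts a team leader should reserve (one hour of work each)."""
    return required_tasks(num_primary_systems) // TASKS_PER_ACCOUNT

print(required_tasks(1))     # 16 tasks for one primary system
print(required_accounts(3))  # 24 accounts for three primary systems
```

For a single primary system this reproduces the 16 completed tasks quoted above; a team submitting three primary systems would reserve 24 accounts.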
The source-based evaluation campaign is run for non-English target languages. This means that we look for native speakers of the non-English target language who are also proficient in English and can assess translation quality from English into their native language.
Account distribution works on a first-come, first-served basis, i.e., once an account is marked as "assigned" to a team, you cannot claim it anymore. Each research team should designate a team leader who is responsible for reserving the required number of accounts for their team.
TBA
For each account, we provide a single sign-on (SSO) URL. This allows you to sign into Appraise with a single click on the URL, making access very easy.
There is no personal information attached to the annotation accounts. We capture your assessments and related metadata, such as annotation start and end time as well as duration per single assessment.
TBA
Please use the GitHub issue tracker to report any problems. You can also contact me via chrife [at] microsoft [dot] com.