Christian Federmann edited this page Sep 26, 2017 · 7 revisions

Appraise Evaluation System

Initial import into GitHub on Oct 23, 2011.

Update

WMT14 Notes

A new release of the Appraise software package is currently being prepared in time for the Seventh MT Marathon 2012, which will take place September 3-8, 2012, in Edinburgh, Scotland.

Overview

Appraise is an open-source tool for the manual evaluation of Machine Translation output. It allows you to collect human judgments on translation output and implements annotation tasks such as

  1. translation quality checking;
  2. ranking of translations;
  3. error classification;
  4. manual post-editing.

It features an extensible XML import/export format and can easily be adapted to new annotation tasks. The next version of Appraise will also include automatic computation of inter-annotator agreement, allowing quick access to evaluation results.

Appraise is available under an open, BSD-style license.

What does it look like?

You can see a deployed version of Appraise here. If you want to play around with it, you will need an account to log in to the system. I'll be happy to create one for you; just drop me an email at cfedermann [at] gmail [dot] com.

System Requirements

Appraise is based on the Django framework, version 1.3 or newer, and requires Python 2.7 to run locally. For deployment, a FastCGI-compatible web server such as lighttpd is required.

Quickstart Instructions

Assuming you have already installed Python and Django, you can clone a local copy of Appraise using the following command; you can change the folder name Appraise-Software to anything you like.

$ git clone git://github.com/cfedermann/Appraise.git Appraise-Software
...

After cloning the repository, you have to initialise Appraise. This is a two-step process:

  1. Initialise the SQLite database:

     $ cd Appraise-Software/appraise
     $ python manage.py syncdb
     ...
    
  2. Collect static files and copy them into Appraise-Software/appraise/static-files. Answer yes when asked whether you want to overwrite existing files.

     $ python manage.py collectstatic
     ...
    

    More information on handling of static files in Django 1.3+ is available here.

Finally, you can start up your local copy of Django using the runserver command:

$ python manage.py runserver

You should be greeted with the following output from your terminal:

Validating models...

0 errors found
Django version 1.3.1, using settings 'appraise.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

Point your browser to http://127.0.0.1:8000/appraise/ and there it is...

Add users

Users can be added here.

Add evaluation tasks

Evaluation tasks can be created here.

You need an XML file in the proper format to upload a task; an example file can be found in examples/sample-ranking-task.xml.

Deployment with lighttpd

You will need to create a customised start-server.sh script inside Appraise-Software/appraise. There is a .sample file available in this folder which should help you get started quickly. In a nutshell, you have to uncomment and edit the last two lines:

# /path/to/bin/python manage.py runfcgi host=127.0.0.1 port=1234 method=threaded pidfile=$DJANGO_PID

The first line tells Django to start up in FastCGI mode, binding to hostname 127.0.0.1 and port 1234 in our example, running a threaded server and writing the process ID to the file $DJANGO_PID. The .pid file is later used by stop-server.sh to shut down Appraise properly.
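The PID-file mechanism can be sketched as follows. This is illustrative only: the path is made up, and a background sleep stands in for the actual runfcgi process.

```shell
# Sketch of the PID-file mechanism behind stop-server.sh (illustrative;
# a background 'sleep' stands in for the FastCGI process, and the path
# /tmp/appraise-demo.pid is hypothetical).
DJANGO_PID=/tmp/appraise-demo.pid

sleep 60 &                  # stand-in for: python manage.py runfcgi ... pidfile=$DJANGO_PID
echo $! > "$DJANGO_PID"     # runfcgi records its process ID the same way

# stop-server.sh then shuts the server down via the recorded PID:
kill "$(cat "$DJANGO_PID")"
rm -f "$DJANGO_PID"
```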

Using Django's manage.py with the runfcgi command also requires the flup package in the site-packages folder of your Python installation. It is available from here.

# /path/to/sbin/lighttpd -f /path/to/lighttpd/etc/appraise.conf

The second line starts up the lighttpd server using an appropriate configuration file, appraise.conf. Have a look at Appraise-Software/examples/appraise-lighttpd.conf to create your own.
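As a rough orientation, a minimal configuration might look like the sketch below; the paths, the /appraise URL prefix, and the port are assumptions, so derive the real values from the example file mentioned above.

```
# Hypothetical minimal appraise.conf sketch; adapt all paths and the prefix.
server.modules += ( "mod_fastcgi" )

# Forward requests under /appraise to the FastCGI server started above.
fastcgi.server = (
  "/appraise" => ((
    "host" => "127.0.0.1",
    "port" => 1234,
    "check-local" => "disable"
  ))
)
```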

Once the various /path/to/XYZ settings are properly configured, you should be able to launch Appraise in production mode.

References

Christian Federmann. Appraise: An Open-Source Toolkit for Manual Evaluation of Machine Translation Output. Submitted to MT Marathon 2012 (forthcoming).

Christian Federmann. Appraise: An Open-Source Toolkit for Manual Phrase-Based Evaluation of Translations. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, May 2010.