EditLabels Tutorial

linfrank edited this page Aug 17, 2012 · 1 revision

EditLabels is useful for hand-editing the labels produced by applying an extraction annotator. It requires a data directory and a labels file created by running ApplyAnnotator with an ExtractionAnnotator. To see how to create an ExtractionAnnotator, see the TrainExtractor Tutorial; to see how to apply the annotator to a dataset, see the ApplyAnnotator Tutorial. You must also know the extracted type (such as extracted_name) and the true type (such as true_name).

For this example, we will use a name annotator and apply it to a directory of data. For a quick reference, here are the commands to do that:

$ java -Xmx500M edu.cmu.minorthird.ui.TrainExtractor -labels sample1.train -spanType trueName -saveAs sample1.ann

and

$ java -Xmx500M edu.cmu.minorthird.ui.ApplyAnnotator -labels DATA_DIR -loadFrom sample1.ann -saveAs sample1.labels

Note: For this example you must create your own data directory containing whatever files you like. I created a data directory with a single document that contains this text:

Did <trueName>Andrew Carnegie</trueName> found Carnegie Tech?

Feel free to create the same sample or use the annotator on another directory to run through this example.
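
One way to create that sample directory from a shell is sketched below. The file name sample.txt is an assumption chosen for illustration; the tutorial does not prescribe one.

```shell
# Create the data directory used throughout this tutorial.
mkdir -p DATA_DIR

# Write one document with a single hand-labeled trueName span.
# The file name sample.txt is an arbitrary choice for illustration.
printf '%s\n' 'Did <trueName>Andrew Carnegie</trueName> found Carnegie Tech?' > DATA_DIR/sample.txt
```

Any other file name or set of documents should work equally well, as long as DATA_DIR is the directory passed to -labels.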

To run EditLabels, start with:

$ java -Xmx500M edu.cmu.minorthird.ui.EditLabels

Editing Parameters

As with all UI tasks, the parameters for EditLabels may be specified either in the GUI or on the command line. To use the GUI, simply type -gui on the command line. It is also possible to mix and match where the parameters are specified; for example, you can specify two parameters on the command line and use the GUI to select the rest. For this reason, the step-by-step process for this experiment first explains how to select a parameter value in the GUI and then how to set the same parameter on the command line.

Note: in this experiment every parameter must be specified.

If you are using the GUI, click the Edit button next to EditLabels to edit the parameters. A Property Editor window will appear:

  • baseParameters:
      • GUI: enter the name of the data directory in the labelsFilename text field.
      • Command Line: use the -labels option followed by the repository key or the directory of files to load. For this tutorial, specify -labels DATA_DIR.
  • editParameters:
      • GUI:
          • editFilename: enter the name of the labels file (the result of the ApplyAnnotator experiment), in this case sample1.labels.
          • extractedType: enter the type that ApplyAnnotator predicted. Note: this type is set in TrainExtractor using the -output option, and the default is _prediction.
          • trueType: enter the name of the type that has been hand labeled, in this case trueName.
      • Command Line:
          • Use the -edit option to specify the name of the labels file (the result of the ApplyAnnotator experiment), in this case -edit sample1.labels.
          • Use the -extractedType option to specify the type that ApplyAnnotator predicted. Note: this type is set in TrainExtractor using the -output option, and the default is _prediction. In this case, use -extractedType _prediction.
          • Use the -trueType option to specify the correct hand label, in this case -trueType trueName.
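
Putting the command-line options above together, a full invocation can be assembled as in this sketch. The shell variable names are illustrative only, and it assumes Minorthird is on the Java classpath; the command is printed rather than executed so it can be checked first.

```shell
# Parameter values taken from this tutorial; adjust them to your own data.
LABELS=DATA_DIR               # data directory (or repository key)
EDIT_FILE=sample1.labels      # labels file written by ApplyAnnotator
EXTRACTED_TYPE=_prediction    # type predicted by the annotator (the default)
TRUE_TYPE=trueName            # hand-labeled type

# Assemble the EditLabels command from the four parameters.
CMD="java -Xmx500M edu.cmu.minorthird.ui.EditLabels -labels $LABELS -edit $EDIT_FILE -extractedType $EXTRACTED_TYPE -trueType $TRUE_TYPE"

# Print the command for review; paste it at the prompt to run it.
echo "$CMD"
```

Appending -gui to the printed command should open the Property Editor with these values already filled in, in line with the mix-and-match behavior described earlier.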

At this point, press Enter on the command line or, in the GUI, click OK in the Property Editor and then press Start Task. A window like this will appear: