Representation of manual labels in BOHR:
Manual labels for a dataset should be stored in a separate CSV file with the following fields (columns):
datapoint id
label
author
note (optional field, ignored by BOHR, but possibly useful for annotators)
certainty (on a scale of 1-5, how sure the annotator is)
showcase (1 if the data point is an interesting case to discuss or a good example to include in the paper)
There may be multiple rows for the same data point from the same author when more than one label needs to be assigned; the certainties for the different labels can differ.
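To make the proposed format concrete, here is a minimal sketch of such a file and a reader for it. The column names (`datapoint_id` etc.), the sample rows, the label names, and the `ManualLabel` / `read_manual_labels` helpers are assumptions made for illustration; none of them are existing BOHR API.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical file contents. Column names and label names are assumptions.
# Note that "alice" labels data point c1a2 twice, with different certainties.
SAMPLE_CSV = """datapoint_id,label,author,note,certainty,showcase
c1a2,BugFix,alice,mentions a crash,5,1
c1a2,Refactoring,alice,,2,0
f9e8,Feature,bob,,4,0
"""


@dataclass(frozen=True)
class ManualLabel:
    datapoint_id: str
    label: str
    author: str
    note: Optional[str]  # optional, meant for annotators only
    certainty: int       # 1 (unsure) to 5 (sure)
    showcase: bool       # True if worth discussing or showing in the paper


def read_manual_labels(text: str) -> List[ManualLabel]:
    """Parse CSV text into ManualLabel records, validating the certainty scale."""
    labels = []
    for row in csv.DictReader(io.StringIO(text)):
        certainty = int(row["certainty"])
        if not 1 <= certainty <= 5:
            raise ValueError(f"certainty out of range 1-5: {certainty}")
        labels.append(
            ManualLabel(
                datapoint_id=row["datapoint_id"],
                label=row["label"],
                author=row["author"],
                note=row["note"] or None,  # empty cell means no note
                certainty=certainty,
                showcase=row["showcase"] == "1",
            )
        )
    return labels


labels = read_manual_labels(SAMPLE_CSV)
```
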
Handling of manual labels by BOHR
BOHR should convert manual labels to heuristics, one for each combination of author and certainty level. This way, labels from different authors and with different certainties carry different weights.
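A rough sketch of the conversion, assuming a heuristic can be modelled as a plain function from a data point id to a label, with `None` meaning "abstain". The `ROWS` data, the `Heuristic` type and `build_heuristics` are hypothetical and do not reflect how BOHR actually defines heuristics.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical manual labels: (datapoint_id, label, author, certainty).
ROWS = [
    ("c1a2", "BugFix", "alice", 5),
    ("c1a2", "Refactoring", "alice", 2),
    ("f9e8", "Feature", "bob", 4),
    ("77aa", "BugFix", "alice", 5),
]

# Assumption: a heuristic maps a data point id to a label, or None to abstain.
Heuristic = Callable[[str], Optional[str]]


def build_heuristics(
    rows: Iterable[Tuple[str, str, str, int]]
) -> Dict[Tuple[str, int], Heuristic]:
    """Create one lookup-based heuristic per (author, certainty) pair."""
    tables: Dict[Tuple[str, int], Dict[str, str]] = defaultdict(dict)
    for datapoint_id, label, author, certainty in rows:
        # If the same author gives the same data point two labels at the
        # same certainty, the later row wins in this sketch.
        tables[(author, certainty)][datapoint_id] = label

    def make(table: Dict[str, str]) -> Heuristic:
        def heuristic(datapoint_id: str) -> Optional[str]:
            return table.get(datapoint_id)  # abstain on unlabeled data points
        return heuristic

    return {key: make(table) for key, table in tables.items()}


heuristics = build_heuristics(ROWS)
```

Because each (author, certainty) pair becomes its own heuristic, whatever model combines the heuristics is free to weight, say, `("alice", 5)` differently from `("alice", 2)`.
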
Further features and improvements
Once the label hierarchy is updated, provide a mechanism to see which existing labels could be refined to a more fine-grained label under the updated hierarchy.
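One possible shape for that mechanism, as a sketch only: assume the hierarchy is available as a mapping from each label to its direct children, and report every manual label that now has children. The hierarchy, the label names and `find_refinable` are invented for this example.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical updated hierarchy: parent label -> more fine-grained children.
UPDATED_HIERARCHY: Dict[str, List[str]] = {
    "Commit": ["BugFix", "NonBugFix"],
    "BugFix": ["SecurityFix", "CrashFix"],
}

# Hypothetical manual labels assigned before the update: (datapoint_id, label).
ASSIGNED = [
    ("c1a2", "BugFix"),
    ("f9e8", "SecurityFix"),
    ("77aa", "NonBugFix"),
]


def find_refinable(
    assigned: Iterable[Tuple[str, str]],
    hierarchy: Dict[str, List[str]],
) -> List[Tuple[str, str, List[str]]]:
    """Return (datapoint_id, label, children) for labels that can be refined."""
    return [
        (datapoint_id, label, hierarchy[label])
        for datapoint_id, label in assigned
        if hierarchy.get(label)  # label has at least one child
    ]


refinable = find_refinable(ASSIGNED, UPDATED_HIERARCHY)
```

Here only the `BugFix` label on `c1a2` would be flagged, since `SecurityFix` and `NonBugFix` have no children in the assumed hierarchy.
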
Other considerations
Make use of external tools for manual labeling (Prodigy? : https://prodi.gy/)
Make use of active learning
hlibbabii
changed the title
Label infrastructure improvements
Manual labeling infrastructure improvements
May 22, 2021
hlibbabii
changed the title
Manual labeling infrastructure improvements
Manual labeling as heuristic
May 24, 2021