calculating reliability of vtc against all human annotations together #377

Answered by alecristia
alecristia asked this question in Q&A
What I ended up doing was to compute the VTC-human agreement for each human annotator, as well as the human-human agreement. In the paper, I then reported the weighted average F-score for VTC-human (i.e., giving more weight to coders who had coded more data, so NOT based on how much their annotations would overlap), along with the weighted average F-score for human-human. This lets us answer the question of how accurate VTC is relative to the humans who did the coding, using the agreement among those same humans as the baseline.
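
For anyone wanting to reproduce the weighting step, here is a minimal sketch of a weighted average F-score. It assumes the per-coder F-scores have already been computed elsewhere. The names `CoderScore`, `coded_seconds`, and `weighted_mean_f`, and the numbers in the example, are made up for illustration and are not taken from the paper or from any library. Using coded duration as the weight is one possible reading of "coders that had coded more data".

```python
from dataclasses import dataclass


@dataclass
class CoderScore:
    """F-score of one comparison (e.g. VTC vs. one human coder). Hypothetical type."""
    coder: str            # identifier of the human coder
    f_score: float        # F-score already computed for this coder, in [0, 1]
    coded_seconds: float  # amount of data this coder annotated (assumed weight)


def weighted_mean_f(scores: list[CoderScore]) -> float:
    """Average the F-scores, weighting each coder by how much data they coded."""
    total = sum(s.coded_seconds for s in scores)
    if total <= 0:
        raise ValueError("total coded duration must be positive")
    return sum(s.f_score * s.coded_seconds for s in scores) / total


# Made-up numbers, purely illustrative.
vtc_vs_human = [
    CoderScore("coder_a", f_score=0.8, coded_seconds=100.0),
    CoderScore("coder_b", f_score=0.6, coded_seconds=300.0),
]
vtc_weighted_f = weighted_mean_f(vtc_vs_human)
print(round(vtc_weighted_f, 4))
```

The same function could be applied to the human-human comparisons, so that the two weighted averages can be set side by side. How to weight a human-human pair (for example, by the data coded by one of the two coders) is a choice the sketch leaves open.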

alecristia Jun 14, 2022
Maintainer Author

Answer selected by alecristia