The paper submitted to [Elpub 2016] is accessible at
http://dx.doi.org/10.3233/978-1-61499-649-1-105.
The extended version of this paper is accessible at https://arxiv.org/abs/1611.01820
The paper is titled: Identifying and Improving Dataset References in Social Sciences Full Texts
- The suggested approach aims to create explicit links to datasets in papers that have already been published.
- We suggest and evaluate a semi-automatic approach (using the [da|ra repository](http://www.da-ra.de)) for finding references to datasets in social sciences papers.
- It performed well on a small test corpus (gold standard). The mda papers in the 'Source_file' path were used as the test corpus.
- 'ELPub_Corpus_evaluation.xlsx' contains some information about the test corpus and the gold standard.
- A combination of cosine similarity and tf-idf is the main part of the suggested approach.
- Our approach achieved an F-measure of 0.854 for identifying references in full texts and an F-measure of 0.679 for finding correct
matches of detected references in the da|ra dataset registry.
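
The combination of tf-idf and cosine similarity mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the paper: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_titles`), the simple tokenizer, the smoothed idf formula and the example titles are all hypothetical. The idea is to weight the terms of a candidate sentence and of each dataset title with tf-idf, then rank the titles by cosine similarity to the sentence.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on non-alphanumeric characters.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Build one sparse tf-idf vector (term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Assumption: smoothed idf, so that every term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for toks in tokenized:
        term_freq = Counter(toks)
        vectors.append({t: term_freq[t] * idf[t] for t in term_freq})
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(sentence, titles):
    # Score every dataset title against the sentence, best match first.
    vectors = tfidf_vectors(titles + [sentence])
    query = vectors[-1]
    scored = [(cosine(query, vec), title) for vec, title in zip(vectors, titles)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Hypothetical dataset titles, standing in for registry entries.
    titles = [
        "Example Household Panel 2010",
        "Example Election Study 2009",
        "Sample Youth Survey 2012",
    ]
    sentence = "We analyse data from the Example Election Study 2009 to estimate turnout."
    for score, title in rank_titles(sentence, titles):
        print(f"{score:.3f}  {title}")
```

In a semi-automatic setting, a ranked list like this would be shown to a person, who confirms or rejects the top candidates instead of accepting the best score blindly.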