
DBpedia Spotlight is a tool for automatically annotating mentions of DBpedia resources in text.


tgalery/dbpedia-spotlight

 
 


DBpedia Spotlight

Shedding Light on the Web of Documents

DBpedia Spotlight looks for mentions of about 3.5 million things in text, either of unknown type or of one of about 320 known types, and tries to link them to their globally unique identifiers in DBpedia.

Demonstration

Go to our Demonstration page, paste in some text, and adjust the parameters to see how it works.

Call our web service

You can use our demonstration Web Service directly from your application.

curl http://spotlight.dbpedia.org/rest/annotate \
  --data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break
  for students included in last year's economic stimulus package, arguing
  that the policy provides more generous assistance." \
  --data "confidence=0.2" \
  --data "support=20"
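
For callers working from application code rather than the shell, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_annotate_request` and `annotate` are hypothetical, and the `Accept: application/json` header is an assumption about the output formats the service offers (drop it to get the service's default output). The endpoint and the `text`, `confidence`, and `support` fields are taken from the curl call above.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl example above.
ANNOTATE_URL = "http://spotlight.dbpedia.org/rest/annotate"


def build_annotate_request(text, confidence=0.2, support=20, url=ANNOTATE_URL):
    """Build a POST request equivalent to the curl call (hypothetical helper)."""
    # Same three form fields as the curl example; urlencode percent-encodes them.
    body = urlencode({"text": text, "confidence": confidence, "support": support})
    return Request(
        url,
        data=body.encode("utf-8"),  # supplying data makes urllib send a POST
        # Assumption: the service honours an Accept header asking for JSON.
        headers={"Accept": "application/json"},
    )


def annotate(text, **params):
    """Send the request and return the raw response body as text."""
    with urlopen(build_annotate_request(text, **params), timeout=30) as response:
        return response.read().decode("utf-8")
```

Calling `annotate("President Obama called Wednesday on Congress to extend a tax break.")` performs the network request. Keeping request construction separate from sending makes the encoding easy to inspect offline, and the `url` parameter lets the same code point at a different server.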

Run your own server

If you need service reliability and lower response times, you can run DBpedia Spotlight on your own in-house server. Try our automated setup script for dpkg:

wget http://spotlight.sztaki.hu/downloads/dbpedia-spotlight-0.6.deb
dpkg -i dbpedia-spotlight-0.6.deb   # may require root privileges
# Follow the installation assistant, then start the server for your language:
/usr/bin/dbpedia-spotlight-[language]

Build from source

We provide a Java/Scala API for you to use our code in your application.


Licenses

All the original code produced for DBpedia Spotlight is licensed under the Apache License, 2.0. Some modules depend on LingPipe under the Royalty Free License. Some of our original code currently depends on GPL-licensed or LGPL-licensed code and is therefore also GPL or LGPL, respectively. We are cleaning up the dependencies to release two builds, one purely GPL and one purely Apache License, 2.0.

The documentation on this website is shared under the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Citation

If you use the current (statistical) version of DBpedia Spotlight, please cite the following paper. This is the version that is available for download and that powers the demos.

@inproceedings{isem2013daiber,
  title = {Improving Efficiency and Accuracy in Multilingual Entity Extraction},
  author = {Joachim Daiber and Max Jakob and Chris Hokamp and Pablo N. Mendes},
  year = {2013},
  booktitle = {Proceedings of the 9th International Conference on Semantic Systems (I-Semantics)}
}

If you use the Lucene-based version of DBpedia Spotlight, please cite the following paper:

@inproceedings{isem2011mendesetal,
  title = {DBpedia Spotlight: Shedding Light on the Web of Documents},
  author = {Pablo N. Mendes and Max Jakob and Andres Garcia-Silva and Christian Bizer},
  year = {2011},
  booktitle = {Proceedings of the 7th International Conference on Semantic Systems (I-Semantics)},
  abstract = {Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.}
}

Documentation

More documentation is available from the DBpedia Spotlight wiki.


Languages

  • Scala 49.9%
  • Java 46.7%
  • Shell 1.7%
  • Python 1.6%
  • PigLatin 0.1%