diff --git a/topics/entity-resolution/index.md b/topics/entity-resolution/index.md
new file mode 100644
index 00000000000..c654c112b57
--- /dev/null
+++ b/topics/entity-resolution/index.md
@@ -0,0 +1,12 @@
+---
+aliases: entity-matching, entity-linking, link-discovery, deduplication, de-duplication, data-matching, record-linkage, data-disambiguation
+created_by: Halbert L. Dunn
+display_name: Entity resolution
+released: 1946
+short_description: Entity resolution is the task of detecting different entity profiles that describe the same real-world object.
+topic: entity-resolution
+related: artificial-intelligence, nlp
+github_url: https://github.com/entity-resolution
+wikipedia_url: https://en.wikipedia.org/wiki/Record_linkage
+---
+Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining data sets on entities that may not share a common identifier (e.g., database key, URI, national identification number); the identifier may be missing or inconsistent because of differences in record shape, storage location, or curator style or preference.
\ No newline at end of file