From general to specialized domain: Analyzing three crucial problems of biomedical entity disambiguation

Stefan Zwicklbauer, Christin Seifert, Michael Granitzer

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

4 Citations (Scopus)
7 Downloads (Pure)

Abstract

Entity disambiguation is the task of mapping ambiguous terms in natural-language text to their entities in a knowledge base. Most disambiguation systems focus on general-purpose knowledge bases like DBpedia but leave open the question of how those results generalize to more specialized domains. This question is very important in the context of Linked Open Data, which forms an enormous resource for disambiguation. We implement a ranking-based (Learning To Rank) disambiguation system and provide a systematic evaluation of biomedical entity disambiguation with respect to three crucial and well-known properties of specialized disambiguation systems. These are (i) entity context, i.e. the way entities are described, (ii) user data, i.e. the quantity and quality of externally disambiguated entities, and (iii) the quantity and heterogeneity of entities to disambiguate, i.e. the number and size of different domains in a knowledge base. Our results show that (i) the choice of entity context that attains the best disambiguation results strongly depends on the amount of available user data, (ii) disambiguation results with large-scale and heterogeneous knowledge bases strongly depend on the entity context, (iii) disambiguation results are robust against a moderate amount of noise in user data, and (iv) some results can be significantly improved with a federated disambiguation approach that uses different entity contexts. Our results indicate that disambiguation systems must be carefully adapted when expanding their knowledge bases with special domain entities.
Original language: English
Title of host publication: Database and Expert Systems Applications
Subtitle of host publication: 26th International Conference, DEXA 2015, Valencia, Spain, September 1-4, 2015, Proceedings, Part I
Editors: Qiming Chen
Place of Publication: Cham
Publisher: Springer
Pages: 76-93
Number of pages: 18
ISBN (Electronic): 978-3-319-22849-5
ISBN (Print): 978-3-319-22848-8
Publication status: Published - 2015
Externally published: Yes
Event: 26th International Conference on Database and Expert Systems Applications, DEXA 2015 - Valencia, Spain
Duration: 1 Sept 2015 to 4 Sept 2015
Conference number: 26
http://www.dexa.org/previous/dexa2015/

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer
Volume: 9261
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Database and Expert Systems Applications, DEXA 2015
Abbreviated title: DEXA
Country/Territory: Spain
City: Valencia
Period: 1/09/15 to 4/09/15

Keywords

  • Entity disambiguation
  • Learning to rank
  • Linked data
  • Semantic web
  • n/a OA procedure
