Abstract
Entity disambiguation is the task of mapping ambiguous terms in natural-language text to their entities in a knowledge base. Most disambiguation systems focus on general-purpose knowledge bases like DBpedia but leave open the question of how their results generalize to more specialized domains. This question is particularly important in the context of Linked Open Data, which forms an enormous resource for disambiguation. We implement a ranking-based (Learning To Rank) disambiguation system and provide a systematic evaluation of biomedical entity disambiguation with respect to three crucial and well-known properties of specialized disambiguation systems. These are (i) entity context, i.e. the way entities are described, (ii) user data, i.e. the quantity and quality of externally disambiguated entities, and (iii) the quantity and heterogeneity of entities to disambiguate, i.e. the number and size of different domains in a knowledge base. Our results show that (i) the choice of entity context that attains the best disambiguation results strongly depends on the amount of available user data, (ii) disambiguation results with large-scale and heterogeneous knowledge bases strongly depend on the entity context, (iii) disambiguation results are robust against a moderate amount of noise in user data, and (iv) some results can be significantly improved with a federated disambiguation approach that uses different entity contexts. Our results indicate that disambiguation systems must be carefully adapted when expanding their knowledge bases with special domain entities.
Original language | English |
---|---|
Title of host publication | Database and Expert Systems Applications |
Subtitle of host publication | 26th International Conference, DEXA 2015, Valencia, Spain, September 1-4, 2015, Proceedings, Part I |
Editors | Qiming Chen |
Place of Publication | Cham |
Publisher | Springer |
Pages | 76-93 |
Number of pages | 18 |
ISBN (Electronic) | 978-3-319-22849-5 |
ISBN (Print) | 978-3-319-22848-8 |
Publication status | Published - 2015 |
Externally published | Yes |
Event | 26th International Conference on Database and Expert Systems Applications, DEXA 2015 - Valencia, Spain; Duration: 1 Sept 2015 → 4 Sept 2015; Conference number: 26; http://www.dexa.org/previous/dexa2015/ |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 9261 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 26th International Conference on Database and Expert Systems Applications, DEXA 2015 |
---|---|
Abbreviated title | DEXA |
Country/Territory | Spain |
City | Valencia |
Period | 1/09/15 → 4/09/15 |
Keywords
- Entity disambiguation
- Learning to rank
- Linked data
- Semantic web