Improving Named Entity Disambiguation by Iteratively Enhancing Certainty of Extraction

Mena Badieh Habib, Maurice van Keulen

Research output: Book/Report › Report › Professional


Abstract

Named entity extraction and disambiguation have received much attention in recent years. Typical fields addressing these topics are information retrieval, natural language processing, and the semantic web. This paper addresses two problems with named entity extraction and disambiguation. First, almost no existing work examines the interdependency between extraction and disambiguation. Second, existing disambiguation techniques mostly take extracted named entities as input without considering the uncertainty and imperfection of the extraction process. The aim of this paper is to investigate both avenues and to show that explicit handling of the uncertainty of annotation has much potential for making both extraction and disambiguation more robust. We conducted experiments on a set of holiday home descriptions with the aim of extracting and disambiguating toponyms as a representative example of named entities. We show that the effectiveness of extraction influences the effectiveness of disambiguation and, reciprocally, that retraining the extraction models with information automatically derived from the disambiguation results improves those models. This mutual reinforcement is shown to still have an effect after several iterations.
Original language: Undefined
Place of Publication: Enschede
Publisher: Centre for Telematics and Information Technology (CTIT)
Number of pages: 8
Publication status: Published - Dec 2011

Publication series

Name: CTIT Technical Report Series
Publisher: University of Twente, Centre for Telematics and Information Technology (CTIT)
No.: TR-CTIT-11-29
ISSN (Print): 1381-3625

Keywords

  • IR-78964
  • METIS-281655
  • EWI-21023
  • Uncertain Annotations
  • Named Entity Disambiguation
  • Named Entity Extraction

Cite this

Habib, M. B., & van Keulen, M. (2011). Improving Named Entity Disambiguation by Iteratively Enhancing Certainty of Extraction. (CTIT Technical Report Series; No. TR-CTIT-11-29). Enschede: Centre for Telematics and Information Technology (CTIT).