Unsupervised improvement of named entity extraction in short informal context using disambiguation clues

Mena Badieh Habib, Maurice van Keulen

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

16 Citations (Scopus)
148 Downloads (Pure)

Abstract

Short context messages (like tweets and SMS messages) are a potentially rich source of continuously and instantly updated information. The shortness and informality of such messages pose challenges for Natural Language Processing tasks. Most efforts in this direction rely on machine learning techniques, which are expensive in terms of data collection and training. In this paper we present an unsupervised Semantic Web-driven approach to improve the extraction process by using clues from the disambiguation process. For extraction we used a simple Knowledge-Base matching technique combined with a clustering-based approach for disambiguation. Experimental results on a self-collected set of tweets (as an example of short context messages) show improvement in extraction results when using unsupervised feedback from the disambiguation process.
Original language: Undefined
Title of host publication: Workshop on Semantic Web and Information Extraction, SWAIE 2012
Place of Publication: Germany
Publisher: CEUR
Pages: 1-10
Number of pages: 10
Publication status: Published - Oct 2012
Event: Workshop on Semantic Web and Information Extraction, SWAIE 2012 - Galway, Ireland
Duration: 8 Oct 2012 to 12 Oct 2012

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR-WS.org
Volume: 925
ISSN (Print): 1613-0073

Workshop

Workshop: Workshop on Semantic Web and Information Extraction, SWAIE 2012
Period: 8/10/12 to 12/10/12
Other: 8-12 October 2012

Keywords

  • METIS-289691
  • IR-83356
  • EWI-22245
  • Named Entity Recognition
  • Named Entity Linking
  • Named Entity Extraction
  • Named Entity Disambiguation
  • Twitter
  • Tweets
  • Microblogs
