Traitor: associating concepts using the world wide web

Wanno Drijfhout, Oliver Jundt, L. Wevers, Djoerd Hiemstra

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

We use Common Crawl's 25 TB data set of web pages to construct a database of associated concepts using Hadoop. The database can be queried through a web application with two query interfaces. A textual interface allows searching for similarities and differences between multiple concepts using a query language similar to set notation, and a graphical interface allows users to visualize similarity relationships between concepts in a force-directed graph.
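
The abstract does not describe the actual pipeline or query syntax, so the following is only a minimal sketch of the general idea under stated assumptions: concepts are treated as associated when they co-occur on a page, the counting is written as a MapReduce-style map and reduce step in plain Python rather than Hadoop, and "similarities" and "differences" are modelled as set intersection and set difference. The toy pages and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy corpus standing in for crawled web pages.
PAGES = [
    "coffee tea caffeine",
    "coffee espresso caffeine",
    "tea herbal caffeine",
    "espresso coffee milk",
]


def map_page(text):
    """Map step: emit ((a, b), 1) for each unordered pair of distinct terms on a page."""
    terms = sorted(set(text.split()))
    for a, b in combinations(terms, 2):
        yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts for each term pair."""
    totals = Counter()
    for key, n in pairs:
        totals[key] += n
    return totals


def build_associations(pages):
    """Build pair counts and a symmetric term -> associated-terms index."""
    totals = reduce_counts(kv for page in pages for kv in map_page(page))
    assoc = defaultdict(set)
    for a, b in totals:
        assoc[a].add(b)
        assoc[b].add(a)
    return totals, assoc


totals, assoc = build_associations(PAGES)

# Assumed set-style queries: what two concepts share, and what only one has.
similar = assoc["coffee"] & assoc["tea"]
different = assoc["coffee"] - assoc["tea"]
print(sorted(similar))
print(sorted(different))
```

In this sketch the intersection plays the role of a similarity query and the difference that of a difference query; a real system over a web-scale crawl would also need concept extraction and some weighting of the raw counts, which are left out here.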
Original language: Undefined
Title of host publication: Proceedings of the 13th Dutch-Belgian Workshop on Information Retrieval, DIR 2013
Place of Publication: Aachen, Germany
Publisher: CEUR
Pages: 56-57
Number of pages: 2
Publication status: Published - Apr 2013
Event: 13th Dutch-Belgian Information Retrieval Workshop, DIR 2013 - Delft, Netherlands
Duration: 26 Apr 2013 to 26 Apr 2013
Conference number: 13

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR
Volume: 986
ISSN (Print): 1613-0073

Workshop

Workshop: 13th Dutch-Belgian Information Retrieval Workshop, DIR 2013
Abbreviated title: DIR
Country: Netherlands
City: Delft
Period: 26/04/13 to 26/04/13

Keywords

  • EWI-23832
  • CR-H.3.1
  • CR-H.3.3
  • METIS-300084
  • Question Answering
  • IR-88328
  • MapReduce
  • Information Extraction
