Exploring Topic-based Language Models for Effective Web Information Retrieval

Rongmei Li, Rianne Kaptein, Djoerd Hiemstra, Jaap Kamps

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

The main obstacle to providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad-hoc search. Our experimental results show that the topic-based model outperforms both the standard language model and the parsimonious model.

Original language: English
Title of host publication: Proceedings of the Dutch-Belgian Information Retrieval Workshop (DIR 2008)
Editors: E. Hoenkamp, M. De Cock, V. Hoste
Place of Publication: Enschede
Publisher: Neslia Paniculata
Number of pages: 7
ISBN (Print): 978-90-5681-282-9
Publication status: Published - 14 Apr 2008
Event: 8th Dutch-Belgian Information Retrieval Workshop, DIR 2008 - Maastricht, Netherlands
Duration: 14 Apr 2008 to 15 Apr 2008
Conference number: 8


Conference: 8th Dutch-Belgian Information Retrieval Workshop, DIR 2008
Abbreviated title: DIR


  • IR-64722
  • METIS-250952
  • EWI-12277

