Using Parsimonious Language Models on Web Data

Rianne Kaptein, Rongmei Li, Djoerd Hiemstra, Jaap Kamps

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In this paper we explore the use of parsimonious language models for web retrieval. These models are smaller and thus more efficient than standard language models, which makes them well suited for large-scale web retrieval. We conducted experiments on four TREC topic sets and found that the parsimonious language model improves retrieval effectiveness over the standard language model for all data sets and measures. In all cases the improvement is significant, and more substantial than in earlier experiments on newspaper/newswire data.
Original language: English
Title of host publication: SIGIR '08
Subtitle of host publication: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: Tat-Seng Chua, Mun-Kew Leong
Place of Publication: New York NY, USA
Publisher: ACM Press
Pages: 763-764
Number of pages: 2
ISBN (Print): 978-1-60558-164-4
DOIs: https://doi.org/10.1145/1390334.1390491
Publication status: Published - 20 Jul 2008
Event: 31st Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008 - Singapore, Singapore
Duration: 20 Jul 2008 to 25 Jul 2008
Conference number: 31

Conference

Conference: 31st Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008
Abbreviated title: SIGIR
Country: Singapore
City: Singapore
Period: 20/07/08 to 25/07/08

Keywords

  • DB-IR: INFORMATION RETRIEVAL
  • IR-64723
  • METIS-250954
  • EWI-12282

Cite this

Kaptein, R., Li, R., Hiemstra, D., & Kamps, J. (2008). Using Parsimonious Language Models on Web Data. In T-S. Chua, & M-K. Leong (Eds.), SIGIR '08: Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 763-764). New York NY, USA: ACM Press. https://doi.org/10.1145/1390334.1390491