Collection Selection with Highly Discriminative Keys

S. Bockting, Djoerd Hiemstra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

Abstract

The centralized web search paradigm introduces several problems, such as the large data traffic required for crawling, the difficulty of keeping the index fresh, and the difficulty of indexing everything. In this study, we examine collection selection using highly discriminative keys and query-driven indexing as part of a distributed web search system. The approach is evaluated on different splits of the TREC WT10g corpus. Experimental results show that, assuming web servers index their local content, the approach outperforms a language modeling approach with Dirichlet smoothing for collection selection.

Original language: Undefined
Title of host publication: Proceedings of the 7th Workshop on Large-Scale Distributed Systems for Information Retrieval
Publisher: CEUR
Pages: 9-16
Number of pages: 8
Publication status: Published - 23 Jul 2009

Publication series

Name: CEUR Workshop Series
Publisher: CEUR-WS
Volume: 480
ISSN (Print): 1613-0073

Keywords

  • METIS-263974
  • EWI-15896
  • Distributed Information Retrieval
  • IR-67569

Cite this

Bockting, S., & Hiemstra, D. (2009). Collection Selection with Highly Discriminative Keys. In Proceedings of the 7th Workshop on Large-Scale Distributed Systems for Information Retrieval (pp. 9-16). (CEUR Workshop Series; Vol. 480). CEUR.