Query-Based Sampling using Only Snippets

A.S. Tigelaar, Djoerd Hiemstra

Research output: Book/Report › Report › Professional


Abstract

Query-based sampling is a popular approach to modelling the content of an uncooperative server. It works by sending queries to the server and downloading, in full, the documents returned in the search results. This sample of documents then represents the server’s content. We present an approach that uses the document snippets as samples instead of downloading entire documents. This yields more stable results for the same bandwidth usage as the full-document approach. Additionally, we show that using snippets does not necessarily incur more latency and can actually save time.
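
The sampling loop described in the abstract can be sketched as follows. This is only an illustrative sketch, not the procedure from the report: `ToyServer`, `sample_with_snippets`, the fixed-width snippet window, and the choice of picking the next query term uniformly at random from the terms seen so far are all assumptions made for the example.

```python
import random
from collections import Counter


class ToyServer:
    """Hypothetical in-memory stand-in for an uncooperative search server."""

    def __init__(self, documents, snippet_words=4):
        self.documents = documents
        self.snippet_words = snippet_words

    def search(self, term, limit=3):
        """Return up to `limit` (doc_id, snippet) pairs for documents containing `term`."""
        results = []
        for doc_id, text in enumerate(self.documents):
            words = text.split()
            if term in words:
                # Cut a short window of words around the first match.
                i = words.index(term)
                start = max(0, i - self.snippet_words // 2)
                snippet = " ".join(words[start:start + self.snippet_words])
                results.append((doc_id, snippet))
                if len(results) == limit:
                    break
        return results


def sample_with_snippets(server, seed_term, iterations, rng):
    """Build a term-frequency model of the server from snippets only."""
    model = Counter()
    seen = set()  # (doc_id, snippet) pairs already counted
    term = seed_term
    for _ in range(iterations):
        for doc_id, snippet in server.search(term):
            if (doc_id, snippet) not in seen:
                seen.add((doc_id, snippet))
                model.update(snippet.split())
        if not model:
            break  # the seed term matched nothing, so there is nothing to query with
        # Assumed strategy: next query is a random term from the model so far.
        term = rng.choice(sorted(model))
    return model


if __name__ == "__main__":
    docs = [
        "the quick brown fox jumps over the lazy dog",
        "a lazy afternoon with a good book",
        "fox hunting is banned in many countries",
    ]
    model = sample_with_snippets(ToyServer(docs), "fox", 5, random.Random(0))
    print(model.most_common(5))
```

The full-document variant would differ only in the inner loop: instead of counting the words of each snippet, it would fetch and count the words of each returned document, at a higher bandwidth cost per query.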
Original language: Undefined
Place of Publication: Enschede
Publisher: Centre for Telematics and Information Technology (CTIT)
Number of pages: 12
Publication status: Published - 26 Nov 2009

Publication series

Name: CTIT Technical Report Series
Publisher: Centre for Telematics and Information Technology, University of Twente
No.: TR-CTIT-09-42
ISSN (Print): 1381-3625

Keywords

  • IR-68676
  • METIS-265244
  • DB-IR: INFORMATION RETRIEVAL
  • EWI-16572
  • DB-DFDB: DISTRIBUTED OR FEDERATED DATABASES
