The Importance of Prior Probabilities for Entry Page Search

W. Kraaij, T.H.W. Westerveld, Djoerd Hiemstra

Research output: Contribution to conference › Paper

Abstract

An important class of searches on the World Wide Web aims to find the entry page (homepage) of an organisation. Entry page search is quite different from Ad Hoc search, and a plain Ad Hoc system performs disappointingly on it. We explored three non-content features of web pages: page length, number of incoming links, and URL form. The URL form in particular proved to be a good predictor. Using URL form priors, we found over 70% of all entry pages at rank 1 and up to 89% in the top 10. Non-content features can easily be embedded in a language model framework as a prior probability.
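As a rough illustration of how a non-content feature could enter a language model as a prior, the sketch below ranks pages by log P(D) + log P(Q|D). It is not the authors' implementation. The URL categories, the prior values in `PRIOR`, the smoothing weight `lam`, and the toy pages are all assumptions made up for this example; the abstract only states that a URL form prior was used.

```python
import math
from collections import Counter

# Hypothetical prior P(entry page | URL form); values are illustrative only.
PRIOR = {"root": 0.6, "path": 0.3, "file": 0.1}


def url_form(url):
    """Classify a URL by its shape (assumed categories, not taken from the paper)."""
    path = url.split("://", 1)[-1].partition("/")[2].rstrip("/")
    if path in ("", "index.html"):
        return "root"  # bare domain, optionally followed by index.html
    if "." in path.rsplit("/", 1)[-1]:
        return "file"  # last path segment looks like a file name
    return "path"      # directory-like path


def score(query, text, url, collection, lam=0.5):
    """Return log P(D) + log P(Q|D) under a smoothed unigram language model."""
    terms = text.lower().split()
    tf = Counter(terms)
    total = sum(collection.values())
    vocab = len(collection)
    log_p = math.log(PRIOR[url_form(url)])  # the document prior
    for t in query.lower().split():
        p_doc = tf[t] / len(terms)
        # Add-one smoothing keeps the collection estimate non-zero.
        p_col = (collection[t] + 1) / (total + vocab)
        log_p += math.log(lam * p_doc + (1 - lam) * p_col)
    return log_p


# Toy pages: both match the query equally well on content alone.
docs = {
    "http://example.org/": "example organisation welcome",
    "http://example.org/news/report.html":
        "example organisation example organisation annual report",
}
collection = Counter(w for text in docs.values() for w in text.lower().split())
ranking = sorted(
    docs,
    key=lambda u: score("example organisation", docs[u], u, collection),
    reverse=True,
)
print(ranking[0])
```

In this toy setup the two pages have the same query likelihood, so the prior alone decides the order and the root URL is ranked first.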
Original language: Undefined
Pages: 27-34
Number of pages: 8
DOIs: https://doi.org/10.1145/564376.564383
Publication status: Published - Aug 2002
Event: 25th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2002 - Tampere, Finland
Duration: 11 Aug 2002 to 15 Aug 2002
Conference number: 25
http://sigir.org/sigir2002/

Conference

Conference: 25th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2002
Abbreviated title: SIGIR
Country: Finland
City: Tampere
Period: 11/08/02 to 15/08/02

Keywords

  • DB-IR: INFORMATION RETRIEVAL
  • EWI-7274
  • IR-63507

Cite this

Kraaij, W., Westerveld, T. H. W., & Hiemstra, D. (2002). The Importance of Prior Probabilities for Entry Page Search. 27-34. Paper presented at 25th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2002, Tampere, Finland. https://doi.org/10.1145/564376.564383