Improved query difficulty prediction for the web

C. Hauff, V. Murdock, R. Baeza-Yates

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

    62 Citations (Scopus)
    44 Downloads (Pure)

    Abstract

    Query performance prediction aims to predict whether a query will have a high or a low average precision given retrieval from a particular collection. An accurate estimator of the quality of search engine results can allow the search engine to decide to which queries to apply query expansion, for which queries to suggest alternative search terms, to adjust the sponsored results, or to return results from specialized collections. In this paper we present an evaluation of state-of-the-art query prediction algorithms, both post-retrieval and pre-retrieval, and we analyze their sensitivity towards the retrieval algorithm. We evaluate query difficulty predictors over three widely different collections and query sets and present an analysis of why prediction algorithms perform significantly worse on Web data. Finally, we introduce Improved Clarity and demonstrate that it outperforms state-of-the-art predictors on three standard collections, including two large Web collections.
    Original language: Undefined
    Title of host publication: CIKM '08: Proceedings of the 17th ACM Conference on Information and Knowledge Management
    Place of publication: New York, NY, USA
    Publisher: Association for Computing Machinery
    Pages: 439-448
    Number of pages: 10
    ISBN (Print): 978-1-59593-991-3
    Publication status: Published - 2008
    Event: 17th ACM Conference on Information and Knowledge Management, CIKM 2008 - Napa Valley, United States
    Duration: 26 Oct 2008 to 30 Oct 2008
    Conference number: 17

    Publication series

    Name
    Publisher: ACM
    Number: 08332

    Conference

    Conference: 17th ACM Conference on Information and Knowledge Management, CIKM 2008
    Abbreviated title: CIKM
    Country/Territory: United States
    City: Napa Valley
    Period: 26/10/08 to 30/10/08

    Keywords

    • IR-62541
    • METIS-252112
    • EWI-14111
    • HMI-IE: Information Engineering
    • CR-H.3
