Improved query difficulty prediction for the web

C. Hauff, V. Murdock, R. Baeza-Yates

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


    Query performance prediction aims to predict whether a query will have high or low average precision given retrieval from a particular collection. An accurate estimator of the quality of search engine results allows the search engine to decide which queries to apply query expansion to, for which queries to suggest alternative search terms, when to adjust the sponsored results, or when to return results from specialized collections. In this paper, we present an evaluation of state-of-the-art query prediction algorithms, both post-retrieval and pre-retrieval, and we analyze their sensitivity to the retrieval algorithm. We evaluate query difficulty predictors over three widely different collections and query sets and present an analysis of why prediction algorithms perform significantly worse on Web data. Finally, we introduce Improved Clarity and demonstrate that it outperforms state-of-the-art predictors on three standard collections, including two large Web collections.
    Original language: Undefined
    Title of host publication: CIKM '08: Proceeding of the 17th ACM conference on Information and knowledge management
    Place of Publication: New York, NY, USA
    Publisher: Association for Computing Machinery
    Number of pages: 10
    ISBN (Print): 978-1-59593-991-3
    Publication status: Published - 2008
    Event: 17th ACM Conference on Information and Knowledge Management, CIKM 2008 - Napa Valley, United States
    Duration: 26 Oct 2008 to 30 Oct 2008
    Conference number: 17

    Conference: 17th ACM Conference on Information and Knowledge Management, CIKM 2008
    Abbreviated title: CIKM
    Country/Territory: United States
    City: Napa Valley


    • IR-62541
    • METIS-252112
    • EWI-14111
    • HMI-IE: Information Engineering
    • CR-H.3
