Retrieval system evaluation: automatic evaluation versus incomplete judgments

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    5 Citations (Scopus)

    Abstract

    In information retrieval (IR), research aiming to reduce the cost of retrieval system evaluations has been conducted along two lines: (i) the evaluation of IR systems with reduced amounts of manual relevance assessments, and (ii) the fully automatic evaluation of IR systems, thus forgoing the need for manual assessments altogether. The proposed methods in both areas are commonly evaluated by comparing their performance estimates for a set of systems to a ground truth (provided, for instance, by evaluating the set of systems according to mean average precision). In contrast, in this poster we compare an automatic system evaluation approach directly to two evaluations based on incomplete manual relevance assessments. For the particular case of TREC's Million Query track, we show that the automatic evaluation leads to results that are highly correlated with those achieved by approaches relying on incomplete manual judgments.
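    Agreement between two such evaluations is typically quantified by a rank correlation coefficient over the system rankings they induce. As a minimal sketch, assuming Kendall's tau as the correlation measure (the abstract does not name the specific measure, and the scores below are made-up illustrative values, not data from the poster):

    ```python
    from itertools import combinations

    def kendall_tau(scores_a, scores_b):
        """Kendall's tau between the rankings induced by two score lists
        for the same set of systems (simple variant, no tie correction)."""
        assert len(scores_a) == len(scores_b)
        concordant = discordant = 0
        for i, j in combinations(range(len(scores_a)), 2):
            # A pair of systems is concordant if both evaluations
            # order it the same way, discordant otherwise.
            product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
        pairs = len(scores_a) * (len(scores_a) - 1) // 2
        return (concordant - discordant) / pairs

    # Hypothetical MAP-style scores for five systems under the two evaluations
    auto_eval   = [0.31, 0.28, 0.25, 0.22, 0.18]  # automatic evaluation
    manual_eval = [0.35, 0.30, 0.24, 0.26, 0.19]  # incomplete manual judgments

    print(kendall_tau(auto_eval, manual_eval))  # → 0.8
    ```

    A tau close to 1 means the automatic evaluation ranks the systems almost exactly as the judgment-based evaluation does, which is the kind of high correlation the poster reports.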
    Original language: Undefined
    Title of host publication: Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    Place of publication: New York
    Publisher: Association for Computing Machinery (ACM)
    Pages: 863-864
    Number of pages: 2
    ISBN (Print): 978-1-4503-0153-4
    DOIs
    Publication status: Published - Jul 2010
    Event: 33rd Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010 - Geneva, Switzerland
    Duration: 19 Jul 2010 – 23 Jul 2010
    Conference number: 33

    Publication series

    Name
    Publisher: ACM

    Conference

    Conference: 33rd Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010
    Abbreviated title: SIGIR
    Country: Switzerland
    City: Geneva
    Period: 19/07/10 – 23/07/10

    Keywords

    • IR-72484
    • METIS-270945
    • CR-H.3
    • Evaluation
    • EWI-18226
    • Information Retrieval