How Random Is a Classifier Given Its Area under Curve?

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    Abstract

    When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.
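A minimal sketch of how an exact null distribution of this kind can be computed, assuming the standard combinatorial model in which every ordering of m genuine and n imposter scores (all distinct) is equally likely. The abstract does not state the paper's formulas, so the recurrence and the variance (m + n + 1) / (12 m n) used below are the textbook Mann-Whitney results and may differ in form from the authors' presentation. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt


def restricted_partition_counts(m, n):
    """counts[u] = number of partitions of u into at most m parts, each <= n.

    Uses the recurrence G(i, j) = G(i, j-1) + q^j * G(i-1, j) on the
    coefficient lists of the Gaussian binomial polynomials, G(0, j) = G(i, 0) = 1.
    """
    row = [[1] for _ in range(n + 1)]          # G(0, j) for j = 0..n
    for i in range(1, m + 1):
        new_row = [[1]]                        # G(i, 0)
        for j in range(1, n + 1):
            coeffs = [0] * (i * j + 1)
            for u, c in enumerate(new_row[j - 1]):   # G(i, j-1), unshifted
                coeffs[u] += c
            for u, c in enumerate(row[j]):           # G(i-1, j), shifted by j
                coeffs[u + j] += c
            new_row.append(coeffs)
        row = new_row
    return row[n]


def exact_auc_distribution(m, n):
    """Map each attainable AUC value u/(m*n) to its exact probability."""
    counts = restricted_partition_counts(m, n)
    total = comb(m + n, m)                     # number of equally likely orderings
    return {Fraction(u, m * n): Fraction(c, total) for u, c in enumerate(counts)}


def two_sided_p_value(auc, m, n):
    """P(random classifier's AUC is at least as far from 0.5 as `auc`)."""
    counts = restricted_partition_counts(m, n)
    distance = abs(auc - 0.5)
    extreme = sum(c for u, c in enumerate(counts)
                  if abs(u / (m * n) - 0.5) >= distance - 1e-12)
    return extreme / comb(m + n, m)


def normal_approximation(m, n):
    """Mean and standard deviation of the approximating normal distribution."""
    return 0.5, sqrt((m + n + 1) / (12 * m * n))


if __name__ == "__main__":
    # Hypothetical small-sample case: 5 genuine and 5 imposter scores.
    print(two_sided_p_value(0.8, 5, 5))
    print(normal_approximation(5, 5))
```

With few scores the attainable AUC values are coarse (multiples of 1/(m n)), which is one reason an exact distribution can differ noticeably from a continuous approximation.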

    Original language: English
    Title of host publication: 2017 International Conference of the Biometrics Special Interest Group (BIOSIG)
    Subtitle of host publication: BIOSIG 2017
    Editors: Arslan Brömme, Christoph Busch, Antitza Dantcheva, Christian Rathgeb, Andreas Uhl
    Publisher: Gesellschaft für Informatik
    ISBN (Electronic): 9783885796640
    DOIs: https://doi.org/10.23919/BIOSIG.2017.8053509
    Publication status: Published - 28 Sep 2017
    Event: 16th International Conference of the Biometrics Special Interest Group 2017 - Darmstadt, Germany
    Duration: 20 Sep 2017 to 22 Sep 2017
    Conference number: 16
    http://fg-biosig.gi.de/archiv/biosig-2017.html

    Conference

    Conference: 16th International Conference of the Biometrics Special Interest Group 2017
    Abbreviated title: BIOSIG 2017
    Country: Germany
    City: Darmstadt
    Period: 20/09/17 to 22/09/17
    Internet address: http://fg-biosig.gi.de/archiv/biosig-2017.html

    Fingerprint

    Classifiers
    Number theory
    Normal distribution
    Probability distributions

    Keywords

    • Approximation
    • AUC
    • Exact Distribution
    • Random Classifier

    Cite this

    Zeinstra, C., Veldhuis, R., & Spreeuwers, L. (2017). How Random Is a Classifier Given Its Area under Curve? In A. Brömme, C. Busch, A. Dantcheva, C. Rathgeb, & A. Uhl (Eds.), 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017 [8053509] Gesellschaft für Informatik. https://doi.org/10.23919/BIOSIG.2017.8053509
    Zeinstra, Chris ; Veldhuis, Raymond ; Spreeuwers, Luuk. / How Random Is a Classifier Given Its Area under Curve?. 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. editor / Arslan Brömme ; Christoph Busch ; Antitza Dantcheva ; Christian Rathgeb ; Andreas Uhl. Gesellschaft für Informatik, 2017.
    @inproceedings{6ba53771416449f395dd162662144aae,
    title = "How Random Is a Classifier Given Its Area under Curve?",
    abstract = "When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.",
    keywords = "Approximation, AUC, Exact Distribution, Random Classifier",
    author = "Chris Zeinstra and Raymond Veldhuis and Luuk Spreeuwers",
    year = "2017",
    month = "9",
    day = "28",
    doi = "10.23919/BIOSIG.2017.8053509",
    language = "English",
    editor = "Arslan Br{\"o}mme and Christoph Busch and Antitza Dantcheva and Christian Rathgeb and Andreas Uhl",
    booktitle = "2017 International Conference of the Biometrics Special Interest Group (BIOSIG)",
    publisher = "Gesellschaft f{\"u}r Informatik",
    address = "Germany",

    }

    Zeinstra, C, Veldhuis, R & Spreeuwers, L 2017, How Random Is a Classifier Given Its Area under Curve? in A Brömme, C Busch, A Dantcheva, C Rathgeb & A Uhl (eds), 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017., 8053509, Gesellschaft für Informatik, 16th International Conference of the Biometrics Special Interest Group 2017, Darmstadt, Germany, 20/09/17. https://doi.org/10.23919/BIOSIG.2017.8053509

    How Random Is a Classifier Given Its Area under Curve? / Zeinstra, Chris; Veldhuis, Raymond; Spreeuwers, Luuk.

    2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. ed. / Arslan Brömme; Christoph Busch; Antitza Dantcheva; Christian Rathgeb; Andreas Uhl. Gesellschaft für Informatik, 2017. 8053509.


    TY - GEN

    T1 - How Random Is a Classifier Given Its Area under Curve?

    AU - Zeinstra, Chris

    AU - Veldhuis, Raymond

    AU - Spreeuwers, Luuk

    PY - 2017/9/28

    Y1 - 2017/9/28

    N2 - When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.

    KW - Approximation

    KW - AUC

    KW - Exact Distribution

    KW - Random Classifier

    UR - http://www.scopus.com/inward/record.url?scp=85034619394&partnerID=8YFLogxK

    U2 - 10.23919/BIOSIG.2017.8053509

    DO - 10.23919/BIOSIG.2017.8053509

    M3 - Conference contribution

    AN - SCOPUS:85034619394

    BT - 2017 International Conference of the Biometrics Special Interest Group (BIOSIG)

    A2 - Brömme, Arslan

    A2 - Busch, Christoph

    A2 - Dantcheva, Antitza

    A2 - Rathgeb, Christian

    A2 - Uhl, Andreas

    PB - Gesellschaft für Informatik

    ER -

    Zeinstra C, Veldhuis R, Spreeuwers L. How Random Is a Classifier Given Its Area under Curve? In Brömme A, Busch C, Dantcheva A, Rathgeb C, Uhl A, editors, 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. Gesellschaft für Informatik. 2017. 8053509 https://doi.org/10.23919/BIOSIG.2017.8053509