How Random Is a Classifier Given Its Area under Curve?

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.
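To make the abstract's central object concrete, the following is a minimal sketch and not the authors' own formulation. It assumes m distinct genuine scores and n distinct imposter scores whose joint ordering is uniformly random, so that AUC = U/(mn), where U is the Mann-Whitney count of correctly ordered pairs. Under that assumption, P(U = u) is the number of partitions of u into at most m parts, each of size at most n, divided by C(m+n, m). The function names (restricted_partitions, exact_auc_pmf, upper_tail, normal_approx_params) are hypothetical and chosen only for this illustration.

```python
from functools import lru_cache
from math import comb, sqrt


@lru_cache(maxsize=None)
def restricted_partitions(u: int, m: int, n: int) -> int:
    """Count partitions of u into at most m parts, each part at most n."""
    if u == 0:
        return 1  # the empty partition
    if u < 0 or m == 0 or n == 0:
        return 0
    # Either every part is at most n - 1, or one part equals n and is removed.
    return (restricted_partitions(u, m, n - 1)
            + restricted_partitions(u - n, m - 1, n))


def exact_auc_pmf(m: int, n: int) -> list:
    """Return P(U = u) for u = 0..m*n; the matching AUC value is u / (m * n)."""
    total = comb(m + n, m)  # number of equally likely genuine/imposter orderings
    return [restricted_partitions(u, m, n) / total for u in range(m * n + 1)]


def upper_tail(auc: float, m: int, n: int) -> float:
    """P(AUC >= auc) for a random classifier (assumed one-sided tail)."""
    pmf = exact_auc_pmf(m, n)
    eps = 1e-12  # tolerance for floating-point comparison of AUC values
    return sum(p for u, p in enumerate(pmf) if u / (m * n) >= auc - eps)


def normal_approx_params(m: int, n: int) -> tuple:
    """Mean and standard deviation of the AUC under the same null assumption."""
    return 0.5, sqrt((m + n + 1) / (12 * m * n))


if __name__ == "__main__":
    m, n = 5, 7  # illustrative sample sizes, not taken from the paper
    print("P(AUC >= 0.8) under randomness:", upper_tail(0.8, m, n))
    print("normal approximation (mean, std):", normal_approx_params(m, n))
```

With only a handful of scores the exact tail probability and the normal approximation can differ noticeably, which is the situation the abstract highlights for forensic casework.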

Original language: English
Title of host publication: 2017 International Conference of the Biometrics Special Interest Group (BIOSIG)
Subtitle of host publication: BIOSIG 2017
Editors: Arslan Brömme, Christoph Busch, Antitza Dantcheva, Christian Rathgeb, Andreas Uhl
Publisher: Gesellschaft für Informatik
ISBN (Electronic): 9783885796640
DOI: https://doi.org/10.23919/BIOSIG.2017.8053509
Publication status: Published - 28 Sep 2017
Event: 16th International Conference of the Biometrics Special Interest Group 2017 - Darmstadt, Germany
Duration: 20 Sep 2017 to 22 Sep 2017
Conference number: 16
http://fg-biosig.gi.de/archiv/biosig-2017.html

Conference

Conference: 16th International Conference of the Biometrics Special Interest Group 2017
Abbreviated title: BIOSIG 2017
Country: Germany
City: Darmstadt
Period: 20/09/17 to 22/09/17
Internet address: http://fg-biosig.gi.de/archiv/biosig-2017.html

Fingerprint

Classifiers
Number theory
Normal distribution
Probability distributions

Keywords

  • Approximation
  • AUC
  • Exact Distribution
  • Random Classifier

Cite this

Zeinstra, C., Veldhuis, R., & Spreeuwers, L. (2017). How Random Is a Classifier Given Its Area under Curve? In A. Brömme, C. Busch, A. Dantcheva, C. Rathgeb, & A. Uhl (Eds.), 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017 [8053509] Gesellschaft für Informatik. https://doi.org/10.23919/BIOSIG.2017.8053509
Zeinstra, Chris ; Veldhuis, Raymond ; Spreeuwers, Luuk. / How Random Is a Classifier Given Its Area under Curve? 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. editor / Arslan Brömme ; Christoph Busch ; Antitza Dantcheva ; Christian Rathgeb ; Andreas Uhl. Gesellschaft für Informatik, 2017.
@inproceedings{6ba53771416449f395dd162662144aae,
title = "How Random Is a Classifier Given Its Area under Curve?",
abstract = "When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.",
keywords = "Approximation, AUC, Exact Distribution, Random Classifier",
author = "Chris Zeinstra and Raymond Veldhuis and Luuk Spreeuwers",
year = "2017",
month = "9",
day = "28",
doi = "10.23919/BIOSIG.2017.8053509",
language = "English",
editor = "Arslan Br{\"o}mme and Christoph Busch and Antitza Dantcheva and Christian Rathgeb and Andreas Uhl",
booktitle = "2017 International Conference of the Biometrics Special Interest Group (BIOSIG)",
publisher = "Gesellschaft f{\"u}r Informatik",
address = "Germany",

}

Zeinstra, C, Veldhuis, R & Spreeuwers, L 2017, How Random Is a Classifier Given Its Area under Curve? in A Brömme, C Busch, A Dantcheva, C Rathgeb & A Uhl (eds), 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017, 8053509, Gesellschaft für Informatik, 16th International Conference of the Biometrics Special Interest Group 2017, Darmstadt, Germany, 20/09/17. https://doi.org/10.23919/BIOSIG.2017.8053509

How Random Is a Classifier Given Its Area under Curve? / Zeinstra, Chris; Veldhuis, Raymond; Spreeuwers, Luuk.

2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. ed. / Arslan Brömme; Christoph Busch; Antitza Dantcheva; Christian Rathgeb; Andreas Uhl. Gesellschaft für Informatik, 2017. 8053509.

TY - GEN

T1 - How Random Is a Classifier Given Its Area under Curve?

AU - Zeinstra, Chris

AU - Veldhuis, Raymond

AU - Spreeuwers, Luuk

PY - 2017/9/28

Y1 - 2017/9/28

AB - When the performance of a classifier is evaluated empirically, the Area Under Curve (AUC) is commonly used as a one-dimensional performance measure. In general, the focus is on good performance (AUC towards 1). In this paper, we study the other side of the performance spectrum (AUC towards 0.50), as we are interested in the extent to which a classifier is random given its AUC. We present the exact probability distribution of the AUC of a truly random classifier, given a finite number of distinct genuine and imposter scores. It quantifies the randomness of the measured AUC. The distribution involves the restricted partition function, a well-studied function in number theory. Although other work exists that considers confidence bounds on the AUC, the novelty is that we do not assume any underlying parametric or non-parametric model or specify an error rate. Also, in cases in which a limited number of scores is available, for example in forensic casework, the exact distribution can deviate from these models. For completeness, we also present an approximation using a normal distribution and confidence bounds on the AUC.

KW - Approximation

KW - AUC

KW - Exact Distribution

KW - Random Classifier

UR - http://www.scopus.com/inward/record.url?scp=85034619394&partnerID=8YFLogxK

U2 - 10.23919/BIOSIG.2017.8053509

DO - 10.23919/BIOSIG.2017.8053509

M3 - Conference contribution

BT - 2017 International Conference of the Biometrics Special Interest Group (BIOSIG)

A2 - Brömme, Arslan

A2 - Busch, Christoph

A2 - Dantcheva, Antitza

A2 - Rathgeb, Christian

A2 - Uhl, Andreas

PB - Gesellschaft für Informatik

ER -

Zeinstra C, Veldhuis R, Spreeuwers L. How Random Is a Classifier Given Its Area under Curve? In Brömme A, Busch C, Dantcheva A, Rathgeb C, Uhl A, editors, 2017 International Conference of the Biometrics Special Interest Group (BIOSIG): BIOSIG 2017. Gesellschaft für Informatik. 2017. 8053509 https://doi.org/10.23919/BIOSIG.2017.8053509