Abstract
Original language  English 

Awarding Institution  University of Twente

Supervisors/Advisors 

Award date  24 Feb 2017 
Place of Publication  Enschede 
Publisher  University of Twente
Print ISBNs  9789036542876 
DOIs  10.3990/1.9789036542876
Publication status  Published  24 Feb 2017 
Keywords
 IR103361
 METIS321417
Cite this
Dueling bandits for online ranker evaluation. / Zoghi, Masrour.
Enschede : University of Twente, 2017. 104 p.
Research output: Thesis › PhD Thesis  Research external, graduation UT › Academic
TY  THES
T1  Dueling bandits for online ranker evaluation
AU  Zoghi, Masrour
PY  2017/2/24
Y1  2017/2/24
N2  In every domain where a service or product is provided, an important question is that of evaluation: given a set of possible choices for deployment, which is the best one? An important example, considered in this work, is ranker evaluation in the field of information retrieval (IR). The goal of IR is to satisfy the information need that a user expresses through a query, where this need is typically satisfied by a document (or a small set of documents) contained in what is often a much larger collection. This goal is often attained by ranking the documents according to their usefulness for the issued query using an algorithm called a ranker: a procedure that takes as input a query and a set of documents and specifies how the documents should be ordered. This thesis is concerned with ranker evaluation, whose goal is to determine the quality of the rankers under consideration so that the best option can be chosen: given a finite set of possible rankers, which one leads to the highest level of user satisfaction? There are two main methods for carrying this out: absolute metrics and relative comparisons. This thesis focuses on the second, relative form of ranker evaluation because it is more efficient at distinguishing between rankers of different quality: for instance, interleaved comparisons take a fraction of the time required by A/B testing while producing the same outcome. More precisely, the problem of online ranker evaluation from relative feedback can be described as follows: given a finite set of rankers, choose the best one using only pairwise comparisons between the rankers under consideration, while minimizing the number of comparisons that involve suboptimal rankers. This problem is an instance of what the literature calls the dueling bandit problem. 
The main contribution of this thesis is a dueling bandit algorithm, called Copeland Confidence Bounds (CCB), that solves this problem under practically general assumptions, together with theoretical guarantees for its proper functioning. In addition, the thesis contains a number of other algorithms that are better suited to dueling bandit problems with particular properties.
KW  IR103361
KW  METIS321417
U2  10.3990/1.9789036542876
DO  10.3990/1.9789036542876
M3  PhD Thesis  Research external, graduation UT
SN  9789036542876
PB  University of Twente
CY  Enschede
ER 
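
The problem setting described in the abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical preference matrix PREF, where PREF[i][j] is the probability that ranker i wins a pairwise comparison against ranker j, and it computes the Copeland winner, that is, the ranker that beats the largest number of other rankers. This is not the CCB algorithm itself; the function names and the numbers are illustrative assumptions, not taken from the thesis.

```python
import random


def copeland_scores(pref):
    """For each ranker, count how many other rankers it beats
    with probability strictly greater than 0.5."""
    n = len(pref)
    return [
        sum(1 for j in range(n) if j != i and pref[i][j] > 0.5)
        for i in range(n)
    ]


def copeland_winners(pref):
    """Return the indices of all rankers with the maximal Copeland score."""
    scores = copeland_scores(pref)
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]


def duel(pref, i, j, rng):
    """Simulate one noisy pairwise comparison (e.g. one interleaved
    comparison on a single query); returns True if ranker i wins."""
    return rng.random() < pref[i][j]


# Hypothetical preference matrix for three rankers (assumed values).
PREF = [
    [0.50, 0.60, 0.70],
    [0.40, 0.50, 0.55],
    [0.30, 0.45, 0.50],
]

if __name__ == "__main__":
    rng = random.Random(0)
    print("Copeland scores:", copeland_scores(PREF))
    print("Copeland winners:", copeland_winners(PREF))
    print("One simulated duel, ranker 0 vs ranker 2:", duel(PREF, 0, 2, rng))
```

In the online setting the matrix is unknown and an algorithm only observes the outcomes of calls such as duel, which is why the number of comparisons involving suboptimal rankers matters.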