TY - JOUR
T1 - Entropy-based discretization methods for ranking data
AU - De Sá, Cláudio Rebelo
AU - Soares, Carlos
AU - Knobbe, Arno
PY - 2016/2/1
Y1 - 2016/2/1
N2 - Label Ranking (LR) problems are becoming increasingly important in Machine Learning. While there has been a significant amount of work on the development of learning algorithms for LR in recent years, there are not many pre-processing methods for LR. Some methods, like Naive Bayes for LR and APRIORI-LR, cannot handle real-valued data directly. Conventional discretization methods used in classification are not suitable for LR problems, due to the different target variable. In this work, we make an extensive analysis of the existing methods using simple approaches. We also propose a new method called EDiRa (Entropy-based Discretization for Ranking) for the discretization of ranking data. We illustrate the advantages of the method using synthetic data and also on several benchmark datasets. The results clearly indicate that the discretization is performing as expected and also improves the results and efficiency of the learning algorithms.
AB - Label Ranking (LR) problems are becoming increasingly important in Machine Learning. While there has been a significant amount of work on the development of learning algorithms for LR in recent years, there are not many pre-processing methods for LR. Some methods, like Naive Bayes for LR and APRIORI-LR, cannot handle real-valued data directly. Conventional discretization methods used in classification are not suitable for LR problems, due to the different target variable. In this work, we make an extensive analysis of the existing methods using simple approaches. We also propose a new method called EDiRa (Entropy-based Discretization for Ranking) for the discretization of ranking data. We illustrate the advantages of the method using synthetic data and also on several benchmark datasets. The results clearly indicate that the discretization is performing as expected and also improves the results and efficiency of the learning algorithms.
KW - Association Rule Mining
KW - Discretization
KW - Label ranking
KW - Minimum description length
UR - http://www.scopus.com/inward/record.url?scp=84949731124&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2015.04.022
DO - 10.1016/j.ins.2015.04.022
M3 - Article
AN - SCOPUS:84949731124
VL - 329
SP - 921
EP - 936
JO - Information Sciences
JF - Information Sciences
SN - 0020-0255
ER -