TY - JOUR
T1 - Evaluating real-life performance of the state-of-the-art in facial expression recognition using a novel YouTube-based datasets
AU - Siddiqi, Muhammad Hameed
AU - Ali, Maqbool
AU - Abdelrahman Eldib, Mohamed Elsayed
AU - Khan, Asfandyar
AU - Banos, Oresti
AU - Khan, Adil Mehmood
AU - Lee, Sungyoung
AU - Choo, Hyunseung
PY - 2018/1/1
Y1 - 2018/1/1
N2 - Facial expression recognition (FER) is one of the most active areas of research in computer science, due to its importance in a large number of application domains. Over the years, a great number of FER systems have been implemented, each surpassing earlier ones in terms of classification accuracy. However, one major weakness of previous studies is that they have all used standard datasets for their evaluations and comparisons. Though this serves the need for a fair comparison with existing systems, it is argued that it does not go hand in hand with the fact that these systems are built in the hope of eventually being used in the real world. This is because these datasets assume a predefined camera setup, consist mostly of posed expressions collected in a controlled setting with a fixed background and static ambient settings, and have low variation in face size and camera angle, which is not the case in a dynamic real world. The contributions of this work are twofold. First, using numerous online resources and our own setup, we have collected a rich FER dataset with the above-mentioned problems in mind. Second, we have chosen eleven state-of-the-art FER systems, implemented them, and performed a rigorous evaluation of these systems using our dataset. The results confirm our hypothesis that even the most accurate existing FER systems are not ready to face the challenges of a dynamic real world. We hope that our dataset will become a benchmark for assessing the real-life performance of future FER systems.
KW - Classification
KW - Facial expressions
KW - Real-life scenarios
KW - YouTube
UR - http://www.scopus.com/inward/record.url?scp=85008511962&partnerID=8YFLogxK
U2 - 10.1007/s11042-016-4321-2
DO - 10.1007/s11042-016-4321-2
M3 - Article
AN - SCOPUS:85008511962
SN - 1380-7501
VL - 77
SP - 917
EP - 937
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 1
ER -