TY - JOUR
T1 - Quick and robust feature selection
T2 - the strength of energy-efficient sparse training for autoencoders
AU - Atashgahi, Zahra
AU - Sokar, Ghada
AU - van der Lee, Tim
AU - Mocanu, Elena
AU - Mocanu, Decebal Constantin
AU - Veldhuis, Raymond
AU - Pechenizkiy, Mykola
N1 - Funding Information:
This research was partly funded by the NWO EDIC project.
Publisher Copyright:
© 2021, The Author(s).
PY - 2022/1
Y1 - 2022/1
N2 - Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a dataset, has been introduced as a solution to this problem. Most existing feature selection methods are computationally inefficient; inefficient algorithms lead to high energy consumption, which is undesirable for devices with limited computational and energy resources. In this paper, a novel and flexible method for unsupervised feature selection is proposed. This method, named QuickSelection (the code is available at https://github.com/zahraatashgahi/QuickSelection), introduces neuron strength in sparse neural networks as a criterion for measuring feature importance. This criterion, combined with sparsely connected denoising autoencoders trained with the sparse evolutionary training procedure, derives the importance of all input features simultaneously. We implement QuickSelection in a purely sparse manner, as opposed to the typical approach of using a binary mask over connections to simulate sparsity, which results in a considerable speedup and memory reduction. When tested on several benchmark datasets, including five low-dimensional and three high-dimensional datasets, the proposed method achieves the best trade-off between classification and clustering accuracy, running time, and maximum memory usage among widely used feature selection approaches. Moreover, our method requires the least energy among state-of-the-art autoencoder-based feature selection methods.
AB - Major complications arise from the recent increase in the amount of high-dimensional data, including high computational costs and memory requirements. Feature selection, which identifies the most relevant and informative attributes of a dataset, has been introduced as a solution to this problem. Most existing feature selection methods are computationally inefficient; inefficient algorithms lead to high energy consumption, which is undesirable for devices with limited computational and energy resources. In this paper, a novel and flexible method for unsupervised feature selection is proposed. This method, named QuickSelection (the code is available at https://github.com/zahraatashgahi/QuickSelection), introduces neuron strength in sparse neural networks as a criterion for measuring feature importance. This criterion, combined with sparsely connected denoising autoencoders trained with the sparse evolutionary training procedure, derives the importance of all input features simultaneously. We implement QuickSelection in a purely sparse manner, as opposed to the typical approach of using a binary mask over connections to simulate sparsity, which results in a considerable speedup and memory reduction. When tested on several benchmark datasets, including five low-dimensional and three high-dimensional datasets, the proposed method achieves the best trade-off between classification and clustering accuracy, running time, and maximum memory usage among widely used feature selection approaches. Moreover, our method requires the least energy among state-of-the-art autoencoder-based feature selection methods.
KW - Deep learning
KW - Feature selection
KW - Sparse autoencoders
KW - Sparse training
UR - http://www.scopus.com/inward/record.url?scp=85117918920&partnerID=8YFLogxK
U2 - 10.1007/s10994-021-06063-x
DO - 10.1007/s10994-021-06063-x
M3 - Article
AN - SCOPUS:85117918920
SN - 0885-6125
VL - 111
SP - 377
EP - 414
JO - Machine Learning
JF - Machine Learning
ER -