Abstract
Skip-Gram word embeddings, estimated from large text corpora, have been shown to improve many NLP tasks through their high-quality features. However, little is known about their robustness against parameter perturbations and about their efficiency in preserving word similarities under memory constraints. In this paper, we investigate three post-processing methods for word embeddings to study their robustness and memory efficiency. We employ a dimensionality-based, a parameter-based and a resolution-based method to obtain parameter-reduced embeddings and we provide a concept that connects the three approaches. We contrast these methods with the relative accuracy loss on six intrinsic evaluation tasks and compare them with regard to the memory efficiency of the reduced embeddings. The evaluation shows that low bit-resolution embeddings offer great potential for memory savings by alleviating the risk of accuracy loss. The results indicate that post-processed word embeddings could also enhance applications on resource-limited devices with valuable word features.
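The abstract does not spell out how the resolution-based reduction is carried out. As a rough illustration only, the sketch below assumes uniform quantization of each embedding value to 2^bits levels over the global value range; this scheme, the function names (`quantize_embeddings`, `dequantize`, `mean_cosine`) and the random stand-in data are all assumptions, not the authors' method or data.

```python
import numpy as np

def quantize_embeddings(emb, bits):
    """Uniformly quantize a float matrix to 2**bits levels (assumed scheme)."""
    if not 1 <= bits <= 8:
        raise ValueError("this sketch supports 1 to 8 bits")
    lo, hi = float(emb.min()), float(emb.max())
    levels = 2 ** bits - 1
    # Width of one quantization step; guard against a constant matrix.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((emb - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return codes.astype(np.float32) * scale + lo

def mean_cosine(a, b):
    """Mean row-wise cosine similarity between two matrices."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return float(np.mean(num / den))

# Random stand-in for real Skip-Gram vectors (hypothetical data).
rng = np.random.default_rng(0)
emb = rng.normal(size=(1000, 100)).astype(np.float32)

bits = 4
codes, lo, scale = quantize_embeddings(emb, bits)
approx = dequantize(codes, lo, scale)

# Ideal memory ratio if the codes were bit-packed, relative to float32.
memory_ratio = bits / 32
similarity = mean_cosine(emb, approx)
print(f"memory ratio: {memory_ratio:.3f}, mean cosine: {similarity:.3f}")
```

Note that the codes are held in `uint8` here for simplicity, so the printed memory ratio describes ideal bit-packed storage rather than what this sketch actually allocates. The mean cosine between original and reconstructed vectors is only a crude proxy for the relative accuracy loss on intrinsic tasks that the paper evaluates.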
Original language | English |
---|---|
Title of host publication | Advances in Information Retrieval |
Subtitle of host publication | 38th European Conference on IR Research, ECIR 2016, Padua, Italy, March 20–23, 2016. Proceedings |
Editors | Nicola Ferro, Fabio Crestani, Marie-Francine Moens, Josiane Mothe, Fabrizio Silvestri, Giorgio Maria Di Nunzio, Claudia Hauff, Gianmaria Silvello |
Publisher | Springer |
Pages | 200-211 |
Number of pages | 12 |
ISBN (Electronic) | 978-3-319-30671-1 |
ISBN (Print) | 978-3-319-30670-4 |
Publication status | Published - 2016 |
Externally published | Yes |
Event | 38th European Conference on Information Retrieval 2016, Padua, Italy. Duration: 20 Mar 2016 → 23 Mar 2016. Conference number: 38. http://ecir2016.dei.unipd.it/ |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 9626 |
Conference
Conference | 38th European Conference on Information Retrieval 2016 |
---|---|
Abbreviated title | ECIR 2016 |
Country/Territory | Italy |
City | Padua |
Period | 20/03/16 → 23/03/16 |
Internet address | http://ecir2016.dei.unipd.it/ |
Keywords
- Evaluation
- Memory efficiency
- Natural language processing
- Robustness
- Word embedding