Evaluating memory efficiency and robustness of word embeddings

Johannes Jurgovsky, Michael Granitzer, Christin Seifert

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Academic › peer-review

7 Citations (Scopus)

Abstract

Skip-Gram word embeddings, estimated from large text corpora, have been shown to improve many NLP tasks through their high-quality features. However, little is known about their robustness against parameter perturbations and about their efficiency in preserving word similarities under memory constraints. In this paper, we investigate three post-processing methods for word embeddings to study their robustness and memory efficiency. We employ a dimensionality-based, a parameter-based and a resolution-based method to obtain parameter-reduced embeddings and we provide a concept that connects the three approaches. We contrast these methods with the relative accuracy loss on six intrinsic evaluation tasks and compare them with regard to the memory efficiency of the reduced embeddings. The evaluation shows that low bit-resolution embeddings offer great potential for memory savings by alleviating the risk of accuracy loss. The results indicate that post-processed word embeddings could also enhance applications on resource-limited devices with valuable word features.
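
To make the resolution-based idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes uniform quantization over a single global value range and uses random vectors as a stand-in for real Skip-Gram embeddings; the function names `quantize`, `cosine` and `mean_similarity_error` are hypothetical. It reports how much pairwise cosine similarities change as the number of bits per value shrinks.

```python
import math
import random


def quantize(vectors, bits):
    """Uniformly quantize every value to one of 2**bits levels.

    Assumption: one global [lo, hi] range with evenly spaced levels;
    the paper's actual resolution-based method may differ.
    """
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    levels = 2 ** bits - 1            # number of steps between lo and hi
    step = (hi - lo) / levels
    # Map each value to its nearest level, then back to a float.
    return [[lo + round((x - lo) / step) * step for x in v] for v in vectors]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mean_similarity_error(original, reduced):
    """Mean absolute change in cosine similarity over all vector pairs."""
    errors = []
    for i in range(len(original)):
        for j in range(i + 1, len(original)):
            errors.append(abs(cosine(original[i], original[j])
                              - cosine(reduced[i], reduced[j])))
    return sum(errors) / len(errors)


# Toy stand-in for real embeddings: 20 random 50-dimensional vectors.
rng = random.Random(0)
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(20)]

for bits in (8, 4, 2, 1):
    reduced = quantize(embeddings, bits)
    print(bits, "bits -> mean similarity error",
          round(mean_similarity_error(embeddings, reduced), 4))
```

The sketch returns dequantized floats for easy comparison; an actual memory saving would come from storing only the integer level index of each value, using `bits` bits per value, together with `lo` and `step`.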
Original language: English
Title of host publication: Advances in Information Retrieval
Subtitle of host publication: 38th European Conference on IR Research, ECIR 2016, Padua, Italy, March 20–23, 2016. Proceedings
Editors: Nicola Ferro, Fabio Crestani, Marie-Francine Moens, Josiane Mothe, Fabrizio Silvestri, Giorgio Maria Di Nunzio, Claudia Hauff, Gianmaria Silvello
Publisher: Springer
Pages: 200-211
Number of pages: 12
ISBN (Electronic): 978-3-319-30671-1
ISBN (Print): 978-3-319-30670-4
Publication status: Published - 2016
Externally published: Yes
Event: 38th European Conference on Information Retrieval 2016 - Padua, Italy
Duration: 20 Mar 2016 to 23 Mar 2016
Conference number: 38
http://ecir2016.dei.unipd.it/

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9626

Conference

Conference: 38th European Conference on Information Retrieval 2016
Abbreviated title: ECIR 2016
Country/Territory: Italy
City: Padua
Period: 20/03/16 to 23/03/16

Keywords

  • Evaluation
  • Memory efficiency
  • Natural language processing
  • Robustness
  • Word embedding
