Evaluating memory efficiency and robustness of word embeddings

Johannes Jurgovsky, Michael Granitzer, Christin Seifert

    Research output: Chapter in Book/Report/Conference proceeding; Chapter; Academic; peer-review

    4 Citations (Scopus)

    Abstract

    Skip-Gram word embeddings, estimated from large text corpora, have been shown to improve many NLP tasks through their high-quality features. However, little is known about their robustness against parameter perturbations and about their efficiency in preserving word similarities under memory constraints. In this paper, we investigate three post-processing methods for word embeddings to study their robustness and memory efficiency. We employ a dimensionality-based, a parameter-based and a resolution-based method to obtain parameter-reduced embeddings and we provide a concept that connects the three approaches. We contrast these methods with the relative accuracy loss on six intrinsic evaluation tasks and compare them with regard to the memory efficiency of the reduced embeddings. The evaluation shows that low bit-resolution embeddings offer great potential for memory savings by alleviating the risk of accuracy loss. The results indicate that post-processed word embeddings could also enhance applications on resource-limited devices with valuable word features.
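
To make the idea of a resolution-based reduction concrete, the following is a minimal sketch of uniform quantization to a fixed number of bits per vector component. The abstract does not specify the paper's actual procedure, so this scheme is an assumption, and the names `quantize`, `dequantize`, `cosine` and the clipping range `[lo, hi]` are hypothetical choices made for illustration only.

```python
import math


def quantize(vector, bits, lo=-1.0, hi=1.0):
    """Map each component to an integer code in [0, 2**bits - 1].

    Assumption: components are clipped to [lo, hi] and the range is
    divided into evenly spaced levels (uniform quantization).
    """
    levels = (1 << bits) - 1          # highest integer code
    step = (hi - lo) / levels         # distance between adjacent levels
    codes = []
    for x in vector:
        clipped = min(max(x, lo), hi)
        codes.append(round((clipped - lo) / step))
    return codes


def dequantize(codes, bits, lo=-1.0, hi=1.0):
    """Turn integer codes back into approximate float components."""
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + c * step for c in codes]


def cosine(a, b):
    """Cosine similarity, a common measure of word-vector similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors, not real embeddings.
    u = [0.12, -0.40, 0.33, 0.05]
    v = [0.10, -0.35, 0.30, 0.20]
    bits = 4
    u_q = dequantize(quantize(u, bits), bits)
    v_q = dequantize(quantize(v, bits), bits)
    print("cosine before:", cosine(u, v))
    print("cosine after: ", cosine(u_q, v_q))
```

In this sketch each component is stored as a `bits`-wide integer code instead of a 32-bit float, so the storage per component shrinks by a factor of 32 / `bits`, while the reconstruction error per component is at most half a quantization step for values inside the clipping range. Comparing the cosine similarity before and after quantization is one simple way to check how well word similarities survive the reduction.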
    Original language: English
    Title of host publication: Advances in Information Retrieval
    Subtitle of host publication: 38th European Conference on IR Research, ECIR 2016, Padua, Italy, March 20–23, 2016. Proceedings
    Editors: Nicola Ferro, Fabio Crestani, Marie-Francine Moens, Josiane Mothe, Fabrizio Silvestri, Giorgio Maria Di Nunzio, Claudia Hauff, Gianmaria Silvello
    Publisher: Springer
    Pages: 200-211
    Number of pages: 12
    ISBN (Electronic): 978-3-319-30671-1
    ISBN (Print): 978-3-319-30670-4
    DOIs: https://doi.org/10.1007/978-3-319-30671-1_15
    Publication status: Published - 2016
    Event: 38th European Conference on Information Retrieval 2016 - Padua, Italy
    Duration: 20 Mar 2016 – 23 Mar 2016
    Conference number: 38
    http://ecir2016.dei.unipd.it/

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 9626

    Conference

    Conference: 38th European Conference on Information Retrieval 2016
    Abbreviated title: ECIR 2016
    Country: Italy
    City: Padua
    Period: 20/03/16 – 23/03/16
    Internet address: http://ecir2016.dei.unipd.it/

    Keywords

    • Evaluation
    • Memory efficiency
    • Natural language processing
    • Robustness
    • Word embedding

    Cite this

    Jurgovsky, J., Granitzer, M., & Seifert, C. (2016). Evaluating memory efficiency and robustness of word embeddings. In N. Ferro, F. Crestani, M-F. Moens, J. Mothe, F. Silvestri, G. M. Di Nunzio, C. Hauff, ... G. Silvello (Eds.), Advances in Information Retrieval: 38th European Conference on IR Research, ECIR 2016, Padua, Italy, March 20–23, 2016. Proceedings (pp. 200-211). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9626). Springer. https://doi.org/10.1007/978-3-319-30671-1_15