Explaining Topical Distances Using Word Embeddings

Nils Witt, Christin Seifert, Michael Granitzer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review


Abstract

Word and document embeddings have gained a lot of attention recently because they tend to work well in text mining tasks. Yet they elude human intuition. In this paper, we attempt to explain the arithmetic difference between two document embeddings by a series of word embeddings. We present an algorithm that iteratively picks words from a vocabulary that close the topical gap between the documents. Moreover, we present the Econstor16 corpus that was used for the experiments. Although not all of the words found are great matches, the algorithm is able to find sets of words that seem reasonable to a human who reads both documents. Remarkably, some of the well-explaining words are mentioned in neither document.
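
The abstract does not say how the words are selected, so the following is only a minimal sketch of one plausible reading, not the authors' procedure. It assumes a greedy rule: at each step, pick the unused word whose vector has the highest cosine similarity to the remaining difference between the two document embeddings, then subtract that word vector from the difference. The function name `explain_gap`, its parameters, and the toy vocabulary are all hypothetical.

```python
import numpy as np


def explain_gap(doc_a, doc_b, word_vectors, vocab, n_words=3):
    """Greedily pick words whose vectors, summed, approximate doc_b - doc_a.

    Assumption: cosine similarity to the remaining gap is the selection
    criterion. The paper may use a different rule.
    """
    residual = np.asarray(doc_b, dtype=float) - np.asarray(doc_a, dtype=float)
    vectors = np.asarray(word_vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1)
    available = np.ones(len(vocab), dtype=bool)
    chosen = []
    for _ in range(n_words):
        res_norm = np.linalg.norm(residual)
        if res_norm < 1e-12 or not available.any():
            break  # gap closed, or vocabulary exhausted
        # Cosine similarity between every word vector and the remaining gap.
        cos = vectors @ residual / (norms * res_norm + 1e-12)
        cos[~available] = -np.inf  # never pick the same word twice
        best = int(np.argmax(cos))
        if cos[best] <= 0:
            break  # no remaining word points towards the gap
        chosen.append(vocab[best])
        available[best] = False
        residual = residual - vectors[best]
    return chosen, residual


# Toy data (hypothetical): three orthogonal "word embeddings".
vocab = ["inflation", "trade", "labour"]
word_vectors = np.eye(3)
doc_a = np.zeros(3)
doc_b = np.array([1.0, 1.0, 0.0])

words, remaining = explain_gap(doc_a, doc_b, word_vectors, vocab)
print(words, np.linalg.norm(remaining))
```

In this toy setup, the two selected words sum exactly to the difference vector, so the loop stops once the remaining gap has zero norm. With real embeddings, the residual rarely vanishes, and a stopping criterion such as a fixed word budget (`n_words` here) would be needed.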
Original language: English
Title of host publication: 2016 27th International Workshop on Database and Expert Systems Applications (DEXA)
Place of Publication: Piscataway, NJ
Publisher: IEEE
Pages: 212-217
ISBN (Electronic): 978-1-5090-3635-6
ISBN (Print): 978-1-5090-3636-3
Publication status: Published - 1 Sept 2016
Externally published: Yes
Event: 27th International Conference on Database and Expert Systems Applications, DEXA 2016 - Instituto Superior de Engenharia do Porto, Porto, Portugal
Duration: 5 Sept 2016 to 8 Sept 2016
Conference number: 27
http://www.dexa.org/previous/dexa2016/dexa2016.html

Conference

Conference: 27th International Conference on Database and Expert Systems Applications, DEXA 2016
Abbreviated title: DEXA
Country/Territory: Portugal
City: Porto
Period: 5/09/16 to 8/09/16

Keywords

  • n/a OA procedure
