Abstract
Capturing semantics in a computable way is desirable for many applications, such as information retrieval and document clustering or classification. Embedding words or documents in a vector space is a common first step. Each type of embedding technique has its own characteristics, which makes it difficult to choose one for a given application. In this paper, we compare a few off-the-shelf word and document embedding methods with our own Ariadne approach in different evaluation tests. We argue that one needs to take into account the specific requirements of the application to decide which embedding method is most suitable. In addition, to achieve better retrieval performance, it is worth investigating how bibliometric measures can be combined with semantic embeddings to improve ranking.
Original language | English |
---|---|
Title of host publication | BIR 2017 |
Subtitle of host publication | 5th Workshop on Bibliometric-enhanced Information Retrieval 2017 |
Editors | Philipp Mayr, Ingo Frommholz, Guillaume Cabanac |
Publisher | CEUR |
Pages | 122-132 |
Number of pages | 11 |
Volume | 1823 |
Publication status | Published - 2017 |
Externally published | Yes |
Event | 5th Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2017 - Aberdeen, United Kingdom; Duration: 9 Apr 2017 → 9 Apr 2017; Conference number: 5 |
Publication series
Name | CEUR workshop proceedings |
---|---|
Publisher | Rheinisch Westfälische Technische Hochschule |
ISSN (Print) | 1613-0073 |
Conference
Conference | 5th Workshop on Bibliometric-Enhanced Information Retrieval, BIR 2017 |
---|---|
Abbreviated title | BIR |
Country/Territory | United Kingdom |
City | Aberdeen |
Period | 9/04/17 → 9/04/17 |