In this paper, we evaluate the Lbl2Vec approach for unsupervised text document classification. Lbl2Vec requires only a small number of keywords describing each class to create semantic label representations. To classify documents, Lbl2Vec uses cosine similarities between label and document representations and requires no annotation information. We show that Lbl2Vec significantly outperforms common unsupervised text classification approaches and a widely used zero-shot text classification approach. Furthermore, we show that using more precise keywords can significantly improve the classification results of similarity-based text classification approaches.
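The similarity-based classification step described in the abstract can be illustrated with a minimal sketch. It is not the Lbl2Vec implementation: the learned label and document embeddings are replaced here by averages of hypothetical toy word vectors, and the names `WORD_VECTORS`, `LABEL_KEYWORDS`, `mean_vector`, and `classify` are assumptions made purely for illustration. Only the final step, assigning each document to the label with the highest cosine similarity, mirrors what the abstract states.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A real system would learn these from the unlabeled corpus.
WORD_VECTORS = {
    "goal": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "team": [0.7, 0.3, 0.0],
    "election": [0.1, 0.9, 0.1],
    "vote": [0.0, 0.8, 0.2],
    "senate": [0.1, 0.7, 0.3],
}

# A few descriptive keywords per class; no annotated documents are used.
LABEL_KEYWORDS = {
    "sports": ["goal", "match"],
    "politics": ["election", "vote"],
}


def mean_vector(words):
    """Average the vectors of all known words (a stand-in for a learned embedding)."""
    vectors = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("none of the words has a known vector")
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(doc_tokens, label_keywords):
    """Return the most similar label and the similarity score for every label."""
    doc_vec = mean_vector(doc_tokens)
    scores = {
        label: cosine(doc_vec, mean_vector(keywords))
        for label, keywords in label_keywords.items()
    }
    return max(scores, key=scores.get), scores


label, scores = classify(["team", "match", "goal"], LABEL_KEYWORDS)
print(label, scores)
```

Because the prediction depends only on how close a document lies to each keyword-derived label vector, the choice of keywords directly shapes the result, which is consistent with the abstract's observation that more precise keywords can improve similarity-based classification.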
|Title of host publication|Web Information Systems and Technologies|
|Subtitle of host publication|16th International Conference, WEBIST 2020, November 3–5, 2020, and 17th International Conference, WEBIST 2021, October 26–28, 2021, Virtual Events, Revised Selected Papers|
|Editors|Massimo Marchiori, Francisco José Domínguez Mayo, Joaquim Filipe|
|Place of publication|Cham|
|Number of pages|15|
|Publication status|Published - 18 Jan 2023|
|Series name|Lecture Notes in Business Information Processing|