Semantic Label Representations with Lbl2Vec: A Similarity-Based Approach for Unsupervised Text Classification

Tim Schopf, Daniel Braun, Florian Matthes

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-review

6 Citations (Scopus)

Abstract

In this paper, we evaluate the Lbl2Vec approach for unsupervised text document classification. Lbl2Vec requires only a small number of keywords describing the respective classes to create semantic label representations. For classification, Lbl2Vec uses cosine similarities between label and document representations, but no annotation information. We show that Lbl2Vec significantly outperforms common unsupervised text classification approaches and a widely used zero-shot text classification approach. Furthermore, we show that using more precise keywords can significantly improve the classification results of similarity-based text classification approaches.
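The similarity-based assignment step mentioned in the abstract can be illustrated with a minimal sketch. It assumes label and document vectors already exist in a shared embedding space. The toy vectors, class names, and function names below are hypothetical and are not the Lbl2Vec library's API, and the sketch does not show how Lbl2Vec derives label representations from keywords.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def classify(doc_vec, label_vecs):
    """Return the label whose vector has the highest cosine similarity to doc_vec."""
    return max(label_vecs, key=lambda label: cosine_similarity(doc_vec, label_vecs[label]))

# Hypothetical toy embeddings; real vectors would come from a trained embedding model.
label_vecs = {
    "sports": [0.9, 0.1, 0.0],
    "politics": [0.1, 0.8, 0.3],
}
doc_vec = [0.7, 0.2, 0.1]

# No annotated training labels are used: the prediction depends only on similarity.
predicted = classify(doc_vec, label_vecs)
print(predicted)
```

In this toy setup the document vector points in nearly the same direction as the "sports" label vector, so that label is selected.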
Original language: English
Title of host publication: Web Information Systems and Technologies
Subtitle of host publication: 16th International Conference, WEBIST 2020, November 3–5, 2020, and 17th International Conference, WEBIST 2021, October 26–28, 2021, Virtual Events, Revised Selected Papers
Editors: Massimo Marchiori, Francisco José Domínguez Mayo, Joaquim Filipe
Place of Publication: Cham
Publisher: Springer
Pages: 59-73
Number of pages: 15
ISBN (Print): 978-3-031-24197-0
Publication status: Published, 18 Jan 2023

Publication series

Name: Lecture Notes in Business Information Processing

Keywords

  • NLA
