Descriptor-Invariant Fusion Architectures for Automatic Subject Indexing

Martin Toepfer, Christin Seifert

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

6 Citations (Scopus)
217 Downloads (Pure)

Abstract

Documents indexed with controlled vocabularies enable users of libraries to discover relevant documents, even across language barriers. Due to the rapid growth of scientific publications, digital libraries require automatic methods that index documents accurately, especially with regard to explicit or implicit concept drift, that is, with respect to new descriptor terms and new types of documents, respectively. This paper first analyzes architectures of related approaches on automatic indexing. We show that their design determines individual strengths and weaknesses and justify research on their fusion. In particular, systems benefit from statistical associative components as well as from lexical components applying dictionary matching, ranking, and binary classification. The analysis emphasizes the importance of descriptor-invariant learning, that is, learning based on features which can be transferred between different descriptors. Theoretic and experimental results on economic titles and author keywords underline the relevance of the fusion methodology in terms of overall accuracy and adaptability to dynamic domains. Experiments show that fusion strategies combining a binary relevance approach and a thesaurus-based system outperform all other strategies on the tested data set. Our findings can help researchers and practitioners in digital libraries to choose appropriate methods for automatic indexing. © 2017 IEEE.
Original language: English
Title of host publication: 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
Place of Publication: Piscataway, NJ
Publisher: IEEE
ISBN (Electronic): 978-1-5386-3861-3
ISBN (Print): 978-1-5386-3862-0
Publication status: Published - 25 Jul 2017
Externally published: Yes
Event: Joint Conference on Digital Libraries, JCDL 2017 - University of Toronto, Toronto, Canada
Duration: 19 Jun 2017 to 23 Jun 2017
https://2017.jcdl.org/

Conference

Conference: Joint Conference on Digital Libraries, JCDL 2017
Abbreviated title: JCDL 2017
Country/Territory: Canada
City: Toronto
Period: 19/06/17 to 23/06/17

Keywords

  • Automatic subject indexing
  • Keyphrase indexing
  • Meta-learning
  • Multi-label classification
  • Short texts
  • Zero-shot learning
