Descriptor-Invariant Fusion Architectures for Automatic Subject Indexing

Martin Toepfer, Christin Seifert

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

    6 Citations (Scopus)
    72 Downloads (Pure)

    Abstract

    Documents indexed with controlled vocabularies enable users of libraries to discover relevant documents, even across language barriers. Due to the rapid growth of scientific publications, digital libraries require automatic methods that index documents accurately, especially with regard to explicit or implicit concept drift, that is, with respect to new descriptor terms and new types of documents, respectively. This paper first analyzes architectures of related approaches to automatic indexing. We show that their design determines individual strengths and weaknesses and justify research on their fusion. In particular, systems benefit from statistical associative components as well as from lexical components applying dictionary matching, ranking, and binary classification. The analysis emphasizes the importance of descriptor-invariant learning, that is, learning based on features which can be transferred between different descriptors. Theoretical and experimental results on economic titles and author keywords underline the relevance of the fusion methodology in terms of overall accuracy and adaptability to dynamic domains. Experiments show that fusion strategies combining a binary relevance approach and a thesaurus-based system outperform all other strategies on the tested data set. Our findings can help researchers and practitioners in digital libraries to choose appropriate methods for automatic indexing. © 2017 IEEE.
    Original language: English
    Title of host publication: 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL)
    Publisher: IEEE
    ISBN (Print): 9781538638613
    Publication status: Published - 25 Jul 2017
    Event: Joint Conference on Digital Libraries, JCDL 2017 - University of Toronto, Toronto, Canada
    Duration: 19 Jun 2017 to 23 Jun 2017
    https://2017.jcdl.org/

    Conference

    Conference: Joint Conference on Digital Libraries, JCDL 2017
    Abbreviated title: JCDL 2017
    Country: Canada
    City: Toronto
    Period: 19/06/17 to 23/06/17

    Keywords

    • automatic subject indexing
    • keyphrase indexing
    • meta-learning
    • multi-label classification
    • short text
    • zero-shot learning
