Comparison of topic extraction approaches and their results

Theresa Velden*, Kevin W. Boyack, Jochen Gläser, Rob Koopman, Andrea Scharnhorst, Shenghui Wang

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

27 Citations (Scopus)

Abstract

This is the last paper in the Synthesis section of this special issue on ‘Same Data, Different Results’. We first provide a framework of how to describe and distinguish approaches to topic extraction from bibliographic data of scientific publications. We then compare solutions delivered by the different topic extraction approaches in this special issue, and explore where they agree and differ. This is achieved without reference to a ground truth, since we have to assume the existence of multiple, equally important, valid perspectives and want to avoid bias through the adoption of an ad-hoc yardstick. Instead, we apply different ways to quantitatively and visually compare solutions to explore their commonalities and differences and develop hypotheses about the origin of these differences. We conclude with a discussion of future work needed to develop methods for comparison and validation of topic extraction results, and express our concern about the lack of access to non-proprietary benchmark data sets to support method development in the field of scientometrics.

Original language: English
Pages (from-to): 1169-1221
Number of pages: 53
Journal: Scientometrics
Volume: 111
Issue number: 2
DOIs: https://doi.org/10.1007/s11192-017-2306-1
Publication status: Published - 1 May 2017
Externally published: Yes

Keywords

  • Astrophysics
  • Clustering
  • Comparative methods
  • Data modeling
  • Science mapping
  • Topic extraction
  • Topic labeling

Cite this

Velden, T., Boyack, K. W., Gläser, J., Koopman, R., Scharnhorst, A., & Wang, S. (2017). Comparison of topic extraction approaches and their results. Scientometrics, 111(2), 1169-1221. https://doi.org/10.1007/s11192-017-2306-1