Automated semantic annotation of species names in handwritten texts

Lise Stork*, Andreas Weber, Jaap van den Herik, Aske Plaat, Fons Verbeek, Katherine Wolstencroft

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review



In this paper, scientific species names in images of handwritten species observations are automatically recognised and annotated with semantic concepts, so that they can be used for document retrieval and faceted search. Until now, automated semantic annotation of such named entities has only been applied to printed or digital text. We employ a two-step approach. First, word images are classified to identify the elements of a scientific species name (genus, species, and author) using (i) visual structural features, (ii) position, and (iii) context. Second, the identified species names are semantically annotated according to the NHC-Ontology, an ontology that describes species observations. Internationalised Resource Identifiers (IRIs) are assigned to the elements so that individual researchers can link and disambiguate them at a later stage. For the identification of scientific species names, we achieve an average F1 score of 0.86. Moreover, we discuss how our method will function in a semi-automated annotation process, with a fruitful dialogue between system and user as the main objective.
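The second step of the approach, turning classified word images into semantic annotations, can be illustrated with a minimal sketch. It is not the authors' implementation: the namespaces `ONTOLOGY` and `DATA`, the `WordImage` type, the `annotate` function, and the class names derived from the labels are all hypothetical, since the abstract does not give the actual NHC-Ontology IRIs. Only the `rdf:type` IRI is a real, standard identifier.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical namespaces: placeholders, not the real NHC-Ontology IRIs.
ONTOLOGY = "http://example.org/nhc-ontology#"
DATA = "http://example.org/annotation/"
# Standard RDF property that states the class of a resource.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Elements of a scientific species name, as listed in the abstract.
NAME_ELEMENTS = {"genus", "species", "author"}


@dataclass(frozen=True)
class WordImage:
    image_id: str  # identifier of the cropped word image (assumed format)
    label: str     # class predicted in step one


def annotate(words):
    """Return (subject, predicate, object) triples for name elements."""
    triples = []
    for word in words:
        if word.label not in NAME_ELEMENTS:
            continue  # skip words that are not part of a species name
        # Mint an identifier for the word image by percent-encoding its id.
        subject = DATA + quote(word.image_id, safe="")
        concept = ONTOLOGY + word.label.capitalize()
        triples.append((subject, RDF_TYPE, concept))
    return triples


words = [
    WordImage("page12/word 3", "genus"),
    WordImage("page12/word 4", "species"),
    WordImage("page12/word 5", "other"),
]
for triple in annotate(words):
    print(triple)
```

Because each element receives its own identifier, a researcher could later link it to an external taxonomic resource or merge it with another identifier without touching the classification step.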

Original language: English
Title of host publication: Advances in Information Retrieval
Subtitle of host publication: 41st European Conference on IR Research, ECIR 2019, Proceedings, Part I
Editors: Leif Azzopardi, Benno Stein, Norbert Fuhr, Philipp Mayr, Claudia Hauff, Djoerd Hiemstra
Place of Publication: Cham
Number of pages: 14
ISBN (Electronic): 978-3-030-15712-8
ISBN (Print): 978-3-030-15711-1
Publication status: Published - 7 Apr 2019
Event: 41st European Conference on Information Retrieval, ECIR 2019 - Cologne, Germany
Duration: 14 Apr 2019 to 18 Apr 2019
Conference number: 41

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11437 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 41st European Conference on Information Retrieval, ECIR 2019
Abbreviated title: ECIR 2019


Keywords

  • Deep learning
  • Historical biodiversity research
  • Ontologies
  • Scientific names
  • Semantic annotation
  • Taxonomy


