Abstract
In this paper, scientific species names in images of handwritten species observations are automatically recognised and annotated with semantic concepts, so that they can be used for document retrieval and faceted search. Until now, automated semantic annotation of such named entities has only been applied to printed or digital text. We employ a two-step approach. First, word images are classified to identify the elements of scientific species names (genus, species, author) using (i) visual structural features, (ii) position, and (iii) context. Second, the identified species names are semantically annotated according to the NHC-Ontology, an ontology that describes species observations. Internationalised Resource Identifiers (IRIs) are assigned to the elements so that they can be linked and disambiguated at a later stage by individual researchers. For the identification of scientific species names, we achieve an average F1 score of 0.86. Moreover, we discuss how our method will function in a semi-automated annotation process, with a fruitful dialogue between system and user as the main objective.
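As a rough illustration of the second step only, the sketch below assigns an IRI to each classified word image and emits one type triple per element. It is not the authors' implementation: the namespaces `NHC` and `BASE`, the class names derived from the labels, and the `WordImage` and `annotate` names are all hypothetical placeholders, and the step-one classifier is assumed to have already produced the labels.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Hypothetical namespaces: placeholders, not the real NHC-Ontology IRIs.
NHC = "http://example.org/nhc#"
BASE = "http://example.org/annotation/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class WordImage:
    image_id: str  # identifier of the cropped word image
    label: str     # class assumed to come from step one: "genus", "species" or "author"


def annotate(words):
    """Return one (subject IRI, rdf:type, class IRI) triple per classified word image."""
    triples = []
    for word in words:
        subject = BASE + quote(word.image_id)          # IRI minted for this element
        concept = NHC + word.label.capitalize()        # hypothetical ontology class
        triples.append((subject, RDF_TYPE, concept))
    return triples


if __name__ == "__main__":
    name = [
        WordImage("page12_word3", "genus"),
        WordImage("page12_word4", "species"),
        WordImage("page12_word5", "author"),
    ]
    for triple in annotate(name):
        print(triple)
```

Because every element carries its own IRI, a researcher could later link it to an external taxonomic resource or correct its class without touching the other elements of the same name.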
Original language | English |
---|---|
Title of host publication | Advances in Information Retrieval |
Subtitle of host publication | 41st European Conference on IR Research, ECIR 2019, Proceedings, Part I |
Editors | Leif Azzopardi, Benno Stein, Norbert Fuhr, Philipp Mayr, Claudia Hauff, Djoerd Hiemstra |
Place of Publication | Cham |
Publisher | Springer |
Pages | 667-680 |
Number of pages | 14 |
ISBN (Electronic) | 978-3-030-15712-8 |
ISBN (Print) | 978-3-030-15711-1 |
DOIs | |
Publication status | Published - 7 Apr 2019 |
Event | 41st European Conference on Information Retrieval, ECIR 2019, Cologne, Germany. Duration: 14 Apr 2019 → 18 Apr 2019. Conference number: 41. http://ecir2019.org/ |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11437 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 41st European Conference on Information Retrieval, ECIR 2019 |
---|---|
Abbreviated title | ECIR 2019 |
Country/Territory | Germany |
City | Cologne |
Period | 14/04/19 → 18/04/19 |
Internet address | http://ecir2019.org/ |
Keywords
- Deep learning
- Historical biodiversity research
- Ontologies
- Scientific names
- Semantic annotation
- Taxonomy