TY - JOUR
T1 - Spatiotemporal information extraction from a historic expedition gazetteer
AU - Bekele, M.K.
AU - de By, R.A.
AU - Singh, G.
PY - 2016
Y1 - 2016
N2 - Historic expeditions are events that are flavored by exploratory, scientific, military or geographic characteristics. Such events are often documented in literature, journey notes or personal diaries. A typical historic expedition involves multiple site visits and their descriptions contain spatiotemporal and attributive contexts. Expeditions involve movements in space that can be represented by triplet features (location, time and description). However, such features are implicit and innate parts of textual documents. Extracting the geospatial information from these documents requires understanding the contextualized entities in the text. To this end, we developed a semi-automated framework that has multiple Information Retrieval and Natural Language Processing components to extract the spatiotemporal information from a two-volume historic expedition gazetteer. Our framework has three basic components, namely, the Text Preprocessor, the Gazetteer Processing Machine and the JAPE (Java Annotation Pattern Engine) Transducer. We used the Brazilian Ornithological Gazetteer as an experimental dataset and extracted the spatial and temporal entities from entries that refer to three expeditioners’ site visits (which took place between 1910 and 1926) and mapped the trajectory of each expedition using the extracted information. Finally, one of the mapped trajectories was manually compared with a historical reference map of that expedition to assess the reliability of our framework.
AB - Historic expeditions are events that are flavored by exploratory, scientific, military or geographic characteristics. Such events are often documented in literature, journey notes or personal diaries. A typical historic expedition involves multiple site visits and their descriptions contain spatiotemporal and attributive contexts. Expeditions involve movements in space that can be represented by triplet features (location, time and description). However, such features are implicit and innate parts of textual documents. Extracting the geospatial information from these documents requires understanding the contextualized entities in the text. To this end, we developed a semi-automated framework that has multiple Information Retrieval and Natural Language Processing components to extract the spatiotemporal information from a two-volume historic expedition gazetteer. Our framework has three basic components, namely, the Text Preprocessor, the Gazetteer Processing Machine and the JAPE (Java Annotation Pattern Engine) Transducer. We used the Brazilian Ornithological Gazetteer as an experimental dataset and extracted the spatial and temporal entities from entries that refer to three expeditioners’ site visits (which took place between 1910 and 1926) and mapped the trajectory of each expedition using the extracted information. Finally, one of the mapped trajectories was manually compared with a historical reference map of that expedition to assess the reliability of our framework.
KW - METIS-321632
KW - ITC-ISI-JOURNAL-ARTICLE
KW - ITC-GOLD
UR - http://ezproxy.utwente.nl:2048/login?url=https://webapps.itc.utwente.nl/library/2016/isi/deby_spa.pdf
U2 - 10.3390/ijgi5120221
DO - 10.3390/ijgi5120221
M3 - Article
VL - 5
JO - ISPRS international journal of geo-information
JF - ISPRS international journal of geo-information
SN - 2220-9964
IS - 12
M1 - 221
ER -