TY - JOUR
T1 - Uncertainty analysis of crowd-sourced and professionally collected field data used in species distribution models of Taiwanese moths
AU - Lin, Yu-Pin
AU - Deng, Dengpo
AU - Lin, Wei-Chih
AU - Lemmens, Rob
AU - Crossman, Neville D.
AU - Henle, Klaus
AU - Schmeller, Dirk S.
PY - 2014
Y1 - 2014
N2 - The purposes of this study are to extract the names of species and places for a citizen-science monitoring program, to obtain crowd-sourced data of acceptable quality, and to assess the quality and the uncertainty of predictions based on crowd-sourced data and professional data. We used Natural Language Processing to extract names of species and places from text messages in a citizen science project. Bootstrap and Maximum Entropy methods were used to assess the uncertainty in the model predictions based on crowd-sourced data from the EnjoyMoths project in Taiwan. We compared uncertainty in the predictions obtained from the project and from the Global Biodiversity Information Facility (GBIF) field data for seven focal species of moth. The proximity to locations of easy access and the Ripley K method were used to test the level of spatial bias and randomness of the crowd-sourced data against GBIF data. Our results show that extracting information to identify the names of species and their locations from crowd-sourced data performed well. The results of the spatial bias and randomness tests revealed that the crowd-sourced data and GBIF data did not differ significantly in respect to both spatial bias and clustering. The prediction models developed using the crowd-sourced dataset were the most effective, followed by those that were developed using the combined dataset. Those that performed least well were based on the small sample size GBIF dataset. Our method demonstrates the potential for using data collected by citizen scientists and the extraction of information from vast social networks. Our analysis also shows the value of citizen science data to improve biodiversity information in combination with data collected by professionals.
AB - The purposes of this study are to extract the names of species and places for a citizen-science monitoring program, to obtain crowd-sourced data of acceptable quality, and to assess the quality and the uncertainty of predictions based on crowd-sourced data and professional data. We used Natural Language Processing to extract names of species and places from text messages in a citizen science project. Bootstrap and Maximum Entropy methods were used to assess the uncertainty in the model predictions based on crowd-sourced data from the EnjoyMoths project in Taiwan. We compared uncertainty in the predictions obtained from the project and from the Global Biodiversity Information Facility (GBIF) field data for seven focal species of moth. The proximity to locations of easy access and the Ripley K method were used to test the level of spatial bias and randomness of the crowd-sourced data against GBIF data. Our results show that extracting information to identify the names of species and their locations from crowd-sourced data performed well. The results of the spatial bias and randomness tests revealed that the crowd-sourced data and GBIF data did not differ significantly in respect to both spatial bias and clustering. The prediction models developed using the crowd-sourced dataset were the most effective, followed by those that were developed using the combined dataset. Those that performed least well were based on the small sample size GBIF dataset. Our method demonstrates the potential for using data collected by citizen scientists and the extraction of information from vast social networks. Our analysis also shows the value of citizen science data to improve biodiversity information in combination with data collected by professionals.
KW - ITC-ISI-JOURNAL-ARTICLE
KW - Social media
KW - Citizen science
KW - Volunteer survey
KW - Prediction of species distribution
KW - Uncertainty
KW - Natural language
KW - Large-scale monitoring program
UR - https://ezproxy2.utwente.nl/login?url=http://dx.doi.org/10.1016/j.biocon.2014.11.012
UR - https://ezproxy2.utwente.nl/login?url=https://webapps.itc.utwente.nl/library/2015/isi/lemmens_unc.pdf
U2 - 10.1016/j.biocon.2014.11.012
DO - 10.1016/j.biocon.2014.11.012
M3 - Article
SN - 0006-3207
VL - 181
SP - 102
EP - 110
JO - Biological Conservation
JF - Biological Conservation
ER -