Improved population mapping for China using remotely sensed and points-of-interest data within a random forests model

Tingting Ye, Naizhuo Zhao, Xuchao Yang, Zutao Ouyang, Xiaoping Liu, Qian Chen, Kejia Hu, Wenze Yue, Jiaguo Qi, Zhansheng Li, P. Jia

Research output: Contribution to journal › Article › Academic › peer-review

11 Citations (Scopus)

Abstract

Remote sensing image products (e.g., brightness of nighttime lights and land cover/land use types) have been widely used to disaggregate census data to produce gridded population maps for large geographic areas. The advent of the geospatial big data revolution has created additional opportunities to map population distributions at fine resolutions with high accuracy. A considerable proportion of the geospatial data contains semantic information that indicates different categories of human activities occurring at exact geographic locations. Such information is often lacking in remote sensing data. In addition, the remarkable progress in machine learning provides toolkits for demographers to model complex nonlinear correlations between population and heterogeneous geographic covariates. In this study, a typical type of geospatial big data, points-of-interest (POIs), was combined with multi-source remote sensing data in a random forests model to disaggregate the 2010 county-level census population data to 100 × 100 m grids. Compared with the WorldPop population dataset, our population map showed higher accuracy. For population estimates in Beijing, Shanghai, Guangzhou, and Chongqing, the root mean square errors of this method and WorldPop were 27,829 and 34,193, respectively. The large under-allocation of the population in urban areas and over-allocation in rural areas in the WorldPop dataset was greatly reduced in this new population map. Apart from revealing the effectiveness of POIs in improving population mapping, this study demonstrates the potential of geospatial big data for mapping other socioeconomic parameters in the future.
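The disaggregation step described in the abstract can be illustrated with a minimal sketch. It assumes a common dasymetric scheme in which each county's census total is split across its grid cells in proportion to a per-cell weight; the abstract does not spell out the authors' exact procedure, so this is not their implementation. The weights here are hand-picked stand-ins for what a random forests model might predict from remote sensing and POI covariates, and every name and number below (`disaggregate`, `rmse`, the toy counties) is hypothetical.

```python
import math
from collections import defaultdict


def disaggregate(county_totals, cell_county, cell_weights):
    """Split each county's census total across its grid cells.

    Each cell receives a share proportional to its weight; a county whose
    weights sum to zero falls back to an even split across its cells.
    """
    weight_sum = defaultdict(float)
    cell_count = defaultdict(int)
    for county, weight in zip(cell_county, cell_weights):
        weight_sum[county] += weight
        cell_count[county] += 1

    estimates = []
    for county, weight in zip(cell_county, cell_weights):
        total = county_totals[county]
        if weight_sum[county] > 0:
            estimates.append(total * weight / weight_sum[county])
        else:
            estimates.append(total / cell_count[county])
    return estimates


def rmse(estimated, observed):
    """Root mean square error between paired estimates and observations."""
    squared = [(e - o) ** 2 for e, o in zip(estimated, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy inputs: two counties, five grid cells.
county_totals = {"A": 1000.0, "B": 500.0}
cell_county = ["A", "A", "A", "B", "B"]
# Stand-in for model-predicted weights (assumption, not data from the study).
cell_weights = [6.0, 3.0, 1.0, 0.0, 0.0]
cell_population = disaggregate(county_totals, cell_county, cell_weights)
```

Because the weights are normalised within each county, the cell estimates always sum back to the census total for that county, which is the property that makes this a disaggregation rather than a free prediction. The `rmse` helper shows the accuracy metric named in the abstract; the abstract does not state at which spatial unit the reported errors were computed.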

Original language: English
Pages (from-to): 936-946
Number of pages: 11
Journal: Science of the total environment
Volume: 658
DOI: 10.1016/j.scitotenv.2018.12.276
Publication status: Published - 2019

Fingerprint

Remote sensing
Population distribution
census
Land use
Mean square error
Learning systems
Luminance
Semantics
rural area
land cover
human activity
urban area
Big data
allocation

Keywords

  • ITC-ISI-JOURNAL-ARTICLE

Cite this

Ye, Tingting ; Zhao, Naizhuo ; Yang, Xuchao ; Ouyang, Zutao ; Liu, Xiaoping ; Chen, Qian ; Hu, Kejia ; Yue, Wenze ; Qi, Jiaguo ; Li, Zhansheng ; Jia, P. / Improved population mapping for China using remotely sensed and points-of-interest data within a random forests model. In: Science of the total environment. 2019 ; Vol. 658. pp. 936-946.
@article{68e79acf21e74c98b1cab888bb7bf3a9,
title = "Improved population mapping for China using remotely sensed and points-of-interest data within a random forests model",
abstract = "Remote sensing image products (e.g. brightness of nighttime lights and land cover/land use types) have been widely used to disaggregate census data to produce gridded population maps for large geographic areas. The advent of the geospatial big data revolution has created additional opportunities to map population distributions at fine resolutions with high accuracy. A considerable proportion of the geospatial data contains semantic information that indicates different categories of human activities occurring at exact geographic locations. Such information is often lacking in remote sensing data. In addition, the remarkable progress in machine learning provides toolkits for demographers to model complex nonlinear correlations between population and heterogeneous geographic covariates. In this study, a typical type of geospatial big data, points-of-interest (POIs), was combined with multi-source remote sensing data in a random forests model to disaggregate the 2010 county-level census population data to 100 × 100 m grids. Compared with the WorldPop population dataset, our population map showed higher accuracy. The root mean square error for population estimates in Beijing, Shanghai, Guangzhou, and Chongqing for this method and WorldPop were 27,829 and 34,193, respectively. The large under-allocation of the population in urban areas and over-allocation in rural areas in the WorldPop dataset was greatly reduced in this new population map. Apart from revealing the effectiveness of POIs in improving population mapping, this study promises the potential of geospatial big data for mapping other socioeconomic parameters in the future.",
keywords = "ITC-ISI-JOURNAL-ARTICLE",
author = "Tingting Ye and Naizhuo Zhao and Xuchao Yang and Zutao Ouyang and Xiaoping Liu and Qian Chen and Kejia Hu and Wenze Yue and Jiaguo Qi and Zhansheng Li and P. Jia",
year = "2019",
doi = "10.1016/j.scitotenv.2018.12.276",
language = "English",
volume = "658",
pages = "936--946",
journal = "Science of the total environment",
issn = "0048-9697",
publisher = "Elsevier"
}

TY - JOUR
T1 - Improved population mapping for China using remotely sensed and points-of-interest data within a random forests model
AU - Ye, Tingting
AU - Zhao, Naizhuo
AU - Yang, Xuchao
AU - Ouyang, Zutao
AU - Liu, Xiaoping
AU - Chen, Qian
AU - Hu, Kejia
AU - Yue, Wenze
AU - Qi, Jiaguo
AU - Li, Zhansheng
AU - Jia, P.
PY - 2019
Y1 - 2019
AB - Remote sensing image products (e.g. brightness of nighttime lights and land cover/land use types) have been widely used to disaggregate census data to produce gridded population maps for large geographic areas. The advent of the geospatial big data revolution has created additional opportunities to map population distributions at fine resolutions with high accuracy. A considerable proportion of the geospatial data contains semantic information that indicates different categories of human activities occurring at exact geographic locations. Such information is often lacking in remote sensing data. In addition, the remarkable progress in machine learning provides toolkits for demographers to model complex nonlinear correlations between population and heterogeneous geographic covariates. In this study, a typical type of geospatial big data, points-of-interest (POIs), was combined with multi-source remote sensing data in a random forests model to disaggregate the 2010 county-level census population data to 100 × 100 m grids. Compared with the WorldPop population dataset, our population map showed higher accuracy. The root mean square error for population estimates in Beijing, Shanghai, Guangzhou, and Chongqing for this method and WorldPop were 27,829 and 34,193, respectively. The large under-allocation of the population in urban areas and over-allocation in rural areas in the WorldPop dataset was greatly reduced in this new population map. Apart from revealing the effectiveness of POIs in improving population mapping, this study promises the potential of geospatial big data for mapping other socioeconomic parameters in the future.
KW - ITC-ISI-JOURNAL-ARTICLE
UR - https://ezproxy2.utwente.nl/login?url=https://doi.org/10.1016/j.scitotenv.2018.12.276
UR - https://ezproxy2.utwente.nl/login?url=https://library.itc.utwente.nl/login/2019/isi/jia_imp.pdf
U2 - 10.1016/j.scitotenv.2018.12.276
DO - 10.1016/j.scitotenv.2018.12.276
M3 - Article
VL - 658
SP - 936
EP - 946
JO - Science of the total environment
JF - Science of the total environment
SN - 0048-9697
ER -