Predictive mapping of urban air pollution using Apache Spark on a Hadoop cluster

Marjan Asgari, M. Farnaghi, Zeinab Ghaemi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

12 Citations (Scopus)

Abstract

Air pollution is one of the major environmental problems in industrial and populated cities. Predictive mapping of urban air pollution and sharing the generated maps with the public and city officials have positive impacts on society and the environment. This article presents a solution based on distributed processing concepts to generate a predictive map of air pollution for the next 24 hours. Apache Hadoop has been utilized as the underlying framework to form a cluster of processing machines. To improve processing speed and provide the required machine learning functionality, Apache Spark has been employed on the Hadoop cluster. The solution enables us to efficiently predict air quality classes at the monitoring stations of Tehran, the capital of Iran, for the next 24 hours. Using the inverse distance weighting (IDW) method, the predictive map of air quality classes is then generated for the whole city. The results showed that the proposed approach can achieve a reasonable speed in processing big spatial data along with horizontal scalability.
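The IDW step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the station coordinates, the numeric class indices, the power parameter of 2, and the function name `idw` are all assumptions chosen for illustration. Each grid cell receives a weighted average of the station values, with weights proportional to 1 / distance^power.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at point (x, y).

    stations: iterable of (sx, sy, value) tuples in planar coordinates.
    power: distance exponent (2.0 is a common default, assumed here).
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            # The query point coincides with a station: return its value.
            return float(value)
        w = 1.0 / d ** power
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("at least one station is required")
    return num / den


# Hypothetical stations: (x, y, predicted air quality class index).
stations = [(0.0, 0.0, 1), (10.0, 0.0, 3), (0.0, 10.0, 2)]

# Interpolate onto a coarse 3 x 3 grid covering the station extent.
grid = [[idw(stations, gx, gy) for gx in (0.0, 5.0, 10.0)]
        for gy in (0.0, 5.0, 10.0)]
```

Because the abstract describes air quality classes rather than continuous concentrations, a real pipeline would likely round or re-classify the interpolated values; how the paper handles this is not stated in the abstract.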

Original language: English
Title of host publication: 2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017
Publisher: Association for Computing Machinery (ACM)
Pages: 89-93
Number of pages: 5
ISBN (Electronic): 9781450353434
Publication status: Published - 17 Sep 2017
Externally published: Yes
Event: 2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017 - London, United Kingdom
Duration: 17 Sep 2017 to 19 Sep 2017

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2017 International Conference on Cloud and Big Data Computing, ICCBDC 2017
Country: United Kingdom
City: London
Period: 17/09/17 to 19/09/17

Keywords

  • Air pollution
  • Big spatial data
  • Distributed processing
  • Hadoop
  • Predictive mapping
  • Spark
  • ITC-CV
