Clustering geo-data cubes

R. Zurita-Milla, Emma Izquierdo Verdiguier, Serkan Girgin, F. Nattino, Ou Ku, M.W. Grootes, R. Goncalves

Research output: Contribution to conference › Other › Academic

Abstract

Earth observation sensors deliver ever-expanding collections of geospatial data at multiple resolutions (spatial, temporal and thematic or spectral). Efficient tools to extract knowledge from these collections are currently missing. Here we present the first release of Clustering geo-Data Cubes (CDC), a Python package to cluster geospatial data cubes by explicitly considering their dimensionality. CDC has three main hallmarks: (1) it is based on innovative co- and tri-clustering methods that identify groups of pixels with similar spatio-temporal and/or thematic information by simultaneously considering all the dimensions of the data, which overcomes a major limitation of traditional clustering approaches that analyze each dimension separately; (2) it provides refined clusters by re-grouping the results obtained from co- and/or tri-clustering. These refined clusters better capture the patterns present in the data and offer a more automatic way to analyze geospatial data cubes, because the number of clusters is chosen automatically via an optimization procedure; and (3) it allows users to run tasks efficiently by using either NumPy’s threading capabilities or Dask’s parallel computing power. Hence, CDC is a scalable package that can analyze both small and big geospatial data cubes. These hallmarks are showcased through several case studies.
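To make the idea of clustering two dimensions at once more concrete, the following is a minimal sketch of a toy block co-clustering on a small location-by-time matrix. It does not use CDC and does not show its API; the squared-error criterion, the function names (`cocluster`, `split_by_mean`, `block_means`, `reassign`) and the sample values are all assumptions made for illustration, and the method in the package may differ. Plain Python is used only to keep the example self-contained.

```python
# Illustrative sketch only: NOT the CDC API and not necessarily its algorithm.
# Toy squared-error block co-clustering of a (location x time) matrix.

def split_by_mean(vectors, k):
    """Deterministic start: rank vectors by mean, cut the ranking into k chunks."""
    n = len(vectors)
    order = sorted(range(n), key=lambda i: sum(vectors[i]) / len(vectors[i]))
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * k // n
    return labels

def block_means(Z, rows, cols, k, l):
    """Mean of every (row cluster, column cluster) block; empty blocks get 0.0."""
    total = [[0.0] * l for _ in range(k)]
    count = [[0] * l for _ in range(k)]
    for i, row in enumerate(Z):
        for j, value in enumerate(row):
            total[rows[i]][cols[j]] += value
            count[rows[i]][cols[j]] += 1
    return [[total[a][b] / count[a][b] if count[a][b] else 0.0
             for b in range(l)] for a in range(k)]

def reassign(vectors, other_labels, mean_of, n_clusters):
    """Give each vector the cluster whose block means fit it best."""
    new_labels = []
    for vec in vectors:
        errors = [sum((v - mean_of(c, other_labels[j])) ** 2
                      for j, v in enumerate(vec))
                  for c in range(n_clusters)]
        new_labels.append(errors.index(min(errors)))
    return new_labels

def cocluster(Z, k, l, max_iter=20):
    """Alternate row and column reassignment until the labels stop changing."""
    columns = [list(col) for col in zip(*Z)]
    rows = split_by_mean(Z, k)
    cols = split_by_mean(columns, l)
    for _ in range(max_iter):
        mu = block_means(Z, rows, cols, k, l)
        new_rows = reassign(Z, cols, lambda a, b: mu[a][b], k)
        mu = block_means(Z, new_rows, cols, k, l)
        new_cols = reassign(columns, new_rows, lambda b, a: mu[a][b], l)
        if new_rows == rows and new_cols == cols:
            break
        rows, cols = new_rows, new_cols
    return rows, cols

# Hypothetical data: 4 locations (rows) observed at 4 time steps (columns).
Z = [
    [1.0, 1.2, 5.0, 5.1],
    [9.0, 9.1, 2.0, 2.2],
    [1.1, 0.9, 4.9, 5.0],
    [8.9, 9.0, 2.1, 1.9],
]
row_labels, col_labels = cocluster(Z, k=2, l=2)
```

Because each reassignment step uses the labels of the other dimension, rows and columns are grouped jointly rather than one dimension at a time, which is the contrast with traditional clustering drawn in the abstract.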
Original language: English
Pages: S1-S16
Number of pages: 16
Publication status: Published - 4 Sept 2020
Event: Space and Artificial Intelligence 2020 - Online
Duration: 4 Sept 2020 to 4 Sept 2020
http://spaceandai.ijs.si/program.html

Conference

Conference: Space and Artificial Intelligence 2020
Period: 4/09/20 to 4/09/20