Determining representative sample size for validation of continuous, large continental remote sensing data

M.L. Blatchford*, C.M. Mannaerts, Y. Zeng

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



The validation of global remote sensing data comprises multiple methods, including comparison to field measurements, cross-comparisons and verification of physical consistency. Physical consistency and cross-comparisons are typically assessed for all pixels of the entire product extent, which requires intensive computing. This paper proposes a statistically representative sampling approach to reduce the time and effort associated with validating remote sensing data with large data volumes. A progressive sampling approach, as typically applied in machine learning to train algorithms, was combined with two performance measures to estimate the required sample size. The confidence interval (CI) and maximum entropy probability distribution were used as indicators to represent accuracy. The approach was tested on eight continental remote sensing-based data products over the Middle East and Africa. Without the consideration of climate classes, a sample size of 10,000–100,000, depending on the product, met the nominally set CI and entropy indicators. This corresponds to <0.01% of the total image for the high-resolution images. All continuous datasets showed the same trend of CI and entropy with increasing sample size. The actual evapotranspiration and interception (ETIa) product was further analysed based on climate classes, which increased the sample size required to meet performance requirements; the required sample size was still significantly smaller than the entire dataset. The proposed approach can significantly reduce processing time while still providing a statistically valid representation of a large remote sensing dataset. This can be useful as more high-resolution remote sensing data become available.
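The progressive sampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the tolerances `ci_tol` and `entropy_tol`, the schedule of sample sizes and the number of histogram bins are all hypothetical. In particular, the Shannon entropy of a fixed-bin histogram is used here as a simple stand-in for the entropy indicator, and the CI is the normal-approximation interval of the sample mean; the paper may define both differently.

```python
import numpy as np


def ci_half_width(sample, z=1.96):
    """Half-width of the normal-approximation 95% CI of the sample mean."""
    return z * sample.std(ddof=1) / np.sqrt(sample.size)


def histogram_entropy(sample, bin_edges):
    """Shannon entropy (in nats) of the sample's histogram over fixed bins."""
    counts, _ = np.histogram(sample, bins=bin_edges)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())


def progressive_sample_size(pixels, sizes, ci_tol, entropy_tol,
                            n_bins=50, seed=0):
    """Return the first size in `sizes` whose random sample meets both
    (assumed) stopping rules, or None if no size does.

    Stopping rules used in this sketch (assumptions, not from the paper):
      * CI half-width of the mean <= ci_tol
      * entropy changes by <= entropy_tol relative to the previous size
    """
    rng = np.random.default_rng(seed)
    # Fixed bins over the full data range so entropies are comparable.
    bin_edges = np.linspace(pixels.min(), pixels.max(), n_bins + 1)
    prev_entropy = None
    for n in sizes:
        sample = rng.choice(pixels, size=n, replace=False)
        ci = ci_half_width(sample)
        entropy = histogram_entropy(sample, bin_edges)
        stable = (prev_entropy is not None
                  and abs(entropy - prev_entropy) <= entropy_tol)
        if ci <= ci_tol and stable:
            return n
        prev_entropy = entropy
    return None
```

In use, `pixels` would be the flattened array of valid pixel values of one product (or of one climate class), and `sizes` an increasing schedule such as 100, 1,000, 10,000 and 100,000. Drawing a fresh sample at each step keeps the sketch short; a nested schedule that extends the previous sample would be an equally reasonable choice.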
Original language: English
Article number: 102235
Pages (from-to): 1-11
Number of pages: 11
Journal: International Journal of Applied Earth Observation and Geoinformation (JAG)
Early online date: 21 Sep 2020
Publication status: Published - Feb 2021


  • progressive sampling
  • Big data
  • UT-Hybrid-D

