Evapotranspiration (ET) calculated as the residual of the catchment water balance (ET_WB) has often been used as a benchmark to evaluate satellite-based ET retrievals that rely on the energy-balance approach (ET_EB). However, errors in the individual water balance components accumulate in ET_WB and can lead to considerable disparities with ET_EB. In this study, we investigated whether ET_EB from multiple sources (MOD16, GLEAM, PT-JPL, and PT-hybrid) can capture the spatiotemporal variability of ET_WB across 53 humid-climate catchments in central-western Europe. Using two references, namely ET estimated with the Budyko framework, which accounts for the control of energy demand on water supply, and upscaled ET from FLUXCOM, we explored the causes of discrepancies between ET_WB and ET_EB at long-term, annual, and monthly scales. We found that (1) ET_EB diverged significantly from ET_WB at the mean annual scale (r = 0.35), particularly for energy-limited catchments, whereas Budyko-simulated ET, which respects the energy limit, correlated well with ET_EB (r > 0.86); (2) neither ET_EB nor upscaled ET could reproduce the annual ET_WB time series (r < 0.40), and the closure errors in the water budgets closely followed the excess of precipitation over energy demand; and (3) monthly ET_WB corresponded better with ET_EB (r = 0.73), presumably because both share similar seasonal patterns. Our results demonstrate that errors in precipitation and terrestrial water storage anomalies introduce large uncertainties in ET_WB, thereby complicating water balance validation in humid regions across multiple timescales. To improve the use of ET_WB for benchmarking ET_EB in humid regions, either high-quality input data should be used or energy constraints should be considered, as in the Budyko framework.
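
As a rough illustration of the two reference quantities discussed above, the sketch below computes water-balance ET as the residual of precipitation, runoff, and storage change, and an energy-constrained ET from a Budyko-type curve. It is a minimal sketch under stated assumptions, not the study's implementation: the function names and the input values are hypothetical, and the original Budyko (1974) curve is assumed here because the abstract does not say which Budyko formulation was used.

```python
import math


def et_water_balance(precip, runoff, storage_change):
    """Water-balance ET as a residual: ET = P - Q - dS.

    All terms share one unit (e.g. mm per year). Errors in any input
    pass straight into the result, which is the weakness noted above.
    """
    return precip - runoff - storage_change


def budyko_et(precip, pet):
    """ET from the Budyko (1974) curve (assumed formulation).

    ET / P = sqrt(phi * tanh(1 / phi) * (1 - exp(-phi))), phi = PET / P.
    The result never exceeds the water limit (P) or the energy limit (PET).
    """
    if precip <= 0.0 or pet <= 0.0:
        return 0.0
    phi = pet / precip  # aridity index
    ratio = math.sqrt(phi * math.tanh(1.0 / phi) * (1.0 - math.exp(-phi)))
    return precip * ratio


# Illustrative values only (mm per year) for a humid, energy-limited catchment.
P, Q, dS, PET = 900.0, 400.0, 10.0, 600.0

et_wb = et_water_balance(P, Q, dS)
et_budyko = budyko_et(P, PET)

print(f"ET from water balance: {et_wb:.1f} mm/yr")
print(f"ET from Budyko curve:  {et_budyko:.1f} mm/yr")
print(f"Energy limit (PET):    {PET:.1f} mm/yr")
```

Unlike the plain residual, the Budyko estimate stays at or below PET even when precipitation or storage inputs are biased, which is the sense in which energy constraints can stabilize a water-balance benchmark.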