Wireless sensor networks (WSNs) are increasingly used to monitor physical parameters in a wide range of environmental applications. In many instances, environmental scientists want to collect raw data for later analysis by injecting long-running queries into a WSN, rather than injecting snapshot queries containing data-reducing operators (e.g., MIN, MAX, AVG) that aggregate data. Collecting raw data is challenging for WSNs because very large amounts of data must be transported through the network. This leads to high energy consumption, and thus diminished network lifetime, and also to poor data quality, as much of the data may be lost due to the limited bandwidth of present-day sensor nodes. We alleviate this problem by allowing certain nodes in the network to aggregate data, exploiting the spatial and temporal correlations of physical parameters to eliminate the transmission of redundant data. In this article, we present a distributed scheduling algorithm that decides when a particular node should perform this novel type of aggregation. The scheduling algorithm autonomously reassigns schedules when it detects changes in network topology caused by failing or newly added nodes. These changes are detected using cross-layer information from the underlying MAC layer. We first present theoretical performance bounds for our algorithm. We then present simulation results, which indicate a reduction in message transmissions of up to 85% and an increase in network lifetime of up to 92% compared to collecting raw data. Our algorithm can also completely eliminate dropped messages caused by buffer overflow.