The temporal asynchrony between inputs to different sensory modalities has been shown to be a critical factor influencing the interaction between such inputs. We used scalp-recorded event-related potentials (ERPs) to investigate the effects of attention on the processing of audiovisual multisensory stimuli as the temporal asynchrony between the auditory and visual inputs varied across the audiovisual integration window (i.e., up to 125 ms). Randomized streams of unisensory auditory stimuli, unisensory visual stimuli, and audiovisual stimuli (consisting of the temporally proximal presentation of the visual and auditory stimulus components) were presented centrally while participants attended to either the auditory or the visual modality to detect occasional target stimuli in that modality. ERPs elicited by each of the contributing sensory modalities were extracted by signal processing techniques from the combined ERP waveforms elicited by the multisensory stimuli. This was done for each of five 50-ms subranges of stimulus onset asynchrony (SOA; e.g., V precedes A by 125–75 ms, by 75–25 ms, etc.). The extracted ERPs for the visual inputs of the multisensory stimuli were compared with one another and with the ERPs to the unisensory visual control stimuli, separately for when attention was directed to the visual modality and when it was directed to the auditory modality. The results showed that the attention effect on the right-hemisphere visual P1 was largest when auditory and visual stimuli were temporally aligned. In contrast, the N1 attention effect was smallest at this latency, suggesting that attention may play a role in the processing of the relative temporal alignment of the constituent parts of multisensory stimuli.
At longer latencies, an occipital selection negativity for the attended versus unattended visual stimuli was also observed, but this effect did not vary as a function of SOA, suggesting that by that latency a stable representation of the auditory and visual stimulus components had been established.