The article aims to model the verbal and prosodic features of emotional expression in interviews in order to investigate the potential for synergy between scholarly fields that take narrative as their object of study. Using a digital collection of oral history interviews that contains narrative aspects addressing war and violence in Croatia, we analyzed emotional expression through the words spoken and through pitch, vocal effort, and pause duration in the speech signal. The findings were correlated with the linear structure of the interviews as well as with question type. Our analysis indicates that emotion words contribute more to the overall expressed emotion in later parts of the interview, as well as after open questions and meaning questions. Similar patterns were found for pitch and pause duration, but not for vocal effort. Although the verbal expression of emotions was somewhat correlated with pause duration, the hypothesized correlation between the verbal and nonverbal features was not confirmed. The research also shows that the various expressive layers in the interviews, and the relations between them, are a suitable basis for computational modeling that may help track emotional personal narratives in interview collections. Additional research is needed to further develop the framework for the automated analysis of verbal and nonverbal cues, so that annotations can be generated automatically and used to explore spoken word collections.