In this paper we present our experiences with the cleaning of Web client and proxy usage logs, based on a long-term browsing study with 25 participants. A detailed clickstream log, recorded using a Web intermediary, was combined with a second log of user interface actions, which was captured by a modified Firefox browser for a subset of the participants. The consolidated data from both records revealed many page requests that were not directly related to user actions. For participants who had no ad-filtering system installed, these artifacts made up one third of all transferred Web pages. Three major reasons could be identified: HTML Frames and iFrames, advertisements, and automatic page reloads. The experiences made during the data cleaning process might help other researchers to choose adequate filtering methods for their data.
|Conference||Workshop on Logging Traces of Web Activity: The Mechanics of Data Collection, Edinburgh, Scotland|
|Period||23/05/06 → …|
- HMI-IE: Information Engineering
- HMI-HF: Human Factors