Spam is one of the most annoying problems on the Internet, and many approaches to fighting it have been proposed over the years. Most of these approaches scan the entire contents of e-mail messages to detect suspicious keywords and patterns. Although such approaches are relatively effective, they also have some disadvantages. An interesting question is therefore whether spam can be detected effectively without analyzing the entire contents of e-mail messages. The contribution of this paper is an alternative spam detection approach that relies solely on analyzing the origin (IP address) of e-mail messages, as well as any links to websites (URIs) within them. Compared to analyzing suspicious keywords and patterns, detecting and analyzing URIs is relatively simple. The IP addresses and URIs are compared to various kinds of blacklists; a hit increases the probability that the message is spam. Although the idea of using blacklists is well known, the novel idea proposed in this paper is the concept of ‘bad neighborhoods’. To validate our approach, a prototype was developed and tested on our university’s mail server, and the outcome was compared to SpamAssassin and to mail server log files. The comparison showed that our prototype has remarkably good detection capabilities (comparable to SpamAssassin) while putting only a small load on the mail server.
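The blacklist lookup described above can be illustrated with a minimal Python sketch. It is not the prototype's implementation: the abstract does not define a 'bad neighborhood', so the sketch assumes it means the /24 network around known blacklisted addresses, and the function names, the weights (1.0 and 0.5), and the threshold are all hypothetical choices made for illustration. URIs are assumed to have been resolved to host IP addresses beforehand.

```python
import ipaddress
from collections import Counter

# Assumption: a "neighborhood" is the /24 network containing an address.
PREFIX_LEN = 24


def neighborhood(ip):
    """Return the network (assumed /24) that contains the given IPv4 address."""
    return ipaddress.ip_network(f"{ip}/{PREFIX_LEN}", strict=False)


def build_neighborhoods(blacklisted_ips):
    """Count how many blacklisted addresses fall into each neighborhood."""
    return Counter(neighborhood(ip) for ip in blacklisted_ips)


def spam_score(sender_ip, uri_host_ips, blacklist, neighborhoods, threshold=2):
    """Score a message using only its origin and the hosts its URIs point to.

    A direct blacklist hit adds 1.0; an address that is not listed itself but
    lies in a neighborhood with at least `threshold` listed addresses adds 0.5.
    Both weights are illustrative assumptions.
    """
    score = 0.0
    for ip in [sender_ip, *uri_host_ips]:
        if ip in blacklist:
            score += 1.0
        elif neighborhoods[neighborhood(ip)] >= threshold:
            score += 0.5
    return score


# Example with documentation-range addresses (RFC 5737), not real blacklist data.
blacklist = {"192.0.2.10", "192.0.2.20", "198.51.100.5"}
hoods = build_neighborhoods(blacklist)
```

With these example values, `spam_score("192.0.2.99", [], blacklist, hoods)` scores an unlisted sender purely on its neighbors, which is the kind of inference a per-address blacklist cannot make. No message body is inspected at any point.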
- EC Grant Agreement nr.: FP6/026854