TY - GEN
T1 - Set-Oriented Mining for Association Rules in Relational Databases
AU - Houtsma, M.A.W.
AU - Swami, Arun
N1 - Conference code: 11
PY - 1995/2/21
Y1 - 1995/2/21
N2 - We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extensions.
AB - We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extensions.
U2 - 10.1109/ICDE.1995.380413
DO - 10.1109/ICDE.1995.380413
M3 - Conference contribution
SN - 0-8186-6910-1
SP - 25
EP - 33
BT - Proceedings of the Eleventh International Conference on Data Engineering, ICDE 1995
PB - IEEE
CY - Los Alamitos, CA
T2 - 11th International Conference on Data Engineering, ICDE 1995
Y2 - 6 March 1995 through 10 March 1995
ER -