Most of the previous studies on mining association rules are on mining intratransaction associations, i.e., the associations among items within the same transaction where the notion of the transaction could be the items bought by the same customer, the events happened on the same day, etc. In this study, we break the barrier of transactions and extend the scope of mining association rules from traditional single-dimensional, intratransaction associations to multidimensional, intertransaction associations. An intertransaction association describes the association relationships among different transactions. In a database of stock price information, an example of such an association is "if (company) A's stock goes up on day one, B's stock will go down on day two but go up on day four." In this case, no matter whether we treat company or day as the unit of transaction, the associated items belong to different transactions. Moreover, such an intertransaction association can be extended to associate multiple properties in the same rule, so that multidimensional intertransaction associations can also be defined and discovered. Mining intertransaction associations pose more challenges on efficient processing than mining intratransaction associations because the number of potential association rules becomes extremely large after the boundary of transactions is broken. In this study, we introduce the notion of intertransaction association rule, define its measurements: support and confidence, and develop an efficient algorithm, FITI (an acronym for "First Intra Then Inter"), for mining intertransaction associations, which adopts two major ideas: 1) an intertransaction frequent itemset contains only the frequent itemsets of its corresponding intratransaction counterpart; and 2) a special data structure is built among intratransaction frequent itemsets for efficient mining of intertransaction frequent itemsets. We compare FITI with EH-Apriori, the best algorithm in our previous proposal, and demonstrate a substantial performance gain of FITI over EH-Apriori. Further extensions of the method and its implications are also discussed in the paper.
|Number of pages||14|
|Journal||IEEE transactions on knowledge and data engineering|
|Publication status||Published - Jan 2003|
- DB-DM: DATA MINING