MB3-Miner: efficiently mining eMBedded subTREEs using Tree Model Guided candidate generation

H. Tan, T. Dillon, F. Hadzic, E. Chang, L. Feng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

18 Citations (Scopus)
47 Downloads (Pure)

Abstract

Tree mining has many useful applications in areas such as Bioinformatics, XML mining, and Web mining. Most of the formally represented information in these domains is in tree-structured form. In this paper we focus on mining frequent embedded subtrees from databases of rooted labeled ordered subtrees. We propose a novel embedding list representation that is suitable for describing embedded subtrees. This representation differs from the string-like and conventional adjacency list representations previously used for trees. We present the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach previously introduced in [8]. The key characteristic of the TMG approach is that, unlike the join approach, it generates only valid candidates that conform to the structural aspects of the data, and therefore enumerates fewer candidates. Our experiments with both synthetic and real-life datasets compare the technique against one of the state-of-the-art algorithms, TreeMiner [15], and demonstrate its effectiveness and efficiency.
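To make the notion of an embedded subtree concrete, the following is a minimal illustrative sketch, not the paper's embedding list representation or its TMG candidate generation. It assumes the definition commonly used in the tree mining literature: a pattern is embedded in an ordered labeled tree if its nodes can be mapped to tree nodes with matching labels so that each parent-child edge of the pattern corresponds to an ancestor-descendant pair in the tree and the left-to-right order of siblings is preserved. The tuple encoding of trees and the function names `flatten` and `is_embedded_subtree` are hypothetical choices made for this example.

```python
# A tree is a pair (label, [child subtrees]); children are ordered left to right.
# This encoding is an assumption made for the example only.


def flatten(tree):
    """Return parallel lists (labels, scope_end) in pre-order.

    The descendants of the node at pre-order position i are exactly
    the positions i + 1 .. scope_end[i].
    """
    labels, scope_end = [], []

    def visit(node):
        label, children = node
        pos = len(labels)
        labels.append(label)
        scope_end.append(pos)
        for child in children:
            visit(child)
        scope_end[pos] = len(labels) - 1  # last position inside this subtree

    visit(tree)
    return labels, scope_end


def is_embedded_subtree(pattern, data):
    """Brute-force check that `pattern` occurs in `data` as an embedded subtree."""
    labels, scope_end = flatten(data)

    def embed(p, t):
        # Map pattern node p onto data node t, then place p's children
        # somewhere among t's descendants.
        p_label, p_children = p
        if labels[t] != p_label:
            return False
        return place(p_children, 0, t + 1, scope_end[t])

    def place(children, k, lo, hi):
        # Place children[k:] on data nodes in the pre-order range lo..hi.
        # The next sibling must start after the scope of the previous one,
        # which keeps sibling order and forbids one sibling image being a
        # descendant of another.
        if k == len(children):
            return True
        for t in range(lo, hi + 1):
            if embed(children[k], t) and place(children, k + 1, scope_end[t] + 1, hi):
                return True
        return False

    return any(embed(pattern, t) for t in range(len(labels)))


# Data tree: a has children b and e; b has children c and d.
data = ("a", [("b", [("c", []), ("d", [])]), ("e", [])])

# c is a grandchild of a, so this pattern is embedded but not induced.
print(is_embedded_subtree(("a", [("c", []), ("e", [])]), data))
# Same labels in the wrong sibling order.
print(is_embedded_subtree(("a", [("e", []), ("c", [])]), data))
```

The check is exponential in the worst case and only decides whether a single pattern occurs in a single tree. A frequent subtree miner additionally has to enumerate candidate patterns and count their support across a database, which is the part the abstract's TMG approach addresses.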
Original language: Undefined
Title of host publication: Proceedings of the 1st International Workshop on Mining Complex Data 2005 (MCD 2005)
Place of Publication: Halifax, Nova Scotia, Canada
Publisher: IEEE Computer Society Press
Pages: 103-110
Number of pages: 8
ISBN (Print): 0-9738918-8-2
Publication status: Published - Nov 2005

Publication series

Name
Publisher: IEEE Computer Society Press

Keywords

  • DB-DM: DATA MINING
  • EWI-7338
  • IR-63537
  • METIS-229585

Cite this

Tan, H., Dillon, T., Hadzic, F., Chang, E., & Feng, L. (2005). MB3-Miner: efficiently mining eMBedded subTREEs using Tree Model Guided candidate generation. In Proceedings of the 1st International Workshop on Mining Complex Data 2005 (MCD 2005) (pp. 103-110). Halifax, Nova Scotia, Canada: IEEE Computer Society Press.