IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding

H. Tan, T.S. Dillon, F. Hadzic, E. Chang, L. Feng

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

32 Citations (Scopus)

Abstract

Tree mining has recently attracted a lot of interest in areas such as Bioinformatics, XML mining, Web mining, etc. We are mainly concerned with mining frequent induced and embedded subtrees. While more interesting patterns can be obtained when mining embedded subtrees, unfortunately mining such embedding relationships can be very costly. In this paper, we propose an efficient approach to tackle the complexity of mining embedded subtrees by utilizing a novel Embedding List representation, Tree Model Guided enumeration, and introducing the Level of Embedding constraint. Thus, when it is too costly to mine all frequent embedded subtrees, one can decrease the level of embedding constraint gradually up to 1, from which all the obtained frequent subtrees are induced subtrees. Our experiments with both synthetic and real datasets against two known algorithms for mining induced and embedded subtrees, FREQT and TreeMiner, demonstrate the effectiveness and the efficiency of the technique.
Original languageUndefined
Title of host publicationProceedings of the 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2006)
Place of PublicationBerlin
PublisherSpringer
Pages450-461
Number of pages12
ISBN (Print)3-540-33206-5
DOIs
Publication statusPublished - Apr 2006

Publication series

NameLecture Notes in Computer Science
PublisherSpringer Verlag
NumberIEEE Produ
Volume3918
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Keywords

  • DB-DM: DATA MINING
  • IR-63954
  • METIS-238783
  • EWI-9419

Cite this

Tan, H., Dillon, T. S., Hadzic, F., Chang, E., & Feng, L. (2006). IMB3-Miner: Mining Induced/Embedded Subtrees by Constraining the Level of Embedding. In Proceedings of the 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining (PAKDD 2006) (pp. 450-461). [10.1007/11731139_52] (Lecture Notes in Computer Science; Vol. 3918, No. IEEE Produ). Berlin: Springer. https://doi.org/10.1007/11731139_52