Using Element Clustering to Increase the Efficiency of XML Schema Matching

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

14 Citations (Scopus)
204 Downloads (Pure)

Abstract

Schema matching attempts to discover semantic mappings between the elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms prohibitively inefficient for large schemas. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas and reduces the overall matching load, and it makes it possible to trade off efficiency against effectiveness. The technique can be used in addition to other optimization techniques. We describe the technique, validate the performance of one implementation of it, and open directions for future research.
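The idea of inserting clustering as an intermediate step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: elements are reduced to plain name strings, `name_similarity` is a hypothetical stand-in for the name heuristic, and the greedy one-pass clustering is a simple placeholder for whatever clustering method an implementation actually uses. The sketch only shows how restricting comparisons to promising clusters lowers the matching load while risking missed mappings.

```python
from difflib import SequenceMatcher
from itertools import product


def name_similarity(a: str, b: str) -> float:
    """Hypothetical name heuristic: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_elements(elements, threshold):
    """Greedy one-pass clustering (an assumption, not the paper's method).

    Each element joins the first cluster whose seed (first member) is
    similar enough; otherwise it starts a new cluster.
    """
    clusters = []
    for element in elements:
        for cluster in clusters:
            if name_similarity(element, cluster[0]) >= threshold:
                cluster.append(element)
                break
        else:
            clusters.append([element])
    return clusters


def match_exhaustive(source, target, threshold):
    """Baseline: cross-compare every source element with every target element."""
    comparisons = 0
    mappings = []
    for s, t in product(source, target):
        comparisons += 1
        if name_similarity(s, t) >= threshold:
            mappings.append((s, t))
    return mappings, comparisons


def match_with_clustering(source, target, cluster_threshold, match_threshold):
    """Cluster the target first, then compare only inside promising clusters.

    The cost of building the clusters is not counted in this sketch.
    """
    clusters = cluster_elements(target, cluster_threshold)
    comparisons = 0
    mappings = []
    for s in source:
        for cluster in clusters:
            comparisons += 1  # one comparison against the cluster seed
            if name_similarity(s, cluster[0]) < cluster_threshold:
                continue  # skip the whole cluster: cheaper, but may miss matches
            for t in cluster:
                comparisons += 1
                if name_similarity(s, t) >= match_threshold:
                    mappings.append((s, t))
    return mappings, comparisons
```

Calling `match_exhaustive(source, target, 0.6)` and `match_with_clustering(source, target, 0.5, 0.6)` on the same inputs returns the mappings together with the number of comparisons performed, so the two counts can be compared directly. The clustered variant can only return a subset of the exhaustive mappings, and varying `cluster_threshold` is the knob that trades effectiveness for efficiency in this sketch.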
Original language: Undefined
Title of host publication: Proceedings of the 22nd International Conference on Data Engineering Workshops (ICDEW 2006)
Place of Publication: Los Alamitos, CA, USA
Publisher: IEEE
Pages: 45
Number of pages: 10
ISBN (Print): 0-7695-2571-7
DOIs
Publication status: Published - Apr 2006
Event: 22nd International Conference on Data Engineering, ICDE 2006 - Atlanta, United States
Duration: 3 Apr 2006 → 8 Apr 2006
Conference number: 22

Publication series

Name
Number: 2

Workshop

Workshop: 22nd International Conference on Data Engineering, ICDE 2006
Abbreviated title: ICDE
Country/Territory: United States
City: Atlanta
Period: 3/04/06 → 8/04/06

Keywords

  • IR-58927
  • METIS-238227
  • EWI-7539
  • DB-SDI: SCHEMA AND DATA INTEGRATION
  • DB-PRJBF: BELLFLOWER
