In this paper we present a systematic analysis of document retrieval using unstructured and structured queries within the score region algebra (SRA) structured retrieval framework. The behavior of di®erent retrieval models, namely Boolean, tf.idf, GPX, language models, and Okapi, is tested using the transparent SRA framework in our three-level structured retrieval system called TIJAH. The retrieval models are implemented along four elementary retrieval aspects: element and term selection, element score computation, score combination, and score propagation. The analysis is performed on a numerous experiments evaluated on TREC and CLEF collections, using manually generated unstructured and structured queries. Unstructured queries range from the short title queries to long title + description + narrative queries. For generating structured queries we exploit the knowledge of the document structure and the content used to semantically describe or classify documents. We show that such structured information can be utilized in retrieval engines to give more precise answers to user queries then when using unstructured queries.
|Place of Publication||Enschede|
|Publisher||Centre for Telematics and Information Technology (CTIT)|
|Number of pages||10|
|Publication status||Published - 1 Oct 2006|
|Name||CTIT Technical Report Series|
|Publisher||Centre for Telematics and Information Technology, University of Twente|
- DB-XMLIR: XML INFORMATION RETRIEVAL
Mihajlovic, V., Hiemstra, D., Blok, H. E., & Apers, P. M. G. (2006). Exploiting Query Structure and Document Structure to Improve Document Retrieval Effectiveness. (CTIT Technical Report Series; No. 06-57). Enschede: Centre for Telematics and Information Technology (CTIT).