This paper gives an overview of the tools and methods for Cross-Language Information Retrieval (CLIR) that were developed within the Twenty-One project. The tools and methods are evaluated with the TREC CLIR task document collection using Dutch queries on the English document base. The main issue addressed here is an evaluation of two approaches to disambiguation. The underlying question is whether substantial effort should be put into finding the correct translation for each query term before searching, or whether searching with more than one possible translation leads to better results. The experimental study suggests that, in terms of average precision, searching with ambiguities leads to better retrieval performance than searching with disambiguated queries.
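The contrast between the two query formulations can be illustrated with a minimal sketch. It is not the Twenty-One implementation: the dictionary entries, the function names, and the rule of keeping the first listed translation as a stand-in for disambiguation are all assumptions made for illustration, since the abstract does not describe the actual disambiguation method.

```python
from typing import Dict, List

# Hypothetical toy Dutch-to-English dictionary; entries are illustrative only.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "bank": ["bank", "bench", "couch"],
    "rente": ["interest"],
}


def translate_one(terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Disambiguated query: keep a single translation per source term.

    Assumption: the first listed translation stands in for whatever
    disambiguation step a real system would apply.
    """
    return [dictionary.get(term, [term])[0] for term in terms]


def translate_all(terms: List[str], dictionary: Dict[str, List[str]]) -> List[List[str]]:
    """Ambiguous query: keep every listed translation, grouped per source term."""
    return [list(dictionary.get(term, [term])) for term in terms]


dutch_query = ["bank", "rente"]
disambiguated = translate_one(dutch_query, TOY_DICTIONARY)
ambiguous = translate_all(dutch_query, TOY_DICTIONARY)
```

In this sketch a term missing from the dictionary is passed through untranslated. The grouped output of `translate_all` keeps alternatives for one source term together, so a retrieval system could treat them as variants of a single query term rather than as independent terms.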
|Title of host publication||Language Technology in Multimedia Information Retrieval|
|Subtitle of host publication||Proceedings of the Fourteenth Twente Workshop on Language Technology TWLT-14|
|Place of Publication||Enschede|
|Publisher||University of Twente|
|Number of pages||8|
|Publication status||Published - 1998|
|Event||Twente Workshop on Language Technology, TWLT 14: Language Technology in Multimedia Information Retrieval - University of Twente, Enschede, Netherlands|
|Duration||7 Dec 1998 → 8 Dec 1998|
|Conference number||14|
|Name||Twente Workshop on Language Technology|
|Workshop||Twente Workshop on Language Technology, TWLT 14|
|Period||7/12/98 → 8/12/98|
- Statistical Machine Translation
- Cross-Language Information Retrieval
Hiemstra, D., & de Jong, F. (1998). Cross-language retrieval in Twenty-One: using one, some or all possible translations? In Language Technology in Multimedia Information Retrieval: Proceedings of the Fourteenth Twente Workshop on Language Technology TWLT-14 (pp. 19-26). (Twente Workshop on Language Technology; No. 14). Enschede: University of Twente.