Abstract
The main obstacle to providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad hoc search. Our experimental results show that the topic-based model outperforms both the standard language model and the parsimonious model.
Original language | English |
---|---|
Title of host publication | Proceedings of the Dutch-Belgian Information Retrieval Workshop (DIR 2008) |
Editors | E. Hoenkamp, M. De Cock, V. Hoste |
Place of Publication | Enschede |
Publisher | Neslia Paniculata |
Pages | 65-71 |
Number of pages | 7 |
ISBN (Print) | 978-90-5681-282-9 |
Publication status | Published - 14 Apr 2008 |
Event | 8th Dutch-Belgian Information Retrieval Workshop, DIR 2008 - Maastricht, Netherlands. Duration: 14 Apr 2008 → 15 Apr 2008. Conference number: 8 |
Conference
Conference | 8th Dutch-Belgian Information Retrieval Workshop, DIR 2008 |
---|---|
Abbreviated title | DIR |
Country/Territory | Netherlands |
City | Maastricht |
Period | 14/04/08 → 15/04/08 |
Keywords
- DB-IR: INFORMATION RETRIEVAL
- IR-64722
- METIS-250952
- EWI-12277