Abstract
This paper describes ongoing work on language modelling and lexicon optimization for a Dutch speech recognition system for Spoken Document Retrieval: the collection and normalization of a training data set and the optimization of our recognition lexicon. It discusses how lexical coverage is affected by the amount of training data, by decompounding compound words, and by different selection methods for proper names and acronyms.
Original language | English |
---|---|
Title of host publication | Proceedings of Eurospeech 2001 - Scandinavia |
Editors | P. Dalsgaard, B. Lindberg, H. Benner |
Pages | 1085-1088 |
Publication status | Published - 2001 |
Event | Conference in Aalborg, Denmark: Proceedings of Eurospeech 2001 - Scandinavia - Duration: 1 Jan 1900 → … |
Conference
Conference | Conference in Aalborg, Denmark |
---|---|
Period | 1/01/00 → … |