Content-based retrieval has been identified as one of the most challenging problems, requiring multidisciplinary research across computer vision, information retrieval, artificial intelligence, databases, and other fields. In this paper, we address the specific aspect of inferring semantics automatically from raw video data. In particular, we present the Cobra video database management system, which supports the integrated use of different knowledge-based methods for mapping low-level features to high-level concepts. We focus on dynamic Bayesian networks (DBNs) and demonstrate how they can be used effectively to fuse the evidence obtained from different media information sources. The approach is validated in the particular domain of Formula 1 race videos. For that specific domain, we introduce a robust audio-visual feature extraction scheme and a text detection and recognition method. Based on numerous experiments performed with DBNs, we give some recommendations with respect to the modeling of temporal dependencies and different learning algorithms. Finally, we present the experimental results for the detection of excited speech and the extraction of highlights, as well as the advantageous query capabilities of our system.
|Number of pages||24|
|Publication status||Published - Mar 2002|
|Event||Workshop on Multimedia Data Document Engineering, MDDE 2002 - Prague, Czech Republic (24 Mar 2002 → 28 Mar 2002)|
- DB-MMR: MULTIMEDIA RETRIEVAL