Automatic estimation and recognition of poses from video enables a wide range of applications. The research described here is an important step towards the automatic extraction of 3D poses. We describe our approach to extracting the 2D joint locations of people in meeting videos. The key point is that we generalize over variations in the appearance of both people and scene, which results in robust detection of 2D joint locations. To detect the different limbs, we employ a number of limb locators, each of which uses a different set of image features. We evaluate our work on two videos recorded in a meeting context. Our results are promising, yielding an average error of approximately 3-5 cm per joint.
|Place of Publication||Enschede|
|Publisher||Centrum voor Telematica en Informatie Technologie|
|Number of pages||12|
|Publication status||Published - 6 Sep 2006|
|Name||CTIT Technical Report Series|
|Publisher||University of Twente, Centre for Telematics and Information Technology|
- EC Grant Agreement nr.: FP6/506811
Broekhuijsen, J., Poppe, R. W., & Poel, M. (2006). Estimating 2D Upper Body Poses from Monocular Images. (CTIT Technical Report Series; No. 06-55). Enschede: Centrum voor Telematica en Informatie Technologie.