Time Coherent Full-Body Poses Estimated Using Only Five Inertial Sensors: Deep versus Shallow Learning

Frank J. Wouda*, Matteo Giuberti, Nina Rudigkeit, Bert-Jan F. van Beijnum, Mannes Poel, Peter H. Veltink

*Corresponding author for this work

    Research output: Contribution to journal › Article › Academic › peer-review



    Full-body motion capture typically requires sensors or markers to be placed on each rigid body segment, which results in long setup times and is obtrusive. The number of sensors or markers can be reduced using deep learning or offline methods, but these require large training datasets and/or substantial computational resources. We therefore investigate the following research question: "What is the performance of a shallow approach, compared to a deep learning one, for estimating time coherent full-body poses using only five inertial sensors?" We propose to incorporate past and future inertial sensor information into a stacked input vector, which is fed to a shallow neural network for estimating full-body poses. The shallow and deep learning approaches are compared using the same input vector configurations, and the effect of including acceleration input is also evaluated. The results show that the shallow learning approach can estimate full-body poses with an accuracy (~6 cm) similar to that of the deep learning approach (~7 cm). However, the jerk errors are smaller with the deep learning approach, which may be an effect of its explicit recurrent modelling. Furthermore, the delay of the shallow learning approach (72 ms) is smaller than that of the deep learning approach (117 ms).
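    The stacked input vector mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `stack_window`, the window sizes, the toy feature values, and the choice to clamp indices at the start and end of a recording are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def stack_window(frames: Sequence[Sequence[float]], t: int,
                 n_past: int, n_future: int) -> List[float]:
    """Concatenate frames t-n_past .. t+n_future into one flat vector.

    Indices that fall outside the recording are clamped to the first or
    last frame (an assumption of this sketch, not stated in the abstract).
    """
    last = len(frames) - 1
    stacked: List[float] = []
    for k in range(t - n_past, t + n_future + 1):
        k = min(max(k, 0), last)  # clamp at the sequence boundaries
        stacked.extend(frames[k])
    return stacked


# Toy recording: 4 time steps, 2 features per step (hypothetical values).
frames = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
x = stack_window(frames, t=1, n_past=1, n_future=1)
print(x)  # [0.0, 0.1, 1.0, 1.1, 2.0, 2.1]
```

    In a sketch like this, each stacked vector would be the input for one pose estimate at time t. Note that any future frames in the window add output delay, because the estimate for time t cannot be produced until those frames have arrived.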

    Original language: English
    Article number: 3716
    Number of pages: 17
    Journal: Sensors (Switzerland)
    Early online date: 27 Aug 2019
    Publication status: Published - 1 Sept 2019


    • deep learning
    • human movement
    • inertial motion capture
    • LSTM
    • machine learning
    • neural networks
    • pose estimation
    • reduced sensor set
    • time coherence

