Time Coherent Full-Body Poses Estimated Using Only Five Inertial Sensors: Deep versus Shallow Learning

Frank J. Wouda*, Matteo Giuberti, Nina Rudigkeit, Bert-Jan F. van Beijnum, Mannes Poel, Peter H. Veltink

*Corresponding author for this work

    Research output: Contribution to journal › Article › Academic › peer-review


    Abstract

    Full-body motion capture typically requires sensors/markers to be placed on each rigid body segment, which results in long setup times and is obtrusive. The number of sensors/markers can be reduced using deep learning or offline methods. However, this requires large training datasets and/or sufficient computational resources. Therefore, we investigate the following research question: "What is the performance of a shallow approach, compared to a deep learning one, for estimating time-coherent full-body poses using only five inertial sensors?" We propose to incorporate past/future inertial sensor information into a stacked input vector, which is fed to a shallow neural network for estimating full-body poses. Shallow and deep learning approaches are compared using the same input vector configurations. Additionally, the inclusion of acceleration input is evaluated. The results show that a shallow learning approach can estimate full-body poses with a similar accuracy (~6 cm) to that of a deep learning approach (~7 cm). However, the jerk errors are smaller using the deep learning approach, which may be an effect of its explicit recurrent modelling. Furthermore, the delay of the shallow learning approach (72 ms) is smaller than that of the deep learning approach (117 ms).
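
    The abstract describes mapping a sliding window of past/future inertial-sensor frames, stacked into a single input vector, onto full-body pose parameters with a shallow neural network. The Python sketch below illustrates that idea only; it is not the authors' implementation, and the window sizes, feature dimensions, segment count, and the scikit-learn MLPRegressor used as a stand-in for the shallow network are all illustrative assumptions.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def stack_windows(frames, past, future):
        """Concatenate each frame with `past` preceding and `future` following
        frames into one flat input vector (frames near the edges are dropped)."""
        n_frames, _ = frames.shape
        rows = []
        for t in range(past, n_frames - future):
            window = frames[t - past : t + future + 1]   # (past + 1 + future, n_features)
            rows.append(window.reshape(-1))              # flatten the window to one vector
        return np.asarray(rows)

    # Toy data: 1000 frames from 5 sensors with 7 features each
    # (e.g. orientation quaternion + 3D acceleration); 23 segment positions as target.
    rng = np.random.default_rng(0)
    sensor_frames = rng.standard_normal((1000, 5 * 7))
    full_body_pose = rng.standard_normal((1000, 23 * 3))

    past, future = 5, 5                                  # using future frames implies an output delay
    X = stack_windows(sensor_frames, past, future)
    y = full_body_pose[past : len(full_body_pose) - future]

    # Shallow learner: a single hidden layer, in contrast to a deep/recurrent (LSTM) model.
    shallow_net = MLPRegressor(hidden_layer_sizes=(256,), max_iter=300)
    shallow_net.fit(X, y)
    predicted_poses = shallow_net.predict(X)

    In a setup like this, any use of future frames introduces a corresponding output delay, which is the quantity the abstract compares between the shallow and deep approaches (72 ms versus 117 ms).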

    Original language: English
    Number of pages: 17
    Journal: Sensors (Basel, Switzerland)
    Volume: 19
    DOI: 10.3390/s19173716
    Publication status: Published - 27 Aug 2019


    Keywords

    • deep learning
    • human movement
    • inertial motion capture
    • LSTM
    • machine learning
    • neural networks
    • pose estimation
    • reduced sensor set
    • time coherence

    Cite this

    @article{ab883fa903ac41fd97443661a9773b49,
    title = "Time Coherent Full-Body Poses Estimated Using Only Five Inertial Sensors: Deep versus Shallow Learning",
    abstract = "Full-body motion capture typically requires sensors/markers to be placed on each rigid body segment, which results in long setup times and is obtrusive. The number of sensors/markers can be reduced using deep learning or offline methods. However, this requires large training datasets and/or sufficient computational resources. Therefore, we investigate the following research question: {"}What is the performance of a shallow approach, compared to a deep learning one, for estimating time coherent full-body poses using only five inertial sensors?{"}. We propose to incorporate past/future inertial sensor information into a stacked input vector, which is fed to a shallow neural network for estimating full-body poses. Shallow and deep learning approaches are compared using the same input vector configurations. Additionally, the inclusion of acceleration input is evaluated. The results show that a shallow learning approach can estimate full-body poses with a similar accuracy (~6 cm) to that of a deep learning approach (~7 cm). However, the jerk errors are smaller using the deep learning approach, which can be the effect of explicit recurrent modelling. Furthermore, it is shown that the delay using a shallow learning approach (72 ms) is smaller than that of a deep learning approach (117 ms).",
    keywords = "deep learning, human movement, inertial motion capture, LSTM, machine learning, neural networks, pose estimation, reduced sensor set, time coherence",
    author = "Wouda, {Frank J.} and Matteo Giuberti and Nina Rudigkeit and {van Beijnum}, {Bert-Jan F.} and Mannes Poel and Veltink, {Peter H.}",
    year = "2019",
    month = "8",
    day = "27",
    doi = "10.3390/s19173716",
    language = "English",
    volume = "19",
    journal = "Sensors (Switserland)",
    issn = "1424-8220",
    publisher = "Multidisciplinary Digital Publishing Institute",

    }

    Time Coherent Full-Body Poses Estimated Using Only Five Inertial Sensors: Deep versus Shallow Learning. / Wouda, Frank J.; Giuberti, Matteo; Rudigkeit, Nina; van Beijnum, Bert-Jan F.; Poel, Mannes; Veltink, Peter H.

    In: Sensors (Basel, Switzerland), Vol. 19, 27.08.2019.
