Speech analysis for health: Current state-of-the-art and the increasing impact of deep learning

Nicholas Cummins; Alice Baird; Björn W Schuller

doi:10.1016/j.ymeth.2018.07.007

Methods. 2018 Dec 1;151:41-54. doi: 10.1016/j.ymeth.2018.07.007. Epub 2018 Aug 10.

Authors

Nicholas Cummins¹, Alice Baird², Björn W Schuller³

Affiliations

¹ ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany. Electronic address: nicholas.cummins@ieee.org.
² ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany.
³ ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Germany; GLAM - Group on Language, Audio & Music, Imperial College London, UK. Electronic address: bjoern.schuller@imperial.ac.uk.

PMID: 30099083

Abstract

Due to the complex and intricate nature of speech production, the acoustic-prosodic properties of a speech signal are modulated by a range of health-related effects. There is an active and growing area of machine learning research in this speech and health domain, focusing on developing paradigms to objectively extract and measure such effects. Concurrently, deep learning is transforming intelligent signal analysis, such that machines are now reaching near-human capabilities in a range of recognition and analysis tasks. Herein, we review current state-of-the-art approaches to speech-based health detection, placing a particular focus on the impact of deep learning within this domain. Based on this overview, it is evident that, while deep learning based solutions have become more present in the literature, deep learning has not had the same overall dominating effect seen in other related fields. In this regard, we suggest some possible research directions aimed at fully leveraging the advantages that deep learning can offer speech-based health detection.

Keywords: Challenges; Deep learning; Health; Paralinguistics; Speech.

Publication types

Research Support, Non-U.S. Gov't
Review

MeSH terms

Acoustics
Deep Learning / trends*
Humans
Neural Networks, Computer
Speech*