Int J Environ Res Public Health. 2022 May 23;19(10):6311.
doi: 10.3390/ijerph19106311.

Deep Learning for Infant Cry Recognition

Yun-Chia Liang et al. Int J Environ Res Public Health.

Abstract

Recognizing why an infant cries is challenging because babies cannot express their wishes or needs verbally, which makes it difficult for parents to identify their infants' needs and health status. This study used deep learning (DL) algorithms such as the convolutional neural network (CNN) and long short-term memory (LSTM) to recognize infants' needs such as hunger/thirst, need for a diaper change, emotional needs (e.g., need for touch/holding), and pain caused by medical treatment (e.g., injection). A classical artificial neural network (ANN) was also used for comparison. The inputs to the ANN, CNN, and LSTM were mel-frequency cepstral coefficient (MFCC) features extracted from 1607 ten-second audio recordings of infants. CNN and LSTM both performed well in differentiating healthy and sick infants, at around 95% in accuracy, precision, and recall. For recognizing infants' specific needs, CNN reached up to 60% accuracy, outperforming LSTM and ANN in almost all measures. These results could serve as indicators in future applications that help parents understand their infant's condition and needs.

Keywords: convolutional neural network; deep learning; infant cry recognition; long short-term memory.
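
The MFCC feature extraction mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It follows the usual textbook pipeline (framing, Hamming window, power spectrum, mel filterbank, log, DCT) and is not the authors' implementation. The sample rate, frame length, hop size, number of filters, number of coefficients, and the synthetic test tone are all assumptions chosen for illustration; the abstract does not state the values used in the study.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a mel value back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (slow, but needs no libraries)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for b in range(left, center):       # rising edge
            filt[b] = (b - left) / (center - left)
        for b in range(center, right):      # falling edge, peak at center
            filt[b] = (right - b) / (right - center)
        bank.append(filt)
    return bank


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame.

    All default parameter values are illustrative assumptions.
    """
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1))
              for t in range(frame_len)]
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_len)]
        power = power_spectrum(frame)
        energies = [sum(f * p for f, p in zip(filt, power)) for filt in bank]
        log_energies = [math.log(e + 1e-10) for e in energies]  # floor avoids log(0)
        features.append(dct2(log_energies, n_coeffs))
    return features


# Hypothetical input: a 0.2 s, 440 Hz tone standing in for a cry recording.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1600)]
features = mfcc(tone, sample_rate)
print(len(features), "frames x", len(features[0]), "coefficients")
```

The resulting frame-by-coefficient matrix is the kind of input a classifier such as an ANN, CNN, or LSTM would consume. For real ten-second recordings, an FFT-based routine would be far faster than the naive DFT used here.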

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Block diagram of MFCC.
Figure 2. Artificial neural network structure for two-class classification [12].
Figure 3. An illustrative example of the one-dimensional CNN structure [14].
Figure 4. An illustrative example of the LSTM structure [14].
Figure 5. An example of the data collection device and the sample infant.

References

    1. Adachi T., Murai N., Okada H., Nihei Y. Acoustic properties of infant cries and maternal perception. Tohoku Psychol. Folia. 1985;44:51–58.
    2. Patil H.A. Cry baby: Using spectrographic analysis. In: Neustein A., editor. Advances in Speech Recognition. Springer; New York, NY, USA: 2010. pp. 323–348.
    3. Yong B.F., Ting H., Ng K. World Congress on Medical. Springer; Prague, Czech: 2019. Baby cry recognition using deep neural networks; pp. 809–816.
    4. Reyes-Galaviz O.F., Arch-Tirado E. International Conference on Computers Helping People with Special Needs. Research Gate; Paris, France: 2004. Classification of infant crying to identify pathologies in recently born babies with ANFIS; pp. 408–415.
    5. Garcia J., García C. European Symposium on Artificial Neural Networks. d-side publi; Bruges, Belgium: 2003. Classification of infant cry using a scaled conjugate gradient neural; pp. 349–354.
