Review
PeerJ. 2022 Mar 21;10:e13152.
doi: 10.7717/peerj.13152. eCollection 2022.

Computational bioacoustics with deep learning: a review and roadmap

Dan Stowell. PeerJ.

Abstract

Animal vocalisations and natural soundscapes are fascinating objects of study, and contain valuable evidence about animal behaviours, populations and ecosystems. They are studied in bioacoustics and ecoacoustics, with signal processing and analysis an important component. Computational bioacoustics has accelerated in recent decades due to the growth of affordable digital sound recording devices, and to huge progress in informatics such as big data, signal processing and machine learning. Methods are inherited from the wider field of deep learning, including speech and image processing. However, the tasks, demands and data characteristics are often different from those addressed in speech or music analysis. There remain unsolved problems, and tasks for which evidence is surely present in many acoustic signals, but not yet realised. In this paper I perform a review of the state of the art in deep learning for computational bioacoustics, aiming to clarify key concepts and identify and analyse knowledge gaps. Based on this, I offer a subjective but principled roadmap for computational bioacoustics with deep learning: topics that the community should aim to address, in order to make the most of future developments in AI and informatics, and to use audio data in answering zoological and ecological questions.

Keywords: Acoustics; Animal vocal behaviour; Bioacoustics; Deep learning; Machine learning; Passive acoustic monitoring; Sound.

Conflict of interest statement

Dan Stowell is an Academic Editor for PeerJ.

Figures

Figure 1. Three common approaches to implementation of sound detection.
Adapted from Stowell et al. (2016b).
