2023 Oct 4;19(10):e1011541.
doi: 10.1371/journal.pcbi.1011541. eCollection 2023 Oct.

Adaptive representations of sound for automatic insect recognition

Marius Faiß et al. PLoS Comput Biol.

Abstract

Insect population numbers and biodiversity have been declining rapidly, and monitoring these trends has become increasingly important for conservation measures to be implemented effectively. But established monitoring methods are often invasive, time- and resource-intensive, and prone to various biases. Many insect species produce characteristic sounds that can be detected and recorded without large cost or effort. Using deep learning methods, insect sounds from field recordings could be automatically detected and classified to monitor biodiversity and species distribution ranges. We implement this using recently published datasets of insect sounds (up to 66 species of Orthoptera and Cicadidae) and machine learning methods, and evaluate their potential for acoustic insect monitoring. We compare the performance of the conventional spectrogram-based audio representation against LEAF, a new adaptive and waveform-based frontend. LEAF achieved better classification performance than the mel-spectrogram frontend by adapting its feature-extraction parameters during training. This result is encouraging for future implementations of deep learning technology for automatic insect sound recognition, especially as larger datasets become available.


Conflict of interest statement

I have read the journal’s policy and the authors of this manuscript have the following competing interests: DS is an Academic Editor for PLOS Computational Biology.

Figures

Fig 1
Fig 1. Two spectrograms of the same recording of Gryllus campestris.
Spectrogram A displays the frequency axis linearly in Hz. Spectrogram B uses the mel frequency scale, which compresses the frequency axis to show higher resolution in lower frequency bands than in higher bands, mimicking the human perception of frequency. Both spectrograms display the same spectrum of frequencies. Due to the mostly high-frequency information and empty low frequencies in this recording, the mel spectrogram B obscures a large amount of information compared to the linear spectrogram A.
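The compression in spectrogram B comes from the standard mel mapping. A minimal numpy sketch of why high frequencies lose resolution (the band count and 0–22.05 kHz range here are illustrative, not taken from the paper):

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy/HTK formula used by most mel filterbank implementations
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# 64 band edges spanning 0-22.05 kHz, spaced linearly on the mel scale
edges_hz = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(22050.0), 64))

# Successive bands grow monotonically wider in Hz: the mel scale trades
# high-frequency resolution for low-frequency resolution, which hurts
# recordings whose energy sits almost entirely in the high bands.
widths = np.diff(edges_hz)
```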
Fig 2
Fig 2. Example of the data augmentation workflow used on the training set (InsectSet47 and InsectSet66). Noise is added at a randomized signal-to-noise ratio and frequency distribution; then an impulse response from an outdoor location is applied at a randomized mix ratio.
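The two augmentation steps in the caption can be sketched in numpy. This is a generic stand-in, not the authors' pipeline: the helper names, SNR range, and synthetic impulse response are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mix_at_snr(signal, noise, snr_db):
    # Scale the noise so the mix has the requested signal-to-noise ratio
    sig_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(sig_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return signal + gain * noise

def apply_ir(signal, impulse_response, mix):
    # Convolve with the impulse response, level-match, and blend dry/wet
    wet = np.convolve(signal, impulse_response)[: len(signal)]
    wet *= np.max(np.abs(signal)) / (np.max(np.abs(wet)) + 1e-12)
    return (1.0 - mix) * signal + mix * wet

# Toy example: a sine tone, white noise at a random SNR, a decaying IR
x = np.sin(2 * np.pi * 5000 * np.arange(16000) / 16000)
noisy = mix_at_snr(x, rng.standard_normal(16000), snr_db=rng.uniform(0.0, 20.0))
ir = np.exp(-np.arange(400) / 50.0)
augmented = apply_ir(noisy, ir, mix=rng.uniform(0.2, 0.8))
```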
Fig 3
Fig 3. Classification outcome for all 32 species in the test set, using the best run of the mel frontend, which achieved 67% classification accuracy.
The vertical axis displays the true labels of the files, the horizontal axis shows the predicted labels, sorted alphabetically. Classifications within the two biggest genera Platypleura (green) and Myopsalta (red) are highlighted for comparison to the LEAF confusion matrix.
Fig 4
Fig 4. Classification outcome for all 32 species in the test set, using the best run of the LEAF frontend, which achieved 78% classification accuracy.
The vertical axis displays the true labels of the files, the horizontal axis shows the predicted labels, sorted alphabetically. Classifications within the two biggest genera Platypleura (green) and Myopsalta (red) are highlighted for comparison to the mel confusion matrix.
Fig 5
Fig 5. Center frequencies of all 64 filters used in the best performing LEAF run on InsectSet32.
Plots A and D show the initialization curve before training, which is based on the mel scale. Plots B and E show the deviation of each filter from its initialized position after training. Plots C and F show the filters sorted by center frequency and demonstrate the overall coverage of the frequency range, but do not represent the real ordering in the LEAF representations. Violin plots show the density of filters over the frequency spectrum; the orange line shows the initialization curve for comparison.
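The initialization curve in panels A and D can be sketched as 64 center frequencies spaced uniformly on the mel scale; during training LEAF updates each center independently, which is why panels C and F must re-sort the learned filters before plotting coverage. The frequency range and the random perturbation standing in for training are assumptions for illustration, not the paper's values.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def init_centers(n_filters=64, f_min=0.0, f_max=22050.0):
    # Uniform spacing on the mel scale: the pre-training curve in panels A/D
    return mel_to_hz(np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters))

centers = init_centers()
# Stand-in for training: each center drifts independently, so the learned
# filters are no longer monotonic in frequency ...
learned = centers + np.random.default_rng(0).normal(0.0, 200.0, centers.size)
# ... hence panels C/F sort by center frequency to show overall coverage.
coverage = np.sort(learned)
```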

