Review
Biol Rev Camb Philos Soc. 2016 Feb;91(1):13-52.
doi: 10.1111/brv.12160. Epub 2014 Nov 26.

Acoustic sequences in non-human animals: a tutorial review and prospectus

Arik Kershenbaum et al. Biol Rev Camb Philos Soc. 2016 Feb.

Abstract

Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well-known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often, however, researchers have only begun to characterise, let alone understand, the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near-future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled 'Analysing vocal sequences in animals'. Our goal is not just to present a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial-style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality.

Keywords: Markov model; acoustic communication; information; information theory; machine learning; meaning; network analysis; sequence analysis; vocalisation.

Figures

Fig. 1
Flowchart showing a typical analysis of animal acoustic sequences. In this review, we discuss identifying units, characterising sequences, and identifying meaning.
Fig. 2
Examples of the different criteria for dividing a spectrogram into units. (A) Separating units by silent gaps is probably the most commonly used criterion. (B) An acoustic signal may change its properties at a certain time, without the presence of a silent “gap”, for instance becoming harmonic or noisy. (C) A series of similar sounds may be grouped together as a single unit, regardless of silent gaps between them; a chirp sequence is labelled as C. (D) A complex hierarchical structure to the sequence, combining sounds that might otherwise be considered fundamental units.
Fig. 3
Example of cepstral processing of a grey wolf Canis lupus howl (below 6 kHz) and crickets chirping (above 6.5 kHz). The recording was sampled at Fs = 16 kHz with 8-bit quantization. (A) Standard spectrogram analysed with a 15 ms Blackman-Harris window. (B) Plot of the transform to the cepstral domain. Lower quefrencies are related to vocal tract information. F0 can be determined from the “cepstral bump” apparent between quefrencies 25–45 and can be derived as Fs/quefrency. (C) Cepstrum (inset) of the frame indicated by an arrow in A (2.5 s), along with reconstructions of the spectrum created from truncated cepstral sequences. Fidelity improves as the number of cepstra is increased.
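The relation F0 = Fs/quefrency from the caption above can be illustrated with a minimal sketch. It assumes quefrency is measured in samples, uses a synthetic pulse train rather than the wolf recording, and swaps in a Hann window for the Blackman-Harris window named in the caption. The function name `estimate_f0` and the frame length of 1024 samples are illustrative choices, not taken from the review.

```python
import numpy as np

def estimate_f0(frame, fs, q_min=25, q_max=45):
    """Estimate F0 from the largest cepstral peak between q_min and q_max samples."""
    windowed = frame * np.hanning(len(frame))          # Hann window (assumption)
    log_spectrum = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.fft.ifft(log_spectrum).real          # real cepstrum of the frame
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max + 1]))
    return fs / peak, peak                             # F0 = Fs / quefrency

FS = 16000                     # sampling rate given in the caption
frame = np.zeros(1024)         # hypothetical analysis frame
frame[::40] = 1.0              # synthetic pulse train with a 40-sample period

f0, quefrency = estimate_f0(frame, FS)
print(f"cepstral peak at quefrency {quefrency}, F0 estimate {f0:.1f} Hz")
```

With a 40-sample pulse period the cepstral peak should fall at quefrency 40, which corresponds to 16000/40 = 400 Hz, inside the 25–45 range the caption describes.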
Fig. 4
Perceptual constraints for the definition of sequence units. (A) Perceptual binding, where two discrete acoustic elements may be perceived by the receiver either as a single element, or as two separate ones. (B) Categorical perception, where continuous variation in acoustic signals may be interpreted by the receiver as discrete categories. (C) Spectrotemporal constraints, where if the receiver cannot distinguish small differences in time or frequency, discrete elements may be interpreted as joined.
Fig. 5
Graphical representation of the process of selecting an appropriate unit definition. (A) Determine what is known about the production mechanism of the signalling individual, from the hierarchy of production mechanisms, and their spectrotemporal differences. (B) Determine what is known about the perception abilities of the receiver (vertical axis), and how this may modify the production characteristics of the sound (horizontal axis). (C) Choose a classification method suitable for the modified acoustic characteristics (√ indicates suitable, × indicates unsuitable, ~ indicates neutral).
Fig. 6
Different ways that units can be combined to encode information in a sequence.
Fig. 7
Flowchart suggesting possible paths for the analysis of sequences of acoustic units. Exploratory data analysis is conducted on the units or timing information using techniques such as histograms, networks, or low-order Markov models. Preliminary embedding paradigm hypotheses are formed based on observations. Depending upon the hypothesised embedding paradigm, various analysis techniques are suggested. HMM, hidden Markov model.
Fig. 8
State transition diagram equivalent to a second-order Markov model and trigram model (N = 3) for a sequence containing As and Bs.
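As a rough illustration of the model in this caption, the sketch below estimates second-order (trigram) transition probabilities from a sequence of As and Bs: each state is the pair of preceding units, and the model gives the probability of the next unit. The example sequence and the function name `second_order_markov` are hypothetical and not taken from the review.

```python
from collections import Counter, defaultdict

def second_order_markov(sequence):
    """Return P(next unit | previous two units) estimated from trigram counts."""
    counts = defaultdict(Counter)
    for first, second, third in zip(sequence, sequence[1:], sequence[2:]):
        counts[(first, second)][third] += 1
    return {
        context: {unit: n / sum(nexts.values()) for unit, n in nexts.items()}
        for context, nexts in counts.items()
    }

# Hypothetical sequence of acoustic units A and B.
probs = second_order_markov("ABABBABAAB")
for context, nexts in sorted(probs.items()):
    print(context, nexts)
```

Each key of `probs` corresponds to one state of the diagram (a pair of units), and each inner entry to an outgoing arrow labelled with its estimated probability.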
Fig. 9
State transition diagram of a two-state (X, Y) hidden Markov model capable of producing sequences of acoustic units A and B. In state X, emission of units A and B is equally likely, Pe(A|X) = Pe(B|X) = 0.5; in state Y, unit A is much more likely, Pe(A|Y) = 0.9, than unit B, Pe(B|Y) = 0.1. The transition from state X to state Y occurs with probability Pt(X→Y) = 0.6, and the transition from state Y to state X with probability Pt(Y→X) = 0.3.
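The probabilities in this caption are enough to compute how likely the model is to produce a given sequence of units. The sketch below applies the standard forward algorithm to the two-state model. The self-transition probabilities (0.4 and 0.7) follow from the stated values, but the uniform initial state distribution is an assumption, since the caption does not give one; `sequence_likelihood` is an illustrative name.

```python
import numpy as np

UNITS = ["A", "B"]
# Rows: current state (X, Y); columns: next state (X, Y).
TRANSITION = np.array([[0.4, 0.6],    # Pt(X→Y) = 0.6, so Pt(X→X) = 0.4
                       [0.3, 0.7]])   # Pt(Y→X) = 0.3, so Pt(Y→Y) = 0.7
# Rows: state (X, Y); columns: emitted unit (A, B).
EMISSION = np.array([[0.5, 0.5],
                     [0.9, 0.1]])
INITIAL = np.array([0.5, 0.5])        # assumption: not stated in the caption

def sequence_likelihood(units):
    """Forward algorithm: probability that the HMM emits the given units."""
    if not units:
        return 1.0
    observed = [UNITS.index(u) for u in units]
    alpha = INITIAL * EMISSION[:, observed[0]]
    for unit in observed[1:]:
        alpha = (alpha @ TRANSITION) * EMISSION[:, unit]
    return float(alpha.sum())

for seq in ["A", "AA", "AB", "BBBB"]:
    print(seq, sequence_likelihood(seq))
```

Because state Y strongly favours unit A, runs of A receive higher likelihood than runs of B under these parameters.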
Fig. 10
Simple networks constructed from the sequence of acoustic units A, B and C. The undirected binary network (left) simply indicates that A, B, and C are associated with one another without any information about transition direction. The directed binary network (centre) adds ordering information, for example that C cannot follow A. The weighted directed network (right) shows the probabilities of the transitions between units based on a bigram model.
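The three network types in this caption can all be derived from bigram counts, as the following sketch suggests. The input sequence is hypothetical, chosen so that C never follows A as in the caption's example, and `build_networks` is an illustrative name rather than a function from the review.

```python
from collections import Counter

def build_networks(sequence):
    """Build undirected, directed, and weighted directed networks from bigrams."""
    bigrams = Counter(zip(sequence, sequence[1:]))
    directed = set(bigrams)                            # (source, target) pairs observed
    undirected = {frozenset(edge) for edge in directed if edge[0] != edge[1]}
    totals = Counter()
    for (source, _), n in bigrams.items():
        totals[source] += n
    # Edge weight = P(target | source) under a bigram model.
    weighted = {edge: n / totals[edge[0]] for edge, n in bigrams.items()}
    return undirected, directed, weighted

undirected, directed, weighted = build_networks("ABCBABCAB")
print(sorted(tuple(sorted(edge)) for edge in undirected))
print(sorted(directed))
print({edge: round(p, 3) for edge, p in sorted(weighted.items())})
```

In this example A and C are linked in the undirected network because C is followed by A, while the directed network shows that the reverse transition never occurs.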
Fig. 11
Grammar (rewrite rules) for approximating the sequence of acoustic units produced by Eastern Pacific blue whales Balaenoptera musculus. There are three acoustic units, a, b, and d (Oleson et al., 2007), and the sequence begins with a start symbol S. Individual b or d calls may be produced, or song, which consists of repeated sequences of an a call followed by one or more b calls. The symbol | indicates a choice, and ε, the empty string, indicates that the rule is no longer used. A derivation is shown for the song abbab. Underlined variables indicate those to be replaced. Grammar produced with contributions from Ana Širović (Scripps Institution of Oceanography).
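The rewrite rules themselves appear only in the figure, so the sketch below uses a hypothetical grammar that merely matches the verbal description in this caption: a single b or d call, or a song made of repeated phrases of one a call followed by one or more b calls. The non-terminal names `Song`, `Bs`, and `Rest` are assumptions, and the empty list stands for ε.

```python
from collections import deque

# Hypothetical rules consistent with the caption; an empty list is ε.
RULES = {
    "S": [["b"], ["d"], ["Song"]],
    "Song": [["a", "Bs", "Rest"]],
    "Bs": [["b", "Bs"], ["b"]],
    "Rest": [["Song"], []],
}

def generate(max_len):
    """Enumerate every string of at most max_len units derivable from S."""
    results = set()
    seen = {("S",)}
    queue = deque([("S",)])
    while queue:
        form = queue.popleft()
        # Leftmost derivation: find the first non-terminal still to be replaced.
        idx = next((i for i, sym in enumerate(form) if sym in RULES), None)
        if idx is None:
            results.add("".join(form))
            continue
        for production in RULES[form[idx]]:
            new_form = form[:idx] + tuple(production) + form[idx + 1:]
            n_terminals = sum(1 for sym in new_form if sym not in RULES)
            if n_terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results

print(sorted(generate(3)))
print("abbab" in generate(5))
```

The song abbab from the caption is derivable under these assumed rules, whereas strings such as ba or a alone are not.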
Fig. 12
The classes of formal grammars known as the Chomsky hierarchy (Chomsky, 1957). Each class is a generalisation of the class it encloses, and is more complex than the enclosed classes. Image publicly available under the Creative Commons Attribution-Share Alike 3.0 Unported license (https://commons.wikimedia.org/wiki/File:Wiki_inf_chomskeho_hierarchia.jpg).

