Comparative Study
J Neurosci. 2006 Oct 25;26(43):11023-33. doi: 10.1523/JNEUROSCI.3466-06.2006.

Probabilistic encoding of vocalizations in macaque ventral lateral prefrontal cortex

Bruno B Averbeck et al. J Neurosci. 2006.

Abstract

We examined strategies for classifying macaque vocalizations into their corresponding categories, and whether prefrontal auditory neurons showed evidence of being involved in this process. We found that static estimates of the spectral and temporal contrasts of the calls were not effective features for discriminating among the call classes. A hidden Markov model (HMM), however, was more effective, reaching a performance of almost 75% correct. Finally, we found that the responses of prefrontal auditory neurons could be predicted more accurately as linear functions of the probabilistic output of the HMM than as linear functions of the spectral features of the calls. This provides evidence that, for call recognition, the macaque auditory system likely performs dynamic processing of vocalizations, and that prefrontal auditory neurons carry a signal related to the output of this processing.
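The spectral front end used here (log filter-bank output followed by a DCT, keeping the first half of the coefficients; see Fig. 1d) resembles a standard cepstral analysis. A minimal NumPy sketch, assuming a precomputed power spectrogram and a hypothetical nonnegative filter bank; the paper's exact filter shapes and channel counts are not reproduced:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis: rows are cosine basis functions.
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    M = np.cos(np.pi * k * (2 * t + 1) / (2 * n))
    M[0] *= 1 / np.sqrt(n)
    M[1:] *= np.sqrt(2 / n)
    return M

def cepstral_features(power_spec, fbank, n_keep):
    """power_spec: (freq, time) spectrogram; fbank: (channels, freq) filter bank.
    Returns (n_keep, time) cepstral coefficients."""
    log_energy = np.log(fbank @ power_spec + 1e-10)   # log filter output
    D = dct_matrix(log_energy.shape[0])
    return (D @ log_energy)[:n_keep]                  # keep low-order coefficients
```

Keeping `n_keep = channels // 2` would correspond to the paper's choice of retaining the first half of the DCT coefficients.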


Figures

Figure 1.
Hidden Markov model. a, Probability of observation given hidden state. Bar graphs show the hidden state probabilities corresponding to each time slice in the spectrogram. Asterisks at the bottom of the spectrogram indicate points in the up sweep and down sweep that are spectrally similar. b, Transition probability matrix, showing the probability of transitioning from one state to another. c, Schematic of the model. Circles correspond to hidden (s) or observed (O) variables. Lines indicate statistical dependencies. Hidden states depend only on previous hidden states, and observed variables depend only on hidden states at the same point in time. d, Log filter output and cepstral coefficients for two example calls. The HMM was fit to the first half of the DCT coefficients.
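The generative structure in panels a–c supports classification via the standard forward algorithm: each category's HMM assigns the observation sequence a likelihood, and the call is assigned to the highest-scoring model. A discrete-observation sketch with toy parameters (the actual models emit continuous cepstral vectors; this is a simplified stand-in):

```python
import numpy as np

def forward_loglik(obs, log_pi, log_A, log_B):
    """log P(obs | model) for a discrete HMM.
    log_pi: (S,) initial-state log-probs; log_A: (S, S) transition log-probs
    (row = from-state); log_B: (S, V) emission log-probs."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Marginalize over the previous state in log space.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify(obs, models):
    """models: list of (log_pi, log_A, log_B), one per call category."""
    scores = [forward_loglik(obs, *m) for m in models]
    return int(np.argmax(scores)), scores
```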
Figure 2.
Characterization of spectral and temporal contrast. a, Spectrograms showing the time–frequency representation of the sounds. b, Modulation spectra, computed by taking the 2D Fourier transform of the spectrograms. c, Average frequency modulation for each call in a, obtained by averaging across the time axis of the spectrograms. d, Average temporal modulation for each call in a, obtained by averaging across the frequency axis.
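Panels b–d can be computed directly from the spectrogram: the modulation spectrum is the magnitude of its 2D Fourier transform, and the contrast profiles are averages along one axis. A sketch, assuming a real-valued (frequency × time) spectrogram array:

```python
import numpy as np

def modulation_spectrum(spectrogram):
    """Magnitude of the 2D Fourier transform of a (freq, time) spectrogram,
    shifted so zero modulation sits at the center (as in panel b)."""
    return np.abs(np.fft.fftshift(np.fft.fft2(spectrogram)))

def axis_averages(arr2d):
    """Average across the time axis (panel c) and across the frequency axis
    (panel d) of a (freq, time) array."""
    return arr2d.mean(axis=1), arr2d.mean(axis=0)
```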
Figure 3.
Classification with time and frequency contrast. a, Classification with the average time and frequency contrast is shown by the red line. Performance as a function of the number of factors extracted by NNMF is shown by the blue and green lines. Classification performance was assessed as factors were added to the model: time factors first, followed by frequency factors. The first point in both curves is the average. b, Individual performance of time and frequency factors, as a function of the number of factors. Again, the first point in both curves is the average. Subsequent points were derived by computing entropy on the NNMF factors extracted from the modulation spectra matrix. c, Distribution of samples from each of the 10 categories in the average spectral and temporal contrast space. The large overlap in the distributions indicates that these features do not separate the groups well. d, Example factors for two call types [grunts (GT) and harmonic arches (HA)] extracted by NNMF. The first factor was generally similar to the average. WB, warble; CO, coo; GK, gekker; GY, girney; SB, shrill bark; CS, copulation scream; SC, submissive scream; AG, aggressive call.
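The factors in b and d come from nonnegative matrix factorization (NNMF) of the modulation-spectra matrix. A minimal sketch using the Lee–Seung multiplicative updates, a common NNMF solver (the caption does not state which algorithm was actually used):

```python
import numpy as np

def nnmf(V, k, n_iter=500, seed=0):
    """Factor a nonnegative matrix V (m, n) as W (m, k) @ H (k, n), W, H >= 0."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.uniform(0.1, 1.0, (m, k))
    H = rng.uniform(0.1, 1.0, (k, n))
    eps = 1e-10
    for _ in range(n_iter):
        # Multiplicative updates reduce ||V - WH||_F while preserving nonnegativity.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Columns of W play the role of the extracted time or frequency factors; classification performance can then be tracked as columns are added, as in panel a.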
Figure 4.
Classification characteristics of the HMM. a, Performance of the HMM as a function of the number of frequency channels (F; see Materials and Methods) used to prefilter the vocalizations. b, Performance of the HMM as a function of time, expressed as the fraction of calls classified correctly. As time evolves, calls become easier to discriminate, and most of the information has been extracted within a few hundred milliseconds. Also shown is the number of calls at least as long as the time indicated. c, Clusters derived from the classification matrix. Coos (COs) and warbles (WBs) were often confused, as were submissive screams (SCs) and copulation screams (CSs). d, Multidimensional scaling representation of the same data, showing the categorical relationships in a continuous space. Abbreviations are the same as in Figure 3.
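The time-resolved performance in b can be read off the HMMs' cumulative log-likelihoods: at each frame, classify using only the evidence seen so far. A sketch, assuming per-frame log-likelihoods have already been computed for each call under each category model (hypothetical inputs, not the fitted values from the paper):

```python
import numpy as np

def fraction_correct_over_time(frame_logliks, labels):
    """frame_logliks: (n_calls, n_models, T) per-frame log-likelihoods;
    labels: (n_calls,) true category indices.
    Returns (T,) fraction of calls classified correctly using frames 0..t."""
    cumulative = np.cumsum(frame_logliks, axis=2)   # evidence up to each frame
    predicted = np.argmax(cumulative, axis=1)       # (n_calls, T)
    return (predicted == np.asarray(labels)[:, None]).mean(axis=0)
```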
Figure 5.
Classification performance of neurons as a function of bin width. a, Average and SEM (n = 122 neurons) classification performance as a function of the bin width. All analyses were performed in the same 300 ms window. SRATE indicates classification performance achieved by computing the spike rate across a response window equal to the length of the call. b, Classification performance as a function of time, at a bin width of 60 ms. Classification starts out relatively low, just above chance. As bins are accumulated in the analysis, the performance increases, as expected.
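The bin-width analysis amounts to counting spikes in fixed-width bins inside the 300 ms response window and feeding the resulting count vectors to a decoder. A sketch using a nearest-centroid decoder as an illustrative stand-in (the actual classifier is not specified in this excerpt):

```python
import numpy as np

def bin_spikes(spike_times, window, bin_width):
    """Spike counts in fixed-width bins covering [window[0], window[1])."""
    t0, t1 = window
    edges = np.arange(t0, t1 + 1e-9, bin_width)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts

def nearest_centroid_decode(train_X, train_y, test_X):
    """Assign each test count vector to the class with the nearest mean response."""
    classes = np.unique(train_y)
    centroids = np.array([train_X[train_y == c].mean(axis=0) for c in classes])
    d2 = ((test_X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]
```

As the bin width grows toward the full window, the count vector collapses toward a single spike rate, which corresponds to the SRATE condition in panel a.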
Figure 6.
Processing steps in the analysis. a, Spectrogram and log-filter representation of an example CS call. b, Cepstral representation of the example call and the STRF estimated across all calls for this neuron. c, Time-probability representation generated by processing with the 10 HMMs, and the LPRF for this neuron. d, Estimates of the response based on the STRF (cepstrum) and the LPRF (log-likelihood), shown along with the average response (PSTH). Abbreviations are the same as in Figure 3.
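Both receptive-field models in this analysis are linear maps from a lagged feature representation to the neuron's response: cepstral coefficients for the STRF, the HMMs' time-probability output for the LPRF. A least-squares sketch (the lag count and any regularization used in the paper are not reproduced; `n_lags` here is an illustrative parameter):

```python
import numpy as np

def fit_linear_rf(features, response, n_lags):
    """Least-squares linear receptive field mapping lagged features to a response.
    features: (n_chan, T); response: (T,). Returns rf (n_chan, n_lags), intercept.
    The same fit yields an STRF when features are cepstral coefficients and an
    LPRF when they are the HMMs' per-category probability traces."""
    n_chan, T = features.shape
    X = np.ones((T - n_lags + 1, n_chan * n_lags + 1))
    for i, t in enumerate(range(n_lags - 1, T)):
        X[i, 1:] = features[:, t - n_lags + 1 : t + 1].ravel()
    y = response[n_lags - 1 :]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:].reshape(n_chan, n_lags), coef[0]

def predict(features, rf, intercept):
    """Predicted response from a fitted receptive field, as in panel d."""
    n_chan, n_lags = rf.shape
    T = features.shape[1]
    out = np.empty(T - n_lags + 1)
    for i, t in enumerate(range(n_lags - 1, T)):
        out[i] = intercept + (rf * features[:, t - n_lags + 1 : t + 1]).sum()
    return out
```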
Figure 7.
Predicted responses of two example neurons across all call categories. Abbreviations are the same as in Figure 3.
Figure 8.
Fraction of variance accounted for by linear models, normalized by variance accounted for by PSTH. All analyses were done with twofold cross-validation. Only neurons whose response was significantly predicted by one of the models are shown. Arrows indicate example neurons in Figure 7, a and b.
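The comparison rests on cross-validated variance explained. A sketch of the twofold procedure for a linear model, assuming a design matrix X of model features and a response vector y (the normalization by PSTH-explained variance is omitted):

```python
import numpy as np

def twofold_cv_r2(X, y, seed=0):
    """Fraction of variance explained by a linear model under twofold
    cross-validation: fit on one half, evaluate on the other, average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    halves = np.array_split(idx, 2)
    r2 = []
    for train, test in [(halves[0], halves[1]), (halves[1], halves[0])]:
        Xtr = np.column_stack([np.ones(len(train)), X[train]])
        Xte = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        pred = Xte @ coef
        ss_res = ((y[test] - pred) ** 2).sum()
        ss_tot = ((y[test] - y[test].mean()) ** 2).sum()
        r2.append(1 - ss_res / ss_tot)
    return float(np.mean(r2))
```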
Figure 9.
Relation of the best model to neuron response properties. a, Performance difference as a function of call category. b, Histogram showing the difference in the proportion of neurons better modeled by the LPRF than by the STRF, as a function of the call category to which the neuron responded most strongly. Neurons that responded strongly to coos, girneys, shrill barks, and warbles were better modeled by the STRF, although the effect was only marginally significant (p = 0.053). Abbreviations are the same as in Figure 3.

