Front Psychol. 2011 Jun 9;2:110.
doi: 10.3389/fpsyg.2011.00110. eCollection 2011.

Toward a neural basis of music perception - a review and updated model

Stefan Koelsch. Front Psychol.

Abstract

Music perception involves acoustic analysis, auditory memory, auditory scene analysis, processing of interval relations, of musical syntax and semantics, and activation of (pre)motor representations of actions. Moreover, music perception potentially elicits emotions, thus giving rise to the modulation of emotional effector systems such as the subjective feeling system, the autonomic nervous system, the hormonal system, and the immune system. Building on a previous article (Koelsch and Siebel, 2005), this review presents an updated model of music perception and its neural correlates. The article describes processes involved in music perception, and reports EEG and fMRI studies that shed light on the time course of these processes and on where in the brain they might be located.

Keywords: EEG; ERAN; brain; fMRI; music; semantics.

Figures

Figure 1
Neurocognitive model of music perception. ABR, auditory brainstem response; BA, Brodmann area; ERAN, early right anterior negativity; FFR, frequency-following response; LPC, late positive component; MLC, mid-latency component; MMN, mismatch negativity; RATN, right anterior-temporal negativity; RCZ, rostral cingulate zone; SMA, supplementary motor area. Italic font indicates peak latencies of scalp-recorded evoked potentials.
Figure 2
(A) Examples of chord functions: The chord built on the first scale tone is denoted as the tonic, the chord on the second tone as the supertonic, and the chord on the fifth tone as the dominant. (B) The dominant–tonic progression represents a regular ending of a harmonic sequence (top), the dominant–supertonic progression is less regular and unacceptable as a marker of the end of a harmonic progression (bottom sequence, the arrow indicates the less regular chord). (C) ERPs elicited in a passive listening condition by the final chords of the two sequence types shown in (B). Both sequence types were presented in pseudorandom order equiprobably in all 12 major keys. Brain responses to irregular chords clearly differ from those to regular chords (best seen in the black difference wave, regular subtracted from irregular chords). The first difference between the two waveforms is maximal around 200 ms after the onset of the fifth chord (ERAN, indicated by the long arrow) and is taken to reflect processes of music-syntactic analysis. The ERAN is followed by an N5 taken to reflect processes of harmonic integration (short arrow). (D) Activation foci (small spheres) reported by functional imaging studies on music-syntactic processing using chord sequence paradigms (Koelsch et al., , ; Maess et al., ; Tillmann et al., 2003) and melodies (Janata et al., 2002a). Large yellow spheres show the mean coordinates of foci (averaged for each hemisphere across studies, coordinates refer to standard stereotaxic space). Reprinted from Koelsch and Siebel (2005).
Figure 3
Tree-structures (according to the GSM) for the sequences shown in Figure 2B, ending on a regular tonic (left), and on a supertonic (right). Dashed line: expected structure (≠: the tonic chord is expected, but a supertonic is presented); dotted lines: a possible solution for the integration of the supertonic. TR(→DR) indicates that the supertonic can still be integrated, e.g., if the expected tonic region is re-structured into a dominant region. TR, tonic region; DR, dominant region; SR, subdominant region. Lower-case letters indicate chord functions (functional–structural level), Roman numerals indicate the scale-degree structure, and the bottom row indicates the surface structure in terms of the naming of the chords.
Figure 4
Examples of experimental stimuli used in the studies by Koelsch et al. (2005b) and Steinbeis and Koelsch (2008b). Top: examples of two chord sequences in C major, ending on a regular (upper row) and an irregular chord (lower row, the irregular chord is indicated by the arrow). Bottom: examples of the three different sentence types. Onsets of chords (presented auditorily) and words (presented visually) were synchronous. Reprinted from Steinbeis and Koelsch (2008b).
Figure 5
Grand-average ERPs elicited by the stimuli shown in Figure 4. Participants monitored whether the sentences were (syntactically and semantically) correct or incorrect; in addition, they had to attend to the timbre of the chord sequences and to detect infrequently occurring timbre deviants. ERPs were recorded on the final chords/words and are shown for the different word conditions (note that only difference waves are shown). (A) The solid (blue) difference wave shows ERAN (indicated by the arrow) and N5 elicited on syntactically and semantically correct words. The dashed (green) difference wave shows ERAN and N5, elicited when chords are presented on morpho-syntactically incorrect (but semantically correct) words. Under the latter condition, the ERAN (but not the N5) is reduced. (B) The solid (blue) difference wave is identical to the solid difference wave of (A), showing the ERAN and the N5 (indicated by the arrow) elicited on syntactically and semantically correct words. The dotted (red) difference wave shows ERAN and N5, elicited when chords are presented on semantically incorrect (but morpho-syntactically correct) words. Under the latter condition, the N5 (but not the ERAN) is reduced. (C) shows the direct comparison of the difference waves in which words were syntactically incorrect (dashed, green line) or semantically incorrect (dotted, red line). These ERPs show that the ERAN is influenced by the morpho-syntactic processing of words, but not by the semantic processing of words. By contrast, the N5 is influenced by the semantic processing of words, but not by the morpho-syntactic processing of words. Data from Steinbeis and Koelsch (2008b).
Figure 6
Left: Examples of the four experimental conditions preceding a visually presented target word. Top panel: Prime sentence semantically related to (A), and unrelated to (B) the target word "wideness". The diagram on the right shows grand-averaged ERPs elicited by target words after the presentation of semantically related (solid line) and unrelated prime sentences (dotted line), recorded from a central electrode. Unprimed target words elicited a clear N400 component in the ERP (compared to the primed target words). Bottom panel: musical excerpt semantically related to (C), and unrelated to (D) the same target word. The diagram on the right shows grand-averaged ERPs elicited by target words after the presentation of semantically related (solid line) and unrelated prime sentence (dotted line). As after the presentation of sentences, unprimed target words elicited a clear N400 component (compared to primed target words). Each trial was presented once; conditions were distributed in random order, but counterbalanced across the experiment. Note that the same target word was used for the four different conditions. Thus, condition-dependent ERP effects elicited by the target words can only be due to the different preceding contexts. Reprinted from Koelsch et al. (2004).
Figure 7
Data from the experiments by Daltrozzo and Schön (2009a). The left panel shows ERPs elicited by target words (primed by short musical excerpts); the right panel shows ERPs elicited by musical excerpts (primed by target words). The thick line represents ERPs elicited by unrelated stimuli; the thin line represents ERPs elicited by related stimuli. Note that the difference in N1 and P2 components is due to the fact that words were presented visually, and musical excerpts auditorily. Both meaningfully unrelated words and meaningfully unrelated musical excerpts elicited N400 potentials.
