Front Psychol. 2017 May 4;8:666.
doi: 10.3389/fpsyg.2017.00666. eCollection 2017.

A Dynamical Model of Pitch Memory Provides an Improved Basis for Implied Harmony Estimation

Ji Chul Kim. Front Psychol. 2017.

Abstract

Tonal melody can imply vertical harmony through a sequence of tones. Current methods for automatic chord estimation commonly use chroma-based features extracted from audio signals. However, the implied harmony of unaccompanied melodies can be difficult to estimate on the basis of chroma content in the presence of frequent nonchord tones. Here we present a novel approach to automatic chord estimation based on the human perception of pitch sequences. We use cohesion and inhibition between pitches in auditory short-term memory to differentiate chord tones and nonchord tones in tonal melodies. We model short-term pitch memory as a gradient frequency neural network, which is a biologically realistic model of auditory neural processing. The model is a dynamical system consisting of a network of tonotopically tuned nonlinear oscillators driven by audio signals. The oscillators interact with each other through nonlinear resonance and lateral inhibition, and the pattern of oscillatory traces emerging from the interactions is taken as a measure of pitch salience. We test the model with a collection of unaccompanied tonal melodies to evaluate it as a feature extractor for chord estimation. We show that chord tones are selectively enhanced in the response of the model, thereby increasing the accuracy of implied harmony estimation. We also find that, like other existing features for chord estimation, the performance of the model can be improved by using segmented input signals. We discuss possible ways to expand the present model into a full chord estimation system within the dynamical systems framework.
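To make the idea of "tonotopically tuned nonlinear oscillators driven by audio signals" concrete, the following is a minimal sketch and not the model described in the abstract. It assumes a single layer of uncoupled Hopf-type oscillators spaced one semitone apart and driven by a pure tone; the nonlinear resonance between oscillators, the lateral inhibition, and the memory layer are all omitted. The function name `simulate_bank`, the parameter values (`alpha`, `beta`, `force`), and the exponential-Euler integration step are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def simulate_bank(stim_freq, freqs, alpha=-20.0, beta=-100.0,
                  force=20.0, sample_rate=8000, duration=0.3):
    """Drive a bank of uncoupled oscillators with a cosine tone.

    Each oscillator follows dz/dt = z * (alpha + i*2*pi*f + beta*|z|^2) + x(t).
    All parameter values are hypothetical and chosen only for illustration.
    Returns the final amplitude |z| of every oscillator.
    """
    dt = 1.0 / sample_rate
    z = [0j for _ in freqs]
    for n in range(int(duration * sample_rate)):
        # Real-valued stimulus, standing in for an audio signal.
        x = force * math.cos(2 * math.pi * stim_freq * n * dt)
        for k, f in enumerate(freqs):
            # Exponential-Euler step: stable for the fast rotation term.
            rate = alpha + 2j * math.pi * f + beta * abs(z[k]) ** 2
            z[k] = z[k] * cmath.exp(rate * dt) + dt * x
    return [abs(v) for v in z]


# Thirteen oscillators, one per semitone from A3 (220 Hz) to A4 (440 Hz).
freqs = [220.0 * 2 ** (k / 12) for k in range(13)]

# Stimulus tone at the natural frequency of oscillator 7 (E4).
amps = simulate_bank(freqs[7], freqs)
best = max(range(len(amps)), key=amps.__getitem__)
print("strongest oscillator index:", best)
```

In this sketch the oscillator whose natural frequency matches the stimulus ends with the largest amplitude, which is the sense in which the bank is tonotopic. The selective enhancement of chord tones reported in the abstract depends on the interactions between oscillators, which this sketch leaves out.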

Keywords: automatic chord estimation; dynamical system; gradient frequency neural network; implied harmony; neural oscillation; pitch memory; tonal melody.

Figures

Figure 1
Schematic of the dynamical model of short-term pitch memory. The colors and line widths used for different connection types are only for visual distinction and do not indicate their relative strengths.
Figure 2
The model's response to the opening of J. S. Bach's Violin Partita No. 3, BWV 1006, Prelude: (A) the musical score and (B) the amplitudes of Layer 1 and Layer 2 oscillators and stimulus tones. The stimulus (an audio signal) is depicted in a piano-roll representation. High-amplitude oscillations in Layer 2 (depicted with dark colors) are considered active pitch traces in auditory memory.
Figure 3
Comparison of the trace prolongations for chord tones and nonchord tones in the Mozart melodies. Mean note duration, mean trace duration and mean trace prolongation (i.e., trace duration − note duration) are shown. The error bars indicate standard errors.
Figure 4
Oscillatory traces formed in Layer 2 in response to the first two phrases (the first 15 chord spans) in Mozart Piano Sonata No. 11, K. 331, Theme. Vertical red lines demarcate chord spans, and horizontal lines indicate the pitches belonging to the chords. Chord annotations are based on both the melody and the accompaniment.
Figure 5
Difference between chord pitches and nonchord pitches in total trace duration and total note duration within each chord span in Mozart Piano Sonata, K. 331, Theme. The top panel shows a single simulation run with the entire melody, and the bottom panel shows simulations for individual chord spans run separately. CT and NCT denote chord tones and nonchord tones.
Figure 6
Mean difference between chord pitches and nonchord pitches in note duration, trace duration in single simulations and trace duration in segmented simulations, averaged over all chord spans in the seven Mozart melodies. The error bars indicate standard errors. CT and NCT denote chord tones and nonchord tones.
