EURASIP J Audio Speech Music Process. 2021;2021(1):30.
doi: 10.1186/s13636-021-00214-7. Epub 2021 Jul 28.

Musical note onset detection based on a spectral sparsity measure

Mina Mounir et al. EURASIP J Audio Speech Music Process. 2021.

Abstract

If music is the language of the universe, musical note onsets may be the syllables of this language. Not only do note onsets define the temporal pattern of a musical piece, but their time-frequency characteristics also carry rich information about the identity of the instrument producing the notes. Note onset detection (NOD) is a basic component of many music information retrieval tasks and has attracted significant interest in audio signal processing research. In this paper, we propose an NOD method based on a novel feature termed Normalized Identification of Note Onset based on Spectral Sparsity (NINOS2). The NINOS2 feature can be thought of as a spectral sparsity measure that aims to exploit the difference in spectral sparsity between the different parts of a musical note. This spectral structure is revealed when focusing on low-magnitude spectral components that are traditionally filtered out when computing note onset features. We present an extensive set of NOD simulation results covering a wide range of instruments, playing styles, and mixing options. The proposed algorithm consistently outperforms the baseline Logarithmic Spectral Flux (LSF) feature for the most difficult group of instruments, the sustained-string instruments. It also performs better in challenging scenarios, including polyphonic music and vibrato performances.
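To make the idea of a per-frame spectral sparsity measure on low-magnitude components concrete, here is a minimal Python sketch. It is not the NINOS2 definition from the paper, which the abstract does not give. It assumes a ratio of the 2-norm to the 4-norm of the log-magnitude STFT coefficients, computed over the lowest-magnitude bins of each frame and rescaled to [0, 1]. The function names, the frame length, the hop size, and the `keep_fraction` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def stft_log_magnitude(x, frame_len=2048, hop=512):
    """Return the log-magnitude STFT of x with shape (n_frames, n_bins).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop: i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)  # log(1 + |X|) keeps the values non-negative


def sparsity_odf(log_mag, keep_fraction=0.9):
    """Per-frame spread measure in [0, 1] from the lowest-magnitude bins.

    Assumption: keep the keep_fraction lowest-magnitude bins of each frame
    and compare their 2-norm with their 4-norm. A frame dominated by one
    bin scores 0; a frame with equal energy in all kept bins scores 1.
    """
    n_bins = log_mag.shape[1]
    n_keep = max(2, int(keep_fraction * n_bins))
    low = np.sort(log_mag, axis=1)[:, :n_keep]  # ascending: lowest bins first
    l2 = np.linalg.norm(low, ord=2, axis=1)
    l4 = np.linalg.norm(low, ord=4, axis=1)
    # All-zero frames get ratio 1, so they map to 0 instead of dividing by 0.
    ratio = np.divide(l2, l4, out=np.ones_like(l2), where=l4 > 0)
    return (ratio - 1.0) / (n_keep ** 0.25 - 1.0)
```

Applied frame by frame, `sparsity_odf(stft_log_magnitude(x))` yields a curve that can be passed to a peak picker. The rescaling uses the fact that the ratio of the 2-norm to the 4-norm lies between 1 and the fourth root of the vector length.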

Keywords: Music information retrieval; Music signal analysis; Music signal processing; Note onset detection; Sparsity.

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.

Figures

Fig. 1
Solution scheme for NOD. An example of the output of the different steps is shown when NOD is applied on an input music excerpt
Fig. 2
Temporal variation of signal energy per frequency bin and of spectral sparsity for electric guitar (major seventh stopped) excerpt. Note onsets are indicated by vertical lines. (Top and middle) Low- and high-energy log-magnitude spectrograms. (Bottom) Low- and high-energy STFT log-magnitude coefficient vector 1-norm variation
Fig. 3
Temporal variation of signal energy per frequency bin and of spectral sparsity for cello (non-vibrato) excerpt. Note onsets are indicated by vertical lines. (Top and middle) Low- and high-energy log-magnitude spectrograms. (Bottom) Low- and high-energy STFT log-magnitude coefficient vector 1-norm variation
Fig. 4
Temporal variation of signal energy per frequency bin and of spectral sparsity for trumpet (Bach) excerpt. Note onsets are indicated by vertical lines. (Top and middle) Low- and high-energy log-magnitude spectrograms. (Bottom) Low- and high-energy STFT log-magnitude coefficient vector 1-norm variation
Fig. 5
2-D unit-ball illustration of the relation between the 2-norm and 4-norm
Fig. 6
Comparison of ODFs and peak-picking results for electric guitar (major seventh stopped) excerpt. Vertical lines represent onset detection windows indicating ground-truth onsets. Circles are used to mark true positives in the peak-picking results, while false positives are marked with crosses
Fig. 7
Comparison of ODFs and peak-picking results for cello (non-vibrato) excerpt. Vertical lines represent onset detection windows indicating ground-truth onsets. Circles are used to mark true positives in the peak-picking results, while false positives are marked with crosses
Fig. 8
Comparison of ODFs and peak-picking results for trumpet (Bach) excerpt. Vertical lines represent onset detection windows indicating ground-truth onsets. Circles are used to mark true positives in the peak-picking results, while false positives are marked with crosses
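The captions of Figs. 6 to 8 describe detected peaks being scored against onset detection windows around the ground-truth onsets. A minimal sketch of such a scoring step follows. The function name `match_onsets`, the greedy nearest-match rule, and the 50 ms half-window are assumptions made for illustration, not details taken from the paper.

```python
def match_onsets(detected, truth, tolerance=0.05):
    """Score detected onset times (seconds) against ground-truth onsets.

    A detection counts as a true positive if it lies within +/- tolerance
    of a ground-truth onset that has not been matched yet. Every other
    detection is a false positive. tolerance=0.05 s is an assumed value.
    """
    detected = sorted(detected)
    truth = sorted(truth)
    used = [False] * len(truth)
    true_pos = 0
    for d in detected:
        best = None
        for i, t in enumerate(truth):
            if used[i] or abs(d - t) > tolerance:
                continue
            if best is None or abs(d - t) < abs(d - truth[best]):
                best = i  # closest unmatched ground-truth onset so far
        if best is not None:
            used[best] = True
            true_pos += 1
    false_pos = len(detected) - true_pos
    false_neg = len(truth) - true_pos
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return {"tp": true_pos, "fp": false_pos, "fn": false_neg,
            "precision": precision, "recall": recall, "f1": f1}
```

Each ground-truth onset can be matched at most once, so two detections inside the same window yield one true positive and one false positive.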
