Behav Res Methods. 2025 Jan 21;57(2):61.

doi: 10.3758/s13428-024-02577-z.

Conceptual coherence but methodological mayhem: A systematic review of absolute pitch phenotyping

Jane E Bairnsfather¹, Miriam A Mosing²,³,⁴, Margaret S Osborne²,⁵, Sarah J Wilson²

Affiliations

¹ Melbourne School of Psychological Sciences, University of Melbourne, Melbourne, Australia. jane.bairnsfather@gmail.com.
² Melbourne School of Psychological Sciences, University of Melbourne, Melbourne, Australia.
³ Department of Neuroscience, Karolinska Institutet, Stockholm, Sweden.
⁴ Department of Cognitive Neuropsychology, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany.
⁵ Melbourne Conservatorium of Music, University of Melbourne, Melbourne, Australia.

PMID: 39838215
PMCID: PMC11750914
DOI: 10.3758/s13428-024-02577-z

Jane E Bairnsfather et al. Behav Res Methods. 2025.

Abstract

Despite extensive research on absolute pitch (AP), there remains no gold-standard task to measure its presence or extent. This systematic review investigated the methods of pitch-naming tasks for the classification of individuals with AP and examined how our understanding of the AP phenotype is affected by variability in the tasks used to measure it. Data extracted from 160 studies (N = 23,221 participants) included (i) the definition of AP, (ii) task characteristics, (iii) scoring method, and (iv) participant scores. While there was near-universal agreement (99%) in the conceptual definition of AP, task characteristics such as stimulus range and timbre varied greatly. Ninety-five studies (59%) specified a pitch-naming accuracy threshold for AP classification, which ranged from 20 to 100% (mean = 77%, SD = 20), with additional variability introduced by 31 studies that assigned credit to semitone errors. When examining participants' performance rather than predetermined thresholds, mean task accuracy (not including semitone errors) was 85.9% (SD = 10.8) for AP participants and 17.0% (SD = 10.5) for non-AP participants. This review shows that the characterisation of the AP phenotype varies based on methodological choices in tasks and scoring, limiting the generalisability of individual studies. To promote a more coherent approach to AP phenotyping, recommendations about the characteristics of a gold-standard pitch-naming task are provided based on the review findings. Future work should also use data-driven techniques to characterise phenotypic variability to support the development of a taxonomy of AP phenotypes to advance our understanding of its mechanisms and genetic basis.

Keywords: Absolute pitch; Heritability; Methods; Phenotype.

Conflict of interest statement

Declarations. Conflict of interest: The authors declare no conflicts of interest pertaining to this manuscript. Ethics approval: As a review of published data, no ethics approval was required. Consent to participate: Not applicable. Consent for publication: Not applicable.

Figures

**Fig. 1**
Search strategy for 2019 search. *Note.* Further details for removal of studies not meeting inclusion criteria can be found in Table 2

**Fig. 2**
Publication trees showing relationships among tasks and their replication. *Note.* Studies are linked by *arrows*, with the arrowhead pointing towards the study that cites the previous study’s task. *Dotted lines* are for readability and are used the same way as *solid lines*. (A) Tasks used in multiple subsequent studies. Tasks that are direct replications of their parent task are in plain text, while those that are adaptations are in *grey*. (B) Tasks derived from reviews. No replication/adaptation distinction is made here as the source papers do not include specific tasks

**Fig. 3**
Accuracy thresholds used across studies. *Note.* Scores classified into AP, non-AP, and intermediate groups are shown by *shaded bars*. *Green bars* refer to AP performance, *red bars* to non-AP performance, and *teal/blue/orange* refer to intermediate groups. Lighter versions of the colours (e.g., Gruhn et al., 2018) indicate studies for which semitone error credit was applied. Hou et al. (2014, 2021, 2023) include cross-hatching over the AP group to denote that only white-key notes were used in this task. *Diagonal fill* indicates that no non-AP groups completed the pitch-naming task in these studies. *Asterisks* next to study names indicate that additional metrics were used beyond these thresholds to determine group membership. ^aThis paper is represented twice as it contains two separate studies. ^bN includes a non-musician group that did not complete the pitch-naming task

**Fig. 4**
Mean performance of AP participants in studies using raw accuracy scores. *Note. Error bars* are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The mean is shown by the *grey vertical line*

**Fig. 5**
Mean performance of AP participants in studies assigning credit to semitone errors. *Note. Error bars* are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The mean is shown by the *grey vertical line*

**Fig. 6**
Mean performance of non-AP participants in studies using raw accuracy scores. *Note. Error bars* are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The *blue vertical line* indicates chance performance (8.3%), and the mean is shown by the *grey vertical line*

**Fig. 7**
Mean performance of non-AP participants in studies assigning credit to semitone errors. *Note. Error bars* are 95% confidence intervals around the mean (omitted when relevant data were unavailable). As chance performance varies according to the amount of credit assigned to semitone errors, a chance line is not included. The mean is shown by the *grey vertical line*
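
The dependence of chance performance on semitone credit noted for Fig. 7 can be illustrated with a minimal sketch. It assumes a uniform random guesser over the 12 chromatic pitch classes with octave ignored, which is consistent with the 8.3% chance level given for Fig. 6. The function names and the example credit values are hypothetical and are not taken from the reviewed studies.

```python
from fractions import Fraction

N_PITCH_CLASSES = 12  # chromatic pitch classes, octave ignored (assumption)


def trial_score(target: int, response: int, semitone_credit: Fraction) -> Fraction:
    """Score one trial: 1 if correct, partial credit for a +/-1 semitone error, else 0."""
    # Circular distance between pitch classes (0 = C, ..., 11 = B),
    # so that B vs. C counts as a one-semitone error.
    diff = (response - target) % N_PITCH_CLASSES
    distance = min(diff, N_PITCH_CLASSES - diff)
    if distance == 0:
        return Fraction(1)
    if distance == 1:
        return semitone_credit
    return Fraction(0)


def chance_accuracy(semitone_credit: Fraction) -> Fraction:
    """Expected score of a uniform random guesser, averaged over all target/response pairs."""
    total = sum(
        trial_score(t, r, semitone_credit)
        for t in range(N_PITCH_CLASSES)
        for r in range(N_PITCH_CLASSES)
    )
    return total / N_PITCH_CLASSES**2


# Hypothetical credit values: none, half, three-quarters, full.
for credit in (Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    print(f"credit={float(credit):.2f}  chance={float(chance_accuracy(credit)):.1%}")
```

Under these assumptions, chance rises from 1/12 (about 8.3%) with no semitone credit to 1/6 (about 16.7%) with half credit and 1/4 (25%) with full credit, which is why a single chance line cannot be drawn across studies that weight semitone errors differently.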

**Fig. 8**
Pitch range of pitch-naming task stimuli. *Note.* (A) The pitch range as reported for 139/157 tasks. Each *blue line* represents a single task, with endpoints representing the upper and lower limits of each task’s specified range. Middle C (C4) is indicated with a *red vertical line*, while the range of a piano is shown by *green vertical lines*. (B) The correlation between mean pitch-naming performance and task stimulus range for all tasks regardless of scoring method, n = 77

**Fig. 9**
Timbres used in the pitch-naming tasks. *Note.* (A) Proportion of tasks using different timbres. (B) Mean pitch-naming performance of AP groups for these different timbres. *Error bars* are 95% confidence intervals around the mean

**Fig. 10**
Mean pitch-naming performance of AP groups for tasks with varying numbers of trials

**Fig. 11**
Mean pitch-naming performance of AP groups for tasks with varying stimulus duration

**Fig. 12**
Mean pitch-naming performance of AP groups for tasks with varying response window

**Fig. 13**
Mean pitch-naming performance of AP groups across response methods. *Note.* All tasks reporting the mean for their AP group are included in this figure regardless of scoring method (n = 66). *Error bars* are 95% confidence intervals around the mean

**Fig. 14**
Mean pitch-naming performance of AP groups according to the presence of a distracter sound. *Note.* All studies reporting the mean for their AP group are included in this figure regardless of scoring method. *Error bars* are 95% confidence intervals around the mean
