Behav Res Methods. 2025 Jan 21;57(2):61.
doi: 10.3758/s13428-024-02577-z.

Conceptual coherence but methodological mayhem: A systematic review of absolute pitch phenotyping

Jane E Bairnsfather et al. Behav Res Methods.

Abstract

Despite extensive research on absolute pitch (AP), there remains no gold-standard task to measure its presence or extent. This systematic review investigated the methods of pitch-naming tasks for the classification of individuals with AP and examined how our understanding of the AP phenotype is affected by variability in the tasks used to measure it. Data extracted from 160 studies (N = 23,221 participants) included (i) the definition of AP, (ii) task characteristics, (iii) scoring method, and (iv) participant scores. While there was near-universal agreement (99%) in the conceptual definition of AP, task characteristics such as stimulus range and timbre varied greatly. Ninety-five studies (59%) specified a pitch-naming accuracy threshold for AP classification, which ranged from 20 to 100% (mean = 77%, SD = 20), with additional variability introduced by 31 studies that assigned credit to semitone errors. When examining participants' performance rather than predetermined thresholds, mean task accuracy (not including semitone errors) was 85.9% (SD = 10.8) for AP participants and 17.0% (SD = 10.5) for non-AP participants. This review shows that the characterisation of the AP phenotype varies based on methodological choices in tasks and scoring, limiting the generalisability of individual studies. To promote a more coherent approach to AP phenotyping, recommendations about the characteristics of a gold-standard pitch-naming task are provided based on the review findings. Future work should also use data-driven techniques to characterise phenotypic variability to support the development of a taxonomy of AP phenotypes to advance our understanding of its mechanisms and genetic basis.

Keywords: Absolute pitch; Heritability; Methods; Phenotype.
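
The abstract distinguishes raw pitch-naming accuracy from scoring schemes that give credit for semitone errors. The following is a minimal sketch of how that choice changes a participant's score, not the scoring method of any reviewed study. The function names, the `semitone_credit` parameter, the example trials, and the decision to compare pitch classes while ignoring octave are all assumptions made for illustration.

```python
# Hypothetical scoring sketch: pitch classes only, octave ignored (an assumption).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_distance(target, response):
    """Smallest circular distance in semitones between two pitch classes (0 to 6)."""
    diff = abs(NOTE_NAMES.index(target) - NOTE_NAMES.index(response)) % 12
    return min(diff, 12 - diff)


def score_task(trials, semitone_credit=0.0):
    """Percent accuracy over (target, response) pairs.

    semitone_credit is the illustrative fraction of a point awarded when the
    response is exactly one semitone from the target; 0.0 gives raw accuracy.
    """
    total = 0.0
    for target, response in trials:
        distance = pitch_class_distance(target, response)
        if distance == 0:
            total += 1.0
        elif distance == 1:
            total += semitone_credit
    return 100.0 * total / len(trials)


# Made-up responses: two exact, two semitone errors, one larger error.
trials = [("C", "C"), ("F#", "G"), ("A", "A"), ("B", "C"), ("D", "F")]
raw_score = score_task(trials)                            # 40.0
lenient_score = score_task(trials, semitone_credit=0.75)  # 70.0
```

With these made-up responses the same participant scores 40% under raw accuracy and 70% when semitone errors earn three-quarters of a point, which is why an accuracy threshold cannot be compared across studies without knowing the scoring rule.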

Conflict of interest statement

Declarations. Conflict of interest: The authors declare no conflicts of interest pertaining to this manuscript. Ethics approval: As a review of published data, no ethics approval was required. Consent to participate: Not applicable. Consent for publication: Not applicable.

Figures

Fig. 1
Search strategy for 2019 search. Note. Further details for removal of studies not meeting inclusion criteria can be found in Table 2.
Fig. 2
Publication trees showing relationships among tasks and their replication. Note. Studies are linked by arrows, with the arrowhead pointing towards the study that cites the previous study’s task. Dotted lines are for readability and are used the same way as solid lines. (A) Tasks used in multiple subsequent studies. Tasks that are direct replications of their parent task are in plain text, while those that are adaptations are in grey. (B) Tasks derived from reviews. No replication/adaptation distinction is made here as the source papers do not include specific tasks.
Fig. 3
Accuracy thresholds used across studies. Note. Scores classified into AP, non-AP, and intermediate groups are shown by shaded bars. Green bars refer to AP performance, red bars to non-AP performance, and teal/blue/orange refer to intermediate groups. Lighter versions of the colours (e.g., Gruhn et al., 2018) indicate studies for which semitone error credit was applied. Hou et al. (2014, 2021, 2023) include a cross-hatching over the AP group to denote that only white-key notes were used in this task. Diagonal fill indicates that no non-AP groups completed the pitch-naming task in these studies. Asterisks next to study names indicate that additional metrics were used beyond these thresholds to determine group membership. ^a This paper is represented twice as it contains two separate studies. ^b N includes a non-musician group that did not complete the pitch-naming task.
Fig. 4
Mean performance of AP participants in studies using raw accuracy scores. Note. Error bars are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The mean is shown by the grey vertical line.
Fig. 5
Mean performance of AP participants in studies assigning credit to semitone errors. Note. Error bars are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The mean is shown by the grey vertical line.
Fig. 6
Mean performance of non-AP participants in studies using raw accuracy scores. Note. Error bars are 95% confidence intervals around the mean (omitted when relevant data were unavailable). The blue vertical line indicates chance performance (8.3%), and the mean is shown by the grey vertical line.
Fig. 7
Mean performance of non-AP participants in studies assigning credit to semitone errors. Note. Error bars are 95% confidence intervals around the mean (omitted when relevant data were unavailable). As chance performance varies according to the amount of credit assigned to semitone errors, a chance line is not included. The mean is shown by the grey vertical line.
Fig. 8
Pitch range of pitch-naming task stimuli. Note. (A) The pitch range as reported for 139/157 tasks. Each blue line represents a single task, with endpoints representing the upper and lower limits of each task’s specified range. Middle C (C4) is indicated with a red vertical line, while the range of a piano is shown by green vertical lines. (B) The correlation between mean pitch-naming performance and task stimulus range for all tasks regardless of scoring method, n = 77.
Fig. 9
Timbres used in the pitch-naming tasks. Note. (A) Proportion of tasks using different timbres. (B) Mean pitch-naming performance of AP groups for these different timbres. Error bars are 95% confidence intervals around the mean.
Fig. 10
Mean pitch-naming performance of AP groups for tasks with varying numbers of trials.
Fig. 11
Mean pitch-naming performance of AP groups for tasks with varying stimulus duration.
Fig. 12
Mean pitch-naming performance of AP groups for tasks with varying response window.
Fig. 13
Mean pitch-naming performance of AP groups across response methods. Note. All tasks reporting the mean for their AP group are included in this figure regardless of scoring method (n = 66). Error bars are 95% confidence intervals around the mean.
Fig. 14
Mean pitch-naming performance of AP groups according to the presence of a distracter sound. Note. All studies reporting the mean for their AP group are included in this figure regardless of scoring method. Error bars are 95% confidence intervals around the mean.
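
The captions for Figs. 6 and 7 note that chance performance is 8.3% under raw scoring and shifts once semitone errors earn credit. A rough sketch of that arithmetic follows, assuming a guesser who picks uniformly among 12 pitch classes and a scheme that awards a fixed fraction of a point to either semitone neighbour. The function name and the credit values shown are illustrative assumptions, not figures taken from the reviewed studies.

```python
from fractions import Fraction


def chance_percent(semitone_credit=0, n_classes=12):
    """Expected percent score for uniform random guessing.

    One of n_classes responses is exactly right (full credit) and two are a
    semitone away (each earning semitone_credit). Uniform guessing and a fixed
    per-neighbour credit are assumptions of this sketch.
    """
    exact = Fraction(1, n_classes)
    neighbours = Fraction(2, n_classes)
    return float(100 * (exact + neighbours * Fraction(semitone_credit)))


raw_chance = chance_percent()                   # 100/12, about 8.3
partial_chance = chance_percent(Fraction(3, 4)) # 100 * 5/24, about 20.8
full_chance = chance_percent(1)                 # 25.0
```

Under these assumptions the chance level is 100 × (1 + 2c)/12 percent for a semitone credit c, so it rises from about 8.3% with no credit to 25% when semitone errors count as fully correct. This is consistent with the Fig. 7 caption's decision to omit a single chance line.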
