Proc Natl Acad Sci U S A. 2022 Aug 30;119(35):e2202764119.
doi: 10.1073/pnas.2202764119. Epub 2022 Aug 23.

Genome-wide analyses of individual differences in quantitatively assessed reading- and language-related skills in up to 34,000 people

Else Eising  1 Nazanin Mirza-Schreiber  2 Eveline L de Zeeuw  3 Carol A Wang  4   5 Dongnhu T Truong  6 Andrea G Allegrini  7 Chin Yang Shapland  8   9 Gu Zhu  10 Karen G Wigg  11 Margot L Gerritse  1 Barbara Molz  1 Gökberk Alagöz  1 Alessandro Gialluisi  12   13   14 Filippo Abbondanza  15 Kaili Rimfeld  7   16 Marjolein van Donkelaar  1 Zhijie Liao  17 Philip R Jansen  18   19   20 Till F M Andlauer  12   21 Timothy C Bates  22 Manon Bernard  23 Kirsten Blokland  24 Milene Bonte  25 Anders D Børglum  26   27   28 Thomas Bourgeron  29 Daniel Brandeis  30   31   32   33 Fabiola Ceroni  34   35 Valéria Csépe  36   37 Philip S Dale  38 Peter F de Jong  39 John C DeFries  40   41 Jean-François Démonet  42 Ditte Demontis  26   27 Yu Feng  11 Scott D Gordon  10 Sharon L Guger  43 Marianna E Hayiou-Thomas  44 Juan A Hernández-Cabrera  45 Jouke-Jan Hottenga  3 Charles Hulme  46 Juha Kere  47   48 Elizabeth N Kerr  43   49   50 Tanner Koomar  51 Karin Landerl  52   53 Gabriel T Leonard  54 Maureen W Lovett  24   50 Heikki Lyytinen  55 Nicholas G Martin  10 Angela Martinelli  15 Urs Maurer  56 Jacob J Michaelson  51 Kristina Moll  57 Anthony P Monaco  58 Angela T Morgan  59   60   61 Markus M Nöthen  62 Zdenka Pausova  23   63 Craig E Pennell  4   5   64 Bruce F Pennington  65 Kaitlyn M Price  11   24   66 Veera M Rajagopal  26   27 Franck Ramus  67 Louis Richer  68 Nuala H Simpson  69 Shelley D Smith  70 Margaret J Snowling  69   71 John Stein  72 Lisa J Strug  73   74 Joel B Talcott  75 Henning Tiemeier  18   76 Marc P van der Schroeff  77   78 Ellen Verhoef  1 Kate E Watkins  69 Margaret Wilkinson  24 Margaret J Wright  79 Cathy L Barr  11   24   66 Dorret I Boomsma  3   80   81 Manuel Carreiras  82   83   84 Marie-Christine J Franken  77 Jeffrey R Gruen  6 Michelle Luciano  22 Bertram Müller-Myhsok  12   85 Dianne F Newbury  35 Richard K Olson  40 Silvia Paracchini  15 Tomáš Paus  86 Robert Plomin  7 Sheena Reilly  59   87 Gerd Schulte-Körne  57 J Bruce Tomblin  88 
Elsje van Bergen  3   80   89 Andrew J O Whitehouse  90 Erik G Willcutt  41 Beate St Pourcain  1   8   91 Clyde Francks  1   91   92 Simon E Fisher  1   91

Else Eising et al. Proc Natl Acad Sci U S A.

Abstract

The use of spoken and written language is a fundamental human capacity. Individual differences in reading- and language-related skills are influenced by genetic variation, with twin-based heritability estimates of 30 to 80% depending on the trait. The genetic architecture is complex, heterogeneous, and multifactorial, but investigations of contributions of single-nucleotide polymorphisms (SNPs) were thus far underpowered. We present a multicohort genome-wide association study (GWAS) of five traits assessed individually using psychometric measures (word reading, nonword reading, spelling, phoneme awareness, and nonword repetition) in samples of 13,633 to 33,959 participants aged 5 to 26 y. We identified genome-wide significant association with word reading (rs11208009, P = 1.098 × 10−8) at a locus that has not been associated with intelligence or educational attainment. All five reading-/language-related traits showed robust SNP heritability, accounting for 13 to 26% of trait variability. Genomic structural equation modeling revealed a shared genetic factor explaining most of the variation in word/nonword reading, spelling, and phoneme awareness, which only partially overlapped with genetic variation contributing to nonword repetition, intelligence, and educational attainment. A multivariate GWAS of word/nonword reading, spelling, and phoneme awareness maximized power for follow-up investigation. Genetic correlation analysis with neuroimaging traits identified an association with the surface area of the banks of the left superior temporal sulcus, a brain region linked to the processing of spoken and written language. Heritability was enriched for genomic elements regulating gene expression in the fetal brain and in chromosomal regions that are depleted of Neanderthal variants. Together, these results provide avenues for deciphering the biological underpinnings of uniquely human traits.

Keywords: genome-wide association study; language; meta-analysis; reading.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Reading- and language-related traits have a shared genetic architecture that is largely independent of performance IQ. (A) Genetic correlations (rg) among the reading- and language-related traits estimated with LDSC. Estimates are capped at one. Full LDSC results are reported in Dataset S4. In addition, genetic correlations are given between the GenLang traits and 1) performance IQ (using GenLang cohorts only); 2) educational attainment (EA; n = 766,345) and full-scale IQ (n = 257,828) (18); 3) noncognitive abilities involved in EA, resulting from a recent GWAS by subtraction study (n = 510,795) (19); and 4) components associated with distinct performance domains identified using a decomposition analysis of Danish school grades (n = 30,982) (20). Full results can be found in Dataset S8. *Significant genetic correlation after correction for 18.28 independent comparisons (P < 2.74 × 10−3); **P < 2.74 × 10−4; ***P < 2.74 × 10−5. (B) Three-factor model fitted to the GenLang summary statistics for word reading, nonword reading, spelling, phoneme awareness, nonword repetition, and performance IQ and to published GWAS summary statistics for full-scale IQ and EA (18) using GenomicSEM (21). Black and gray paths represent factor loadings with P < 0.05 and P > 0.05, respectively. Standardized factor loadings are shown, with SE in parentheses. The subscript g represents the genetic variables; the u variables represent the residual genetic variance not explained by the models. Unstandardized results and model fit indices are reported in Dataset S9.
Fig. 2.
The multivariate GenLang GWAS results show significant genetic correlation with the cortical surface area around the left STS. Genetic correlations (rg) were estimated with LDSC. Included traits are 58 structural brain imaging traits from the UK Biobank selected based on known links of regions and circuits with language processing. The results of the 22 cortical surface areas are shown; gray areas were not included in the analysis. Full results can be found in Dataset S12 and SI Appendix, Figs. S6 and S7. *Significant genetic correlation after correcting for 24.85 independent brain imaging traits (P < 2.01 × 10−3).
Fig. 3.
Genetic correlation results of the multivariate GenLang GWAS analysis with comparisons with those for the largest published GWAS of full-scale IQ in LD Hub. Summary statistics for full-scale IQ (n = 257,828) were obtained from the Social Science Genetic Association Consortium (18). Genetic correlations between the multivariate GenLang results (blue–green), full-scale IQ (purple), and traits in LD Hub reveal an overlap with cognitive traits, education, eyesight, chronotype, lifestyle, well-being, psychiatric disorders, pain, physical health and exercise, and socioeconomic status. A subset of representative traits is shown; 143 traits showed significant associations with the multivariate GenLang results, and 245 traits showed significant correlations with full-scale IQ, of which 135 traits overlap after correction for multiple testing for 535 × 2 traits (P < 4.67 × 10−5). Significant correlations are shown in dark colors; nonsignificant correlations are in light colors. Full results can be found in Dataset S13. UKBB: UK Biobank, GCSE: General Certificate of Secondary Education, NVQ: National Vocational Qualification, HND: Higher National Diploma, HNC: Higher National Certificate, CSE: Certificate of Secondary Education, PGC: Psychiatric Genomics Consortium, BMI: body mass index, SES: socioeconomic status. Genetic correlation (rg) is presented as a dot, and error bars indicate the SE.
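
The significance thresholds quoted in the three captions can be reproduced with simple arithmetic. The sketch below assumes a Bonferroni-style correction at a family-wise alpha of 0.05 divided by the stated number of independent tests; the captions do not name the alpha or the method, so both are assumptions, and the function name `corrected_threshold` is hypothetical.

```python
# Assumption: threshold = alpha / number of independent tests, with alpha = 0.05.
# The captions give only the test counts and the resulting thresholds.
ALPHA = 0.05


def corrected_threshold(n_tests: float, alpha: float = ALPHA) -> float:
    """Return the per-test significance threshold for n_tests comparisons."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# (number of independent tests, threshold reported in the caption)
captions = {
    "Fig. 1": (18.28, 2.74e-3),
    "Fig. 2": (24.85, 2.01e-3),
    "Fig. 3": (535 * 2, 4.67e-5),
}

for figure, (n_tests, reported) in captions.items():
    computed = corrected_threshold(n_tests)
    print(f"{figure}: computed {computed:.2e}, caption {reported:.2e}")
```

Under that assumption, the computed values agree with the captions to three significant figures. The stricter Fig. 1 thresholds (2.74 × 10−4 and 2.74 × 10−5) follow from the same division with alpha reduced tenfold each time.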

References

    1. Fisher S. E., Marcus G. F., The eloquent ape: Genes, brains and the evolution of language. Nat. Rev. Genet. 7, 9–20 (2006).
    2. Deriziotis P., Fisher S. E., Speech and language: Translating the genome. Trends Genet. 33, 642–656 (2017).
    3. Andreola C., et al., The heritability of reading and reading-related neurocognitive components: A multi-level meta-analysis. Neurosci. Biobehav. Rev. 121, 175–200 (2021).
    4. Graham S. A., Fisher S. E., Understanding language from a genomic perspective. Annu. Rev. Genet. 49, 131–160 (2015).
    5. St Pourcain B., et al., Common variation near ROBO2 is associated with expressive vocabulary in infancy. Nat. Commun. 5, 4831 (2014).
