Science. 2024 Nov 22;386(6724):eadn0609.
doi: 10.1126/science.adn0609. Epub 2024 Nov 22.

Reconstruction of the human amylase locus reveals ancient duplications seeding modern-day variation

Feyza Yilmaz et al. Science.

Abstract

Previous studies suggested that the copy number of the human salivary amylase gene, AMY1, correlates with starch-rich diets. However, evolutionary analyses are hampered by the absence of accurate, sequence-resolved haplotype variation maps. We identified 30 structurally distinct haplotypes at nucleotide resolution among 98 present-day humans, revealing that the coding sequences of AMY1 copies are evolving under negative selection. Genomic analyses of these haplotypes in archaic hominins and ancient human genomes suggest that a common three-copy haplotype, dating as far back as 800,000 years ago, has seeded rapidly evolving rearrangements through recurrent nonallelic homologous recombination. Additionally, haplotypes with more than three AMY1 copies have significantly increased in frequency among European farmers over the past 4000 years, potentially as an adaptive response to increased starch digestion.

Conflict of interest statement

Competing interests: C.L. is a scientific advisory board member of Nabsys and Genome Insight.

Figures

Fig. 1. Amylase structural haplotypes identified from present-day humans in this study.
(A) Segmental duplications (light to dark gray: 90 to 98% similarity; light to dark orange: >99% similarity), GENCODE V44 gene annotations, and long terminal repeats (LTRs) are represented as tracks. The lower panel shows amylase segments (colored arrows), and haplotype structures of GRCh38 and T2T-chm13 reference assemblies, represented as in silico maps with white backgrounds and vertical blue lines displaying optical mapping labels. The AMY2B segment overlaps the AMY2B gene, AMY2A.1 and AMY2A.2 segments overlap the AMY2A gene, and the AMY1 segment overlaps the AMY1 gene. (B) The high-confidence amylase structural haplotypes resolved in our dataset (n = 30). The vertical black line in the second AMY1 segment of H3r.3 represents the polymorphic label present in three alleles. The diagonal stripes in the second AMY2B segment of H3B2.1 indicate that it is a partial copy of the first AMY2B segment. Haplotype IDs: HX: X denotes the number of AMY1 copies; AY: Y denotes the number of AMY2A copies; BZ: Z denotes the number of AMY2B copies. The superscript “a” denotes the ancestral amylase haplotype structure, and the superscript “r” denotes the reference amylase haplotype structure. The number in parentheses indicates the number of alleles. (C) The distribution of common amylase haplotypes across 26 population samples. (D) The proportion of singletons for tandem repeat loci (EnsembleTR) across the genome. For adequate comparison, we used the same individuals (n = 33) for whom we were able to reconstruct amylase haplotypes in our dataset. Additionally, we filtered the tandem repeat loci (719 loci, unit length 1 to 6 bp) that we analyzed to match the number of distinct alleles (n = 21) observed in the amylase locus. The asterisk (*) represents the proportion of singletons among all distinct haplotypes (~67%, 14 out of 21) detected at the amylase locus. 
(E) Rarefaction and extrapolation sampling curve based on 52 amylase haplotypes, displaying how the number of distinct haplotypes (blue line) is projected to saturate with the increase in the number of alleles. The rate of change (red line, 0.15) indicates the number of previously unknown haplotypes discovered per unit increase in the number of analyzed alleles. The dashed line shows the proportion of estimated number of samples (85.25%) captured in our study.
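The haplotype ID convention described in panel (B) can be decoded mechanically. The following is a minimal sketch, not the authors' tooling: it assumes the superscripts "a" and "r" appear as plain letters directly after the AMY1 count (as in H3r.3 above) and that a trailing ".n" is a variant index, which the caption does not define. The names `AmylaseHaplotype` and `parse_haplotype_id` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class AmylaseHaplotype(NamedTuple):
    amy1: int                # X in "HX": number of AMY1 copies
    amy2a: Optional[int]     # Y in "AY": number of AMY2A copies, if stated
    amy2b: Optional[int]     # Z in "BZ": number of AMY2B copies, if stated
    tag: Optional[str]       # "a" = ancestral structure, "r" = reference structure
    variant: Optional[int]   # trailing ".n" (assumed to be a variant index)


# Assumed layout: H<X>[a|r][A<Y>][B<Z>][.<n>]
_PATTERN = re.compile(
    r"^H(?P<amy1>\d+)(?P<tag>[ar])?"
    r"(?:A(?P<amy2a>\d+))?(?:B(?P<amy2b>\d+))?"
    r"(?:\.(?P<variant>\d+))?$"
)


def parse_haplotype_id(hap_id: str) -> AmylaseHaplotype:
    """Split a haplotype ID into its copy-number components."""
    match = _PATTERN.match(hap_id)
    if match is None:
        raise ValueError(f"unrecognized haplotype ID: {hap_id!r}")

    def to_int(value: Optional[str]) -> Optional[int]:
        return int(value) if value is not None else None

    return AmylaseHaplotype(
        amy1=int(match["amy1"]),
        amy2a=to_int(match["amy2a"]),
        amy2b=to_int(match["amy2b"]),
        tag=match["tag"],
        variant=to_int(match["variant"]),
    )
```

Components that an ID omits are returned as `None` rather than guessed, since the caption does not say what an omitted AY or BZ field implies.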
Fig. 2. The variants in amylase coding sequences and negative selection on three amylase gene types.
(A) The maximum likelihood phylogenetic tree of amylase coding sequences (left) and dN/dS estimate for each amylase gene type (right). The phylogenetic tree is rooted with a coding sequence from the sheep genome (Oar_rambouillet_v1.0). The number of nucleotide and resultant amino acid changes that are paralog-specific are indicated. The numbers in parentheses indicate nucleotide and amino acid changes that are variable within the AMY1 branch. The numbers to the right of each bar represent the FDR-adjusted P value for the likelihood ratio between H0 (dN/dS ratio is fixed to one on the foreground branches) and H1 (two dN/dS ratios are allowed on the foreground and background branches, respectively) for each gene type. (B) The positions of amino acid variants within and between AMY2B, AMY2A, and AMY1 protein sequences. The 211 and 366 positions are highlighted because they overlap with a conserved section of the amylase protein sequence and have a predicted functional impact [AlphaMissense (90)]. The coordinates are based on the residues of the amylase enzyme from UniProtKB (accession: P0DUB6). Single-letter abbreviations for the amino acid residues are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W, Trp; and Y, Tyr.
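The likelihood ratio comparison in panel (A) can be illustrated with a short sketch. It is not the authors' pipeline: it assumes the test statistic is compared to a chi-square distribution with one degree of freedom and that the FDR adjustment is Benjamini-Hochberg, neither of which the caption states. The function names and any log-likelihood values passed in are hypothetical.

```python
import math
from typing import List, Sequence


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P value for a likelihood ratio test with 1 degree of freedom (assumed).

    lnl_null: log-likelihood with dN/dS fixed to one on the foreground branches.
    lnl_alt:  log-likelihood with separate foreground and background dN/dS ratios.
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Chi-square survival function for df = 1: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

With one test per gene type (AMY2B, AMY2A, AMY1), the three raw P values from `lrt_p_value` would be passed together to `benjamini_hochberg`.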
Fig. 3. Amylase gene duplication and the history of agriculture.
(A) The read depth of the amylase locus spanning the RNPC3, AMY2B, AMY2A, and AMY1 genes (chr1:103494306 to 103668306) for GoyetQ56-1 (maroon line), a Neanderthal excavated in present-day Belgium, showing signatures of AMY1 duplication, and for Denisova 11 (beige line), a hybrid hominin (Neanderthal and Denisovan) excavated from present-day Russia, showing signatures consistent with an ancestral single-copy AMY1 haplotype. The maroon and beige lines indicate average read depths with 5-kbp window and 1-kbp step for each sample. The average read depth of each 5-kbp window was normalized by the average read depth of the RNPC3 gene. Only uniquely mapped reads were used for this visualization. (B) A world map displaying the locations of ancient human samples. Sample locations are indicated with an “X”, with corresponding hexagons showing the estimated AMY1 copy number. Carbon dating (number of years before present; B.P.) estimated for each sample is indicated in blue. (C) Amylase copy number estimations from Europeans who were farmers (beige) and hunter-gatherers (maroon). Samples are binned on the x axis according to three time periods: preagriculture (before the transition to agriculture, >9000 yr B.P.), during the transition (~9000 to 4000 yr B.P.), and posttransition (complete transition to agriculture, ~4000 to 1000 yr B.P.). The shape of data points corresponds to sample dating estimates (Xs for more than 9000 yr B.P., circles for 9000 to 4000 yr B.P., and triangles for 4000 to 1000 yr B.P.). The inset of the right panel shows nonparametric regression lines for AMY1 copy number across time, for hunter-gatherers (maroon) and farmers (beige), with confidence intervals in lighter corresponding color, respectively. (D) Zoomed-in map showing the spread of agriculture into Europe from Asia. Major agricultural footholds are indicated by wheat pictograms. Tan arrows show general trends of human agricultural migration throughout Europe, with predicted time periods (58, 59). Ancient human samples that were analyzed for amylase copy number are annotated with shapes and colors for time period and lifestyle, respectively. [Figure created with BioRender]
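The windowed normalization described for panel (A) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: per-base depths from uniquely mapped reads are assumed to be available as a plain list, and the coordinates of the control interval (RNPC3 in the caption) must be supplied by the caller because the caption does not give them. The function name and parameters are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence, Tuple


def windowed_normalized_depth(
    depth: Sequence[float],
    region_start: int,
    control_start: int,
    control_end: int,
    window: int = 5000,
    step: int = 1000,
) -> List[Tuple[int, float]]:
    """Sliding-window mean depth divided by the mean depth of a control interval.

    depth[i] is the read depth at genomic position region_start + i.
    [control_start, control_end) is the control interval, given in the same
    coordinates and lying inside the region.
    Returns (window_start_position, normalized_depth) pairs.
    """
    control = depth[control_start - region_start:control_end - region_start]
    if len(control) == 0:
        raise ValueError("control interval does not overlap the region")
    control_mean = fmean(control)
    if control_mean == 0:
        raise ValueError("control interval has zero mean depth")

    result = []
    # Only full-length windows are reported.
    for offset in range(0, len(depth) - window + 1, step):
        window_mean = fmean(depth[offset:offset + window])
        result.append((region_start + offset, window_mean / control_mean))
    return result
```

Under this normalization, a window whose value is near 1 has the same depth as the control interval, and higher values point to additional copies relative to it.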
Fig. 4. The evolutionary and mutational connections among common haplotypes.
(A) The structural variation breakpoints and recombination hotspots in the amylase locus. The colored arrows represent amylase segments. The PRDM9 binding sites are represented with red dots. The nonallelic homologous recombination (NAHR) breakpoints are represented with purple, green, and orange dots, and dashed lines. The microhomology-mediated break-induced replication breakpoints are represented with gray dots. (B) NAHR-mediated duplication and deletion of the AMY1A-AMY1B cluster. The AMY2.2 (orange) segment serves as the recombination substrate for the crossover, resulting in the duplication or deletion of the AMY1A-AMY1B cluster as illustrated in the middle panel. Chimeric AMY2.2 segments have been identified, using parsimony informative sites within the AMY2.2 segment (right panel). (C) Microhomology-mediated break-induced replication-based copy number gain resulting in the formation of H2A2B2.1. The middle panel shows the mutational mechanism. Four nucleotides of microhomology internal to the breakends were identified at the breakpoint junction (right panel). [Figure created using BioRender]
Fig. 5. An evolutionary model of the human amylase genes and resulting hypotheses.
Top: A timeline of human amylase locus evolution based on the results of this study, with relevant events indicated on top. Middle: A schematic view showing the increase in AMY1 copy number variation and the mean number of AMY1 copies in present-day human populations throughout history as starch consumption increases. Bottom: A phylogenetic representation of the hypothesized amylase duplication timeline. [Figure created with BioRender]

References

    1. Perry GH et al. Diet and the evolution of human amylase gene copy number variation. Nat. Genet. 39, 1256–1260 (2007). doi: 10.1038/ng2123
    2. Nishide T et al. Sequences of cDNAs for human salivary and pancreatic α-amylases. Gene 28, 263–270 (1984). doi: 10.1016/0378-1119(84)90265-8
    3. Peyrot des Gachons C, Breslin PAS. Salivary Amylase: Digestion and Metabolic Syndrome. Curr. Diab. Rep. 16, 102 (2016). doi: 10.1007/s11892-016-0794-7
    4. Pajic P et al. Independent amylase gene copy number bursts correlate with dietary preferences in mammals. eLife 8, e44628 (2019). doi: 10.7554/eLife.44628
    5. Meisler MH, Ting CN. The remarkable evolutionary history of the human amylase genes. Crit. Rev. Oral Biol. Med. 4, 503–509 (1993). doi: 10.1177/10454411930040033501