eLife. 2022 Aug 2;11:e80008.
doi: 10.7554/eLife.80008.

A paternal bias in germline mutation is widespread in amniotes and can arise independently of cell division numbers

Marc de Manuel et al. eLife.

Abstract

In humans and other mammals, germline mutations are more likely to arise in fathers than in mothers. Although this sex bias has long been attributed to DNA replication errors in spermatogenesis, recent evidence from humans points to the importance of mutagenic processes that do not depend on cell division, calling into question our understanding of this basic phenomenon. Here, we infer the ratio of paternal-to-maternal mutations, α, in 42 species of amniotes, from putatively neutral substitution rates of sex chromosomes and autosomes. Despite marked differences in gametogenesis, physiologies and environments across species, fathers consistently contribute more mutations than mothers in all the species examined, including mammals, birds, and reptiles. In mammals, α is as high as 4 and correlates with generation times; in birds and snakes, α appears more stable around 2. These observations are consistent with a simple model, in which mutations accrue at equal rates in both sexes during early development and at a higher rate in the male germline after sexual differentiation, with a conserved paternal-to-maternal ratio across species. Thus, α may reflect the relative contributions of two or more developmental phases to total germline mutations, and is expected to depend on generation time even if mutations do not track cell divisions.

Keywords: amniotes; dna damage; dna repair; evolutionary biology; genetics; genomics; germline mutation; none; sex differences.


Conflict of interest statement

Md, FW: No competing interests declared. MP: Senior editor, eLife.

Figures

Figure 1. Estimating the paternal bias in mutation from neutral substitution rates of sex chromosomes and autosomes.
(A) On average, the lineage of an X chromosome spends fewer generations in males than females. Given a higher mutation rate in males than in females and all else being equal, this leads to lower rates of neutral substitutions on the X chromosome compared to autosomes (Miyata et al., 1987). (B) Procedure for estimating the ratio of paternal-to-maternal mutation rates, α, from substitution rates in sex chromosomes and autosomes. The autosomes and the X chromosome are partitioned into 1 Mb windows, depicted in purple and orange, respectively. Each window is filtered to focus on putatively neutrally evolving sequences (see Selecting non-repetitive and putatively neutral sequences in Materials and methods), and its GC content is calculated (represented by shading). The putatively neutral substitution rates per window are then regressed against the GC content (center panel, see Estimating α from X-to-autosome substitution rate ratios in Materials and methods). Substitution rate estimates for the X chromosome and autosomes are obtained from the regression fit (red points). Finally, the ratio of the point estimates is converted to an estimate of α (right panel). An analogous procedure applies to comparisons of the Z chromosome and autosomes in a ZW sex determination system.
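The final step of panel B, converting a substitution rate ratio into α, can be sketched as follows. This is a minimal illustration that assumes the classic relationship from Miyata et al. (1987) with equal paternal and maternal generation times: an X chromosome spends two thirds of its time in females, so X/A = (2/3)(2 + α)/(1 + α), and for a Z chromosome Z/A = (2/3)(1 + 2α)/(1 + α). The function name and the `system` argument are hypothetical and not taken from the authors' pipeline.

```python
def alpha_from_ratio(ratio: float, system: str = "XY") -> float:
    """Convert a sex-chromosome-to-autosome substitution rate ratio into alpha.

    Assumes equal generation times in both sexes (an assumption of this sketch).
    """
    r3 = 3.0 * ratio
    if system == "XY":
        # Invert X/A = (2/3) * (2 + alpha) / (1 + alpha)
        numerator, denominator = 4.0 - r3, r3 - 2.0
    elif system == "ZW":
        # Invert Z/A = (2/3) * (1 + 2*alpha) / (1 + alpha)
        numerator, denominator = r3 - 2.0, 4.0 - r3
    else:
        raise ValueError("system must be 'XY' or 'ZW'")
    if denominator <= 0:
        raise ValueError("ratio is outside the range that yields a finite alpha")
    return numerator / denominator


if __name__ == "__main__":
    print(alpha_from_ratio(0.8, "XY"))     # X/A of 0.8 corresponds to alpha near 4
    print(alpha_from_ratio(10 / 9, "ZW"))  # Z/A of 10/9 corresponds to alpha near 2
```

Under these formulas a ratio of 1 gives α = 1 in both systems, matching the no-sex-bias case.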
Figure 1—figure supplement 1. Identification of pseudo-autosomal regions in Thamnophis.
Depth of coverage along the Z chromosome of a heterogametic Thamnophis sirtalis individual. Each data point represents a 1 Mb window, colored by p-value (red if ≤ 0.05) assuming depth of coverage is Poisson distributed with λ=83.3 (the mean coverage in the autosomes of the same individual) (see Selecting non-repetitive and putatively neutral sequences in Materials and methods). We considered the Z genomic windows with similar coverage to that in the autosomes (gray points) to be pseudo-autosomal regions with homologs on the W chromosome.
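The coverage test described in this caption can be illustrated with a short sketch. It assumes a one-sided lower-tail test, P(X ≤ k) under a Poisson with λ = 83.3, and it rounds each window's mean depth to an integer; both choices are assumptions of this example, as are the function names, since the caption does not specify them.

```python
import math


def poisson_lower_tail(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Poisson(lam), summing terms computed in log space."""
    if k < 0:
        return 0.0
    log_lam = math.log(lam)
    total = sum(math.exp(i * log_lam - lam - math.lgamma(i + 1)) for i in range(k + 1))
    return min(1.0, total)


def flag_low_coverage(depths, lam=83.3, threshold=0.05):
    """Mark windows whose depth is significantly below the autosomal mean."""
    return [poisson_lower_tail(round(d), lam) <= threshold for d in depths]


if __name__ == "__main__":
    # Hypothetical window depths: roughly half coverage versus autosome-like coverage.
    print(flag_low_coverage([41.0, 83.0]))
```

Windows that are not flagged have autosome-like coverage and would be the pseudo-autosomal candidates in the caption's terms.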
Figure 2. Estimates of the paternal bias in mutation across 42 amniote lineages.
Colored points denote estimates of α from X (Z)-to-autosome substitution rate ratios (α^evo) in mammals (top, orange), birds (middle, blue), and snakes (bottom, green). Vertical colored lines denote the mean α^evo for each group, while the vertical gray dashed line denotes α=1 (i.e., no sex bias in mutation). Species in each group are plotted by their phylogenetic relationships and branch lengths are scaled by the neutral substitution rate estimated from autosomes (see Estimating putatively neutral substitution rates in Materials and methods). Note that branch lengths are comparable within the phylogeny of each taxon but not across taxa, as the scaling differs (see the legend for each group). In mammals, α^evo was estimated from neutral substitutions along the lineage from the tip to the most recent common ancestor indicated in the phylogeny. In birds, where phylogenetic relationships are more tenuous, we divided species into six subgroups (Supplementary file 5) to avoid highly uncertain ancestral nodes in Neoaves; thus, some α^evo estimates in Neoaves average over deeper splits than suggested by the full phylogeny, which we plot for clarity. Asterisks indicate species with chromosome-level assemblies. Darker colored horizontal lines behind the points represent 95% CIs, which were computed by bootstrap resampling of the 1 Mb genomic windows across 500 replicates; the central 95% interval across bootstrap replicates is shown. Lighter colored horizontal lines include uncertainty in the ratio of paternal-to-maternal generation times, allowing the ratio to range between 0.9 and 1.1 (Amster and Sella, 2016). Short vertical red lines denote point estimates of α^dnm from published pedigree mutation studies of de novo mutations, and the surrounding horizontal gray boxes represent the 95% binomial CI for those estimates.
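The bootstrap confidence intervals described above can be sketched as follows. This simplified example resamples 1 Mb windows with replacement 500 times and takes the central 95% of the replicate α values. It uses the ratio of mean X and autosomal rates rather than the GC-content regression of the main pipeline, assumes equal generation times in both sexes, and runs on hypothetical window rates; all names are illustrative.

```python
import random


def alpha_xy(ratio: float) -> float:
    """Invert X/A = (2/3) * (2 + alpha) / (1 + alpha), assuming equal generation times."""
    return (4.0 - 3.0 * ratio) / (3.0 * ratio - 2.0)


def mean(values):
    return sum(values) / len(values)


def bootstrap_alpha_ci(x_rates, a_rates, n_boot=500, seed=0):
    """Central 95% interval of alpha across bootstrap resamples of genomic windows."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        x_sample = rng.choices(x_rates, k=len(x_rates))  # resample X windows
        a_sample = rng.choices(a_rates, k=len(a_rates))  # resample autosomal windows
        replicates.append(alpha_xy(mean(x_sample) / mean(a_sample)))
    replicates.sort()
    lower = replicates[int(0.025 * n_boot)]
    upper = replicates[int(0.975 * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical per-window substitution rates (not data from the study).
    x_windows = [0.080 + 0.001 * (i % 5) for i in range(100)]
    a_windows = [0.098 + 0.001 * (i % 5) for i in range(300)]
    print(bootstrap_alpha_ci(x_windows, a_windows))
```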
Figure 2—figure supplement 1. α^evo for each species, obtained under variants of the pipeline presented in the main text.
In each plot, the α^evo inferred by the pipeline described in Figure 1 and used throughout the manuscript (shown on the x-axis) is compared to α^evo estimates from the same pipeline with one modified condition (y-axis): in (A), the mammalian α^evo is obtained using Mus musculus as the reference sequence instead of Homo sapiens (see Sequence alignments in Materials and methods); in (B), replication timing is included as a predictor in the regression (see Estimating α from X-to-autosome substitution rate ratios in Materials and methods); in (C), α^evo is obtained from the ratio of X (Z)-to-autosome mean substitution rates across genomic windows (see Estimating α from X-to-autosome substitution rate ratios in Materials and methods); in (D), α^evo is obtained from the mean of the ratio of X (Z)-to-autosome substitution rates, in windows with similar GC content (see Estimating α from X-to-autosome substitution rate ratios in Materials and methods; here, the α^evo CIs on the y-axis are extremely wide and are not shown); in (E), α^evo is obtained after masking all CpG sites in the genome sequence alignment (see Selecting non-repetitive and putatively neutral sequences in Materials and methods); in (F), α^evo is obtained after masking all conserved sites identified by phastCons, in addition to exons (see Selecting non-repetitive and putatively neutral sequences in Materials and methods); in (G), α^evo is estimated for sites unaffected by GC-biased gene conversion (see Estimating putatively neutral substitution rates in Materials and methods); in (H), α^evo is estimated by both excluding conserved regions as in F and using sites unaffected by GC-biased gene conversion as in G; and in (I), α^evo in birds and snakes is obtained including microchromosomes rather than masking them (see Selecting non-repetitive and putatively neutral sequences in Materials and methods). r² and mean absolute error (m) are shown in the bottom-right corner of each subplot. For each taxon, the mean α^evo obtained using the modified pipeline is annotated in the top-left corner, and the mean α^evo in the original pipeline can be found in the figure legend. Lines indicate ordinary least squares regression fits.
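The summary statistics reported in each subplot can be computed along these lines. This is a sketch that takes r² as the squared Pearson correlation and the mean absolute error as the mean of |y − x| across species; the caption does not spell out either definition, so both are assumptions, and the function name is hypothetical.

```python
def compare_estimates(x, y):
    """Return (slope, intercept, r2, mae) for an ordinary least squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)                         # squared Pearson correlation
    mae = sum(abs(b - a) for a, b in zip(x, y)) / n     # mean absolute difference
    return slope, intercept, r2, mae


if __name__ == "__main__":
    # Hypothetical alpha estimates from the original (x) and a modified (y) pipeline.
    original = [1.0, 2.0, 3.0, 4.0]
    modified = [1.1, 2.1, 3.1, 4.1]
    print(compare_estimates(original, modified))
```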
Figure 2—figure supplement 2. Expected equilibrium GC content (GC*) in the mammalian X chromosomes.
GC* is calculated as the fraction of the rate of substitutions favored by GC-biased gene conversion out of the total rate of substitutions affected by GC-biased gene conversion (i.e., GC* = (AT→GC) / (AT→GC + GC→AT)). To calculate GC* in 1 Mb genomic windows across the X chromosome of mammals, we used estimated rates of putatively neutral substitution for single mutation classes as described in Estimating putatively neutral substitution rates in Materials and methods. The vertical gray dashed lines at GC* = 0.5 are plotted to facilitate comparison between the distributions. The lineages leading to cats and pigs have unusually wide distributions of GC*, consistent with their idiosyncratic recombination landscapes in the X chromosome (Li et al., 2016).
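As a worked illustration of the definition above, GC* for a window is the AT→GC substitution rate divided by the sum of the AT→GC and GC→AT rates. The sketch below uses hypothetical rates and a hypothetical function name.

```python
def equilibrium_gc(rate_at_to_gc: float, rate_gc_to_at: float) -> float:
    """Expected equilibrium GC content: AT->GC rate over the total gBGC-affected rate."""
    total = rate_at_to_gc + rate_gc_to_at
    if total <= 0:
        raise ValueError("at least one substitution rate must be positive")
    return rate_at_to_gc / total


if __name__ == "__main__":
    print(equilibrium_gc(0.3, 0.3))  # equal rates in both directions
    print(equilibrium_gc(1.0, 3.0))  # GC->AT substitutions three times as frequent
```

Equal rates in the two directions give GC* = 0.5, the reference line drawn in the figure.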
Figure 2—figure supplement 3. Estimation of α^evo for mutation types affected or unaffected by GC-biased gene conversion (gBGC).
α estimates were obtained from the ratio of X (Z)-to-autosome substitution rates of four mutation types (see Estimating putatively neutral substitution rates in Materials and methods): all substitutions (white boxes), substitutions that gBGC acts against (i.e., changes from strong [S] nucleotides [C & G] to weak [W] nucleotides [A & T], blue), substitutions favored by gBGC (W>S, green), and substitutions unaffected by gBGC (W>W and S>S, orange). Note that the x-axis is log-scaled, and values extending below 0.5 or past 10 are truncated, with hollow markers showing point estimates that are outside of these bounds. Horizontal black lines separate species belonging to mammals, birds, and snakes (from top to bottom).
Figure 3. Relationship between α^evo and generation time estimates in mammals and birds.
Estimates of α from X (Z)-to-autosome comparisons are plotted against generation times from the literature (see Supplementary file 2), on a log scale. Lines denote the phylogenetic generalized least squares regression fits in mammals (orange) and birds (blue). λ refers to Pagel’s λ (Pagel, 1999), a measure of the strength of phylogenetic signal, which was inferred via maximum likelihood (see Testing relationships between α and life history traits in Materials and methods). Fixing λ to 1 in birds, as estimated for mammals, did not meaningfully improve the fit (p-value = 0.282, r² = 0.08).
Figure 3—figure supplement 1. Relationship between mammalian α^evo and various life history traits.
α estimates from the ratio of X-to-autosome substitution rates were regressed against five life history traits separately, using phylogenetic generalized least squares. The estimates were also regressed against the first two principal components (PCs, see Figure 3—figure supplement 2 and Testing relationships between α and life history traits in Materials and methods) obtained in a PC analysis of the traits gestation, birth weight, adult weight, and generation time. (For reference, the α^evo comparison with generation time from Figure 3 is reproduced here.) Best fit lines are drawn in orange; regression statistics are indicated in the upper left corner of each subplot. Pagel’s λ (Pagel, 1999) was estimated by maximum likelihood (see Testing relationships between α and life history traits in Materials and methods).
Figure 3—figure supplement 2. Principal component (PC) analysis of four life history traits.
Gestation, birth weight, adult weight, and generation time measurements in mammals were gathered from the literature (see Testing relationships between α and life history traits in Materials and methods and Supplementary file 2) and subjected to PC analysis. Species are projected onto the two leading PCs, which are shown with the amount of variance captured by each in parentheses. Blue arrows denote loadings for the trait variables.
Figure 4. Variation in α among mammals may reflect varying exposures to different developmental stages.
(A) Ratio of paternal-to-maternal de novo mutations (DNMs) occurring in early embryogenesis (Early, white points), after the sexual differentiation of the germline (Late, gray points), and in both of these stages combined (Total, red line), for the three mammalian species in which this classification is available (mouse: Lindsay et al., 2019; cattle: Harland et al., 2017; human: Sasani et al., 2019). For each species, the percentage of DNMs occurring at each stage is indicated and used to scale the size of the points. Vertical lines show the 95% binomial CIs. Since the phasing rate is not equal across developmental stages, point estimates for α in Total were computed by extrapolating the proportion of paternally and maternally phased DNMs in each stage to all the DNMs in that stage (i.e., assuming full phasing) (see Estimating α from pedigree studies in vertebrates in Materials and methods). (B) Schematic representation of a model in which α is the outcome of mutation in two developmental stages (see Modeling the effects of germline developmental stages on α in Materials and methods). (C) Expected relationship between α and generation time under the model outlined in B, assuming generation times are the same in both sexes. The increase of α with generation time depends on the paternal mutation rate per year in Late, μm, as illustrated by the purple gradient. (D) Fits of predicted α values to α^evo (orange) and α^dnm (gray). In each species, α is predicted with Equation 1 assuming Me = 1.66×10⁻⁹ and using μf and μm, the latter estimated from autosomal branch-specific substitution rates per year (α^evo) or as estimated from pedigree sequencing data (α^dnm) (see Modeling the effects of germline developmental stages on α in Materials and methods). The orange and gray lines denote the regression fit using phylogenetic generalized least squares (PGLS). PGLS statistics are shown for the two models (see Figure 3 legend for details).
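The extrapolation used for the Total estimate in panel A can be sketched as follows: within each stage, the paternal fraction among phased DNMs is applied to all DNMs in that stage, and the extrapolated paternal and maternal counts are then summed across stages. The counts below are hypothetical and the function name is illustrative; this is one plausible reading of the caption, not the authors' code.

```python
def alpha_total(stages):
    """Estimate alpha across stages, assuming full phasing within each stage.

    stages: iterable of (n_paternal_phased, n_maternal_phased, n_total_dnms).
    """
    paternal = 0.0
    maternal = 0.0
    for n_pat, n_mat, n_total in stages:
        frac_paternal = n_pat / (n_pat + n_mat)   # paternal share among phased DNMs
        paternal += frac_paternal * n_total       # extrapolate to all DNMs in the stage
        maternal += (1.0 - frac_paternal) * n_total
    return paternal / maternal


if __name__ == "__main__":
    # Hypothetical counts: (paternal phased, maternal phased, all DNMs) per stage.
    early = (5, 5, 20)
    late = (40, 10, 80)
    print(alpha_total([early, late]))
```

Weighting by all DNMs in a stage, rather than only the phased ones, keeps a stage with a low phasing rate from being underrepresented in the combined ratio.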
Figure 4—figure supplement 1. The maximal value of α depends on the ratio of paternal-to-maternal generation times.
The relationship between generation time and α that is expected using the model depicted in Figure 4B, for different ratios of paternal-to-maternal generation times (‘P-to-M G times’) (blue hues). Namely, we assume that (i) each sex accrues the same expected number of Early mutations per base pair (Me = 1.66×10⁻⁹), (ii) the ratio of paternal-to-maternal mutation rate per unit of time is 4 in the Late stage (4μf = μm), and (iii) μm equals 1×10⁻⁹ per base pair per year (see Modeling the effects of germline developmental stages on α in Materials and methods for more details). The maximum generation time (x-axis) was unrealistically set to 200 years to better show the plateau in α.
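The two-stage model can be illustrated with a short sketch. Equation 1 is not reproduced in this record, so the form used here is an assumption: each parent contributes Me Early mutations plus a Late contribution equal to a yearly rate multiplied by that sex's generation time, with a Late-stage paternal-to-maternal rate ratio of 4 (μm = 4μf). Parameter defaults follow the values quoted in the caption; the function and argument names are hypothetical.

```python
M_EARLY = 1.66e-9   # Early mutations per base pair, equal in both sexes
LATE_RATIO = 4.0    # paternal-to-maternal mutation rate ratio in the Late stage


def predicted_alpha(gen_time_paternal, gen_time_maternal, mu_m=1e-9,
                    m_early=M_EARLY, late_ratio=LATE_RATIO):
    """Ratio of paternal to maternal mutations per generation under the sketch model."""
    mu_f = mu_m / late_ratio                          # maternal Late rate per year
    paternal = m_early + mu_m * gen_time_paternal     # Early + Late, father
    maternal = m_early + mu_f * gen_time_maternal     # Early + Late, mother
    return paternal / maternal


if __name__ == "__main__":
    for years in (1, 5, 30, 200):
        print(years, round(predicted_alpha(years, years), 3))
```

In this form α starts at 1 when generation times are near zero and approaches the Late-stage ratio multiplied by the paternal-to-maternal generation time ratio as generation times grow, which is the plateau the caption refers to.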
Figure 4—figure supplement 2. Ratio of crypt-to-sperm mutation rate per unit of time in four mammals.
The yearly mutation rate in colonic crypts was obtained from Cagan et al., 2022, for four species in which pedigree sequencing estimates of the mutation rate per generation are also available (Jónsson et al., 2017; Lindsay et al., 2019; Harland et al., 2017; Wang et al., 2022b). To estimate the mutation rate per year in sperm, we used the paternal-to-maternal generation time ratio in each pedigree study and assumed that: (i) each sex accrues the same expected number of Early mutations per base pair (Me = 1.66×10⁻⁹) and (ii) a ratio of paternal-to-maternal mutation rates per unit of time of 4 in the Late phase (4μf = μm) (see Modeling the effects of germline developmental stages on α in Materials and methods).
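One way to back out a yearly sperm mutation rate from a per-generation pedigree estimate under assumptions (i) and (ii) is sketched below. It additionally assumes that the per-generation rate is the average of the paternal and maternal contributions and that the Late stage spans each sex's full generation time; neither detail is stated in the caption, so this is an illustration rather than the authors' procedure. All names and input values are hypothetical.

```python
def sperm_rate_per_year(mu_generation, gen_time_paternal, gen_time_maternal,
                        m_early=1.66e-9, late_ratio=4.0):
    """Solve mu_generation = m_early + (mu_m*Gm + (mu_m/late_ratio)*Gf) / 2 for mu_m."""
    late_total = 2.0 * (mu_generation - m_early)   # summed Late mutations of both parents
    return late_total / (gen_time_paternal + gen_time_maternal / late_ratio)


def crypt_to_sperm_ratio(crypt_rate_per_year, mu_generation,
                         gen_time_paternal, gen_time_maternal):
    """Ratio of the yearly crypt mutation rate to the inferred yearly sperm rate."""
    sperm = sperm_rate_per_year(mu_generation, gen_time_paternal, gen_time_maternal)
    return crypt_rate_per_year / sperm


if __name__ == "__main__":
    # Hypothetical inputs: per-generation rate per bp and generation times in years.
    print(sperm_rate_per_year(20.41e-9, 30.0, 30.0))
    print(crypt_to_sperm_ratio(5e-9, 20.41e-9, 30.0, 30.0))
```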

References

    1. Abascal F, Harvey LMR, Mitchell E, Lawson ARJ, Lensing SV, Ellis P, Russell AJC, Alcantara RE, Baez-Ortega A, Wang Y, Kwa EJ, Lee-Six H, Cagan A, Coorens THH, Chapman MS, Olafsson S, Leonard S, Jones D, Machado HE, Davies M, Øbro NF, Mahubani KT, Allinson K, Gerstung M, Saeb-Parsy K, Kent DG, Laurenti E, Stratton MR, Rahbari R, Campbell PJ, Osborne RJ, Martincorena I. Somatic mutation landscapes at single-molecule resolution. Nature. 2021;593:405–410. doi: 10.1038/s41586-021-03477-4.
    2. Agarwal I, Przeworski M. Signatures of replication timing, recombination, and sex in the spectrum of rare variants on the human X chromosome and autosomes. PNAS. 2019;116:17916–17924. doi: 10.1073/pnas.1900714116.
    3. Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, Stratton MR. Clock-like mutational processes in human somatic cells. Nature Genetics. 2015;47:1402–1407. doi: 10.1038/ng.3441.
    4. Alexandrov LB, Kim J, Haradhvala NJ, Huang MN, Tian Ng AW, Wu Y, Boot A, Covington KR, Gordenin DA, Bergstrom EN, Islam SMA, Lopez-Bigas N, Klimczak LJ, McPherson JR, Morganella S, Sabarinathan R, Wheeler DA, Mustonen V, Getz G, Rozen SG, Stratton MR, PCAWG Mutational Signatures Working Group, PCAWG Consortium. The repertoire of mutational signatures in human cancer. Nature. 2020;578:94–101. doi: 10.1038/s41586-020-1943-3.
    5. Amster G, Sella G. Life history effects on the molecular clock of autosomes and sex chromosomes. PNAS. 2016;113:1588–1593. doi: 10.1073/pnas.1515798113.
