Cell. 2020 Jan 23;180(2):248-262.e21.
doi: 10.1016/j.cell.2019.12.015.

Widespread Transcriptional Scanning in the Testis Modulates Gene Evolution Rates

Bo Xia et al. Cell. 2020.

Abstract

The testis expresses the largest number of genes of any mammalian organ, a finding that has long puzzled molecular biologists. Our single-cell transcriptomic data of human and mouse spermatogenesis provide evidence that this widespread transcription maintains DNA sequence integrity in the male germline by correcting DNA damage through a mechanism we term transcriptional scanning. We find that genes expressed during spermatogenesis display lower mutation rates on the transcribed strand and have low diversity in the population. Moreover, this effect is fine-tuned by the level of gene expression during spermatogenesis. The unexpressed genes, which in our model do not benefit from transcriptional scanning, diverge faster over evolutionary timescales and are enriched for sensory and immune-defense functions. Collectively, we propose that transcriptional scanning shapes germline mutation signatures and modulates mutation rates in a gene-specific manner, maintaining DNA sequence integrity for the bulk of genes but allowing for faster evolution in a specific subset.

Conflict of interest statement

Competing interests

J.D.B. is a founder and Director of the following: Neochromosome, Inc., the Center of Excellence for Engineering Biology, and CDI Labs, Inc., and serves or has recently served on the Scientific Advisory Board of the following: Modern Meadow, Inc., Recombinetics, Inc., Sample6, Inc., and Sangamo, Inc. These arrangements are reviewed and managed by the committee on conflict of interest at NYULH. All other authors declare no competing interests.

Figures

Figure 1. scRNA-Seq reveals a detailed molecular map of human spermatogenesis.
(A) Schematic of developmental stages of human spermatogenesis. (B) Dimension reduction analysis (PCA and tSNE) of human testes scRNA-Seq results. Colors indicate the main spermatogenic stages and somatic cell types (see Figure S1 and SI methods). (C) PCA on the spermatogenic complement of the single-cell data. Arrows indicate the developmental trajectory predicted by the RNA velocity algorithm (La Manno et al., 2018), and large arrowheads indicate transcriptionally inactive stages during spermatogenesis. (D-E) Heatmap (D) and plots (E) of the expression patterns of all human protein-coding genes throughout spermatogenesis, grouped by gene clusters defined with the k-means method (see STAR Methods). See also Figures S1 and S2 and Tables S1, S2, and S3.
Figure 2. Widespread transcription in spermatogenic cells is associated with reduced germline mutation rates.
(A) Two possible consequences of widespread transcription in spermatogenic cells: transcription-coupled DNA damage and transcription-coupled repair. (B) Germline SNV rates in the gene body across the gene clusters, as determined in Figure 1D. (C) Germline SNV rates in the gene body of expressed and unexpressed genes across large gene families (see STAR Methods). (D) Germline SNV rates in the gene body across gene sets as determined by binarized expression (expressed versus unexpressed) in testicular germ cells and somatic cells. (E) Ratios of germline SNV rates of unexpressed genes versus expressed genes determined from diverse human organs and cell types. Points represent individual tissue samples collected by the GTEx project (GTEx Consortium, 2015). Significance in B-D was computed by the Mann-Whitney test with Bonferroni correction for multiple tests. Error bars indicate 99% confidence intervals calculated by the bootstrap method with n=10,000 (see STAR Methods; the same applies to the figures below). See also Figure S3 and Table S3.
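The bootstrap error bars mentioned in this caption can be illustrated with a minimal sketch. It assumes a percentile bootstrap on the mean of per-gene rates; the caption states only the 99% level and n=10,000, so the statistic, the percentile method, the function name `bootstrap_ci`, and the example values are all hypothetical rather than the authors' procedure.

```python
import random


def bootstrap_ci(values, n_resamples=10_000, level=0.99, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Assumption: the mean is the statistic of interest and the interval
    is read from percentiles of the resampled means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample the data with replacement, keeping the sample size.
        sample = rng.choices(values, k=n)
        means.append(sum(sample) / n)
    means.sort()
    alpha = (1.0 - level) / 2.0
    lo_idx = int(alpha * n_resamples)
    hi_idx = min(n_resamples - 1, int((1.0 - alpha) * n_resamples))
    return means[lo_idx], means[hi_idx]


# Made-up per-gene SNV rates, for illustration only.
rates = [0.012, 0.015, 0.011, 0.014, 0.013, 0.016, 0.010, 0.012]
low, high = bootstrap_ci(rates)
```

The fixed seed makes the interval reproducible; with real data the input would be the per-gene rates of one gene category.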
Figure 3. TCR-associated mutation asymmetry scores show bidirectional transcription and extended transcription signatures.
(A) Schematic of a transcribed gene with the template strand containing lower DNA damage and, consequently, a lower mutation rate. (B) Distinguishing germline mutations according to coding and template strands (see STAR Methods). (C) A-to-T transversion mutation rates of the coding and the template strands for the spermatogenic gene categories (paired-sample t-test). Dashed line indicates the average SNV rate in the unexpressed genes. (D) Asymmetry scores throughout spermatogenic gene categories (see STAR Methods). (E) Schematic of gene architecture indicating bidirectional and extended transcription. (F-H) Asymmetry scores in the upstream 5kb region (F), gene body (G), represented by intron regions, and downstream 5kb region (H) across all six mutation types (Mann-Whitney test). P-values were adjusted for multiple tests with the Bonferroni method. *, P<0.01; **, P<1.0e−6; n.s., not significant. See also Figures S4 and S5.
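The significance labels used in these captions can be sketched as follows. This is a minimal illustration, assuming the standard Bonferroni adjustment (multiply each p-value by the number of tests, capped at 1) and assuming the caption's thresholds apply to the adjusted values; the function names are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni adjustment: scale each p-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significance_label(p_adj):
    """Map an adjusted p-value to the labels given in the caption."""
    if p_adj < 1.0e-6:
        return "**"
    if p_adj < 0.01:
        return "*"
    return "n.s."


# Made-up raw p-values from three hypothetical comparisons.
raw = [0.001, 0.5, 1e-9]
labels = [significance_label(p) for p in bonferroni_adjust(raw)]
```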
Figure 4. TCR-induced mutational signatures considering sequence contexts.
(A) Human intronic germline mutation rates in the spermatogenesis-expressed genes. The mutation rates take the adjacent bases into account and distinguish the coding and template strands. (B) Human germline mutation asymmetry scores according to adjacent bases in the spermatogenesis-expressed genes. (C-D) Human asymmetry score pairs distinguished by 3’- (C) or 5’- (D) adjacent bases. For each pair of points in a given mutation type, the asymmetry scores are plotted from purine (left) to pyrimidine (right) in terms of the 3’- (C) or 5’- (D) adjacent base. (E-H) Same as shown in A-D, but for mouse germline mutations in the intron regions. Significance in C-D and G-H was computed by paired-sample t-test with Bonferroni correction for multiple tests. See also Figure S6.
Figure 5. Transcriptional scanning-induced mutation reduction is tuned by gene-expression levels.
(A) Schematic of transcriptional scanning of DNA damage in male germ cells. (B) Genes were binned into nine expression level groups, from unexpressed (Unexp) to highly expressed (High-exp) (see Table S6 and STAR Methods). (C) Intronic SNV rates across gene expression level categories (Mann-Whitney test). (D) Intronic SNV rate distributions of the indicated germline mutation types across gene expression level categories, distinguished by coding and template strands (paired-sample t-test). (E) Distribution of asymmetry scores between coding and template strands for the mutation types indicated in (D) (Mann-Whitney test). (F) Expression level tuning of germline mutation rates following additive contributions by transcription-coupled repair (TCR-reduced) and transcription-coupled damage-induced (TCD-induced) effects. The observed germline mutation rate distribution represents average mutation rates across 100 evenly-binned expression levels, with background shadow indicating 99% confidence intervals. P-values were adjusted for multiple tests with the Bonferroni method. *, P<0.01; **, P<1.0e−6; n.s., not significant. See also Table S6.
Figure 6. De novo germline mutations exhibit spermatogenesis expression-dependent mutational signatures.
(A) DNM rates across the spermatogenesis gene clusters, as determined in Figure 1D. (B) DNM rates across spermatogenesis gene expression level categories, as determined in Figure 5B. (C-D) DNM rates (C) and asymmetry scores (D) with respect to local sequence contexts and coding/template strands in the spermatogenesis-expressed genes. (E-F) Correlations between the SNV rates and scaled DNM rates on the coding strand (E) and on the template strand (F), respectively. (G) Correlation between the asymmetry scores defined from SNVs and from DNMs. Each dot in E-G represents a mutation subtype defined by the 5’- and 3’-adjacent bases of the reference base. We excluded the dots representing C-to-T mutation rates in the CpG contexts in (E) and (F), though including such outlier dots would further increase the correlation coefficients. Significance in A-B was computed by the Mann-Whitney test with Bonferroni correction for multiple tests.
Figure 7. Evolutionary consequences of transcriptional scanning in male germ cells.
(A) Gene ontology terms enriched in the set of genes unexpressed during spermatogenesis. (B-C) DNA divergence levels (B) and dS scores (C) of human genes with their orthologs in the indicated apes, according to gene expression-pattern clusters. Gray dashed box highlights the male germ cell-unexpressed gene cluster. (D) Schematic of transcriptional scanning in biasing germline mutation rates and its evolutionary impact. (E) A revised model for generating biased DNA sequence variation and gene evolution. See also Figure S7 and Table S7.
