Science. 2021 Feb 19;371(6531):eabc6405.
doi: 10.1126/science.abc6405.

Recurrent evolution of vertebrate transcription factors by transposase capture

Rachel L Cosby et al. Science. 2021.

Abstract

Genes with novel cellular functions may evolve through exon shuffling, which can assemble novel protein architectures. Here, we show that DNA transposons provide a recurrent supply of materials to assemble protein-coding genes through exon shuffling. We find that transposase domains have been captured, primarily via alternative splicing, to form fusion proteins at least 94 times independently over the course of ~350 million years of tetrapod evolution. We find an excess of transposase DNA-binding domains fused to host regulatory domains, especially the Krüppel-associated box (KRAB) domain, and identify four independently evolved KRAB-transposase fusion proteins repressing gene expression in a sequence-specific fashion. The bat-specific KRABINER fusion protein binds its cognate transposons genome-wide and controls a network of genes and cis-regulatory elements. These results illustrate how a transcription factor and its binding sites can emerge.

Conflict of interest statement

Competing Interests: The authors declare no competing interests.

Figures

Fig. 1. Gene birth by transposase capture in tetrapods.
Tetrapod phylogenetic tree with boxes representing HTF fusion genes. Color indicates transposase superfamily assimilated. OWM=Old World monkeys; NWM=New World monkeys; GM=Gray mouse; H=Hystricoid; C=Castorid; M=Muroid; Miniopt=Miniopterid; Vesper=Vespertilionid; S.S=soft-shelled; B=bearded dragon; G=Green; B=Burmese python; L=Lacertid; J=Japanese; T=Tropical; A=African; M=Mountain; LCA=last common ancestor; MY=million years.
Fig. 2. Transposase capture by alternative splicing.
A) ZNF112/KRABINER locus in vespertilionid bats. B) Steps required for KRABINER birth. C) Age of fusion genes with (green) or without (gray) evidence for alternative splicing. Fusion age (bottom) is determined by the midpoint of the age range for each fusion as described in Table S3; top shows a qualitative illustration of host transcript loss over time. D) Summary of transposon splice site usage for 9 HTFs, with canonical mammalian splice sites shown as a sequence logo. Red denotes nucleotides in the splice site that diverge from the transposon consensus sequence. SA=splice acceptor; LCA=last common ancestor. ** p<0.01, 2-sample Wilcoxon test.
Fig. 3. Biochemical activities of host-transposase fusion proteins.
A) Diverse host domains are fused to transposases. X-axis specifies the number of HTF genes a given domain is present in; some fusions contain more than one domain. Inset shows representative domain architecture schematic for select host-transposase fusions. B) KRAB-transposase fusions repress gene expression in a sequence-specific manner. C) KRABINER requires both its KRAB and DBD domains to repress gene expression. Y axes in B-C boxplots correspond to mean luminescence relative to the KTF (−) state for each comparison (n≥15). KTF=KRAB-transposase fusion; TIR=terminal inverted repeat; filled triangle = consensus TIR; interrupted triangle = scrambled TIR; +/− = presence/absence, respectively; *** adj. p<0.001, 2-sample Wilcoxon test with Holm-Bonferroni correction.
Fig. 4. KRABINER regulates transcription of genes and TREs in bat cells.
A) Strategy to generate KRABINER KO and rescue lines. TRE=tet responsive element; CMV=cytomegalovirus. B–C) Summary of transcriptional changes of genes and TREs, respectively, upon loss and restoration of KRABINER. KRABINER regulated genes (up or down) change reciprocally between KO vs WT and WT KRABINER rescue vs KO comparisons. p values calculated via a right-tailed hypergeometric test. DE 1 condition refers to differential transcription in either the KO vs WT or WT KRABINER vs KO comparison. Non-specific refers to a gene rescued by WT KRABINER and one or both mutDBD and mutKRAB variants. Unchanged refers to genes/TREs with adj. p>0.05, Wald test.
Fig. 5. KRABINER binds to mariner TIRs in bat cells.
A) Heatmaps summarizing merged, library-size and input-normalized ChIP-seq coverage of each KRABINER variant centered on the summit of WT (top), WTXmutKRAB (middle), and WTXmutDBD (bottom) peak sets. B) Metaplot summarizing normalized ChIP-seq coverage of each KRABINER variant over all genomic Mlmar1 elements (top). The top enriched motif in the WT and WT & mutKRAB peak sets is identical to the predicted bipartite binding motif within the Mlmar1 mariner TIR (bottom) (HOMER). C) Enrichment of transposon families in WT only, WTXmutKRAB, and WTXmutDBD peak sets. Observed = number of overlaps between a TE family and a given peak set. Expected = number of overlaps expected between a TE family and a given peak set after shuffling TE locations 1000 times. p values determined using the binomial distribution, n=1000 shuffles. WT=purple; mutDBD=pink; mutKRAB=green. WT = wild-type; DBD = DNA binding domain; ORF = open reading frame; TIR = terminal inverted repeat.
Fig. 6. KRABINER regulates a network of genes and TREs in bat cells.
A) MA plot summarizing changes in TRE transcription upon over-expression of WT KRABINER. Non-specific (black) refers to changes in TRE transcription that are shared between over-expression of WT KRABINER and one or both mutant KRABINER variants. Unchanged (gray) refers to TREs with adj. p>0.05, Wald test. B) Proposed model for KRABINER’s function as a transcription factor in bats. KRABINER directly binds to mariner TIRs within the genome and leads to direct downregulation of a subset of TREs. KRABINER also binds to other genomic regions, and indirectly regulates a number of genes and TREs. Resc. = rescue; OE = overexpression; TIR = terminal inverted repeats; Tpase = transposase.
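
Two of the statistical procedures named in the legends above, the Holm-Bonferroni correction (Fig. 3) and the right-tailed hypergeometric test (Fig. 4), can be illustrated with a short Python sketch. This is not the authors' analysis code, which is not shown on this page; the function names, argument names, and example numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_right_tail(overlap, set_a, set_b, universe):
    """Right-tailed hypergeometric p-value, P(X >= overlap).

    Hypothetical framing: `universe` items in total, `set_a` of them in one
    category, `set_b` of them in another, and `overlap` in both.
    """
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, i) * comb(universe - set_a, set_b - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / total


def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Scale the rank-th smallest p-value by the number of hypotheses not
        # yet processed, cap at 1, and keep the sequence non-decreasing.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[idx]))
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Made-up counts: 5 items in each category, all 5 shared, 10 items total.
    print(hypergeom_right_tail(5, 5, 5, 10))
    # Made-up raw p-values for three comparisons.
    print(holm_adjust([0.01, 0.04, 0.03]))
```

For a gene-overlap question of the kind described in Fig. 4, one plausible mapping (an assumption, since the legend does not spell it out) is that `set_a` and `set_b` are the genes changing in each of the two comparisons and `universe` is the number of genes tested.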

Comment in

  • New genes from borrowed parts.
    Wacholder A, Carvunis AR. Science. 2021 Feb 19;371(6531):779-780. doi: 10.1126/science.abf8493. PMID: 33602841.
  • Capturing transposases for new proteins.
    Koch L. Nat Rev Genet. 2021 May;22(5):266-267. doi: 10.1038/s41576-021-00347-7. PMID: 33658661.

Publication types