Genome Biol Evol. 2015 Jul 30;7(8):2289-309.
doi: 10.1093/gbe/evv141.

KLF/SP Transcription Factor Family Evolution: Expansion, Diversification, and Innovation in Eukaryotes

Jason S Presnell et al. Genome Biol Evol. 2015.

Abstract

The Krüppel-like factor and specificity protein (KLF/SP) genes play key roles in critical biological processes, including stem cell maintenance, cell proliferation, embryonic development, tissue differentiation, and metabolism, and their dysregulation has been implicated in a number of human diseases and cancers. Although many KLF/SP genes have been characterized in a handful of bilaterian lineages, little is known about the KLF/SP gene family in nonbilaterians, and virtually nothing is known outside the metazoans. Here, we analyze and discuss the origins and evolutionary history of the KLF/SP transcription factor family and its associated transactivation/repression domains. We have identified and characterized the complete KLF/SP gene complement from the genomes of 48 species spanning the Eukarya. We have also examined the phylogenetic distribution of transactivation/repression domains associated with this gene family. We report that the origin of the KLF/SP gene family predates the divergence of the Metazoa. Furthermore, the expansion of the KLF/SP gene family is paralleled by diversification of transactivation domains, both through acquisition of pre-existing ancient domains and through the appearance of novel domains exclusive to this gene family, and is strongly associated with the expansion of cell type complexity.

Keywords: C2H2 zinc fingers; domain architecture; domain co-occurrence network; domain evolution; domain shuffling; low-complexity regions.

Figures

Fig. 1.
Distribution of C2H2 zinc finger proteins, KLF-DBD containing proteins, and KLF/SP proteins in representative Eukarya taxa. Rows indicate representative genomes searched. Columns indicate the total number of protein sequences that contain at least one C2H2 zinc finger using the Pfam PF00096 HMM model, the total number of protein sequences that contain the archetypical KLF-DBD, the total number of bona fide KLF sequences recovered, and the total number of SP sequences recovered. Phylogeny is based on Adl et al. (2012), Derelle and Lang (2012), Dunn et al. (2008), Ryan et al. (2013), and Sebé-Pedrós et al. (2013).
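The first column of this figure counts proteins with at least one C2H2 zinc finger. As a rough illustration of that kind of per-genome tally, the sketch below uses a simplified regular expression for the C2H2 consensus (C-x(2,4)-C-x(12)-H-x(3,5)-H) in place of the Pfam PF00096 profile HMM used in the paper. This is a deliberately cruder substitute, so its counts would not match an HMM search. The function name and the toy sequences are hypothetical.

```python
import re

# Simplified C2H2 consensus: C-x(2,4)-C-x(12)-H-x(3,5)-H.
# Assumption: this regex stands in for the Pfam PF00096 HMM; it is far less
# sensitive and less specific than a profile HMM search.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")


def count_c2h2_proteins(proteins):
    """Count proteins whose sequence contains at least one C2H2-like match.

    proteins: dict mapping a protein identifier to its amino acid sequence.
    """
    return sum(1 for seq in proteins.values() if C2H2_PATTERN.search(seq))


# Synthetic toy sequences (not real proteins), built to match or miss the pattern.
toy_proteome = {
    "synthetic_finger": "MK" + "C" + "AA" + "C" + "G" * 12 + "H" + "AAA" + "H" + "LE",
    "no_finger": "MKAAGGLLEEAAGG",
}

n_hits = count_c2h2_proteins(toy_proteome)
```

Each protein is counted once however many fingers it carries, which mirrors the "at least one" criterion in the caption.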
Fig. 2.
Combined gene tree estimates for the concatenated KLF/SP data set using Bayesian criterion (MrBayes) and ML criterion (RAxML). Gray node labels indicate congruent topology with BPP support = 84%. Black node labels indicate congruent topology with BPP support ≥90%. Clades collapsed to triangles indicate congruent topologies with BPP support ≥90%. The single highly divergent ctenophore MleKLFX sequence clusters with nonfilozoan KLF-DBD presumably due to long-branch attraction. Bayesian and ML trees with support values and branch lengths are available in supplementary figs. S2 and S3, Supplementary Material online.
Fig. 3.
Phylogenetic distribution of transactivation/repression domains and LCRs associated with KLF/SP proteins. The + indicates the presence of the corresponding domain or LCR in at least one KLF/SP protein in the indicated taxa. Only filozoan lineages containing bona fide KLF/SP proteins are shown. An asterisk indicates that RNA-seq data were used for that species. Phylogeny is based on Dunn et al. (2008), Ryan et al. (2013), and Sebé-Pedrós et al. (2013).
Fig. 4.
KLF/SP protein domain co-occurrence networks. In all networks, each circle represents a transactivation/repression domain or an LCR. A line connecting two domains indicates a co-occurrence of those two domains. Domains are arranged in approximately the same 5′–3′ spatial orientation as they appear encoded in KLF/SP sequences. (A) General network diagram showing connectivity and unidirectional spatial relationships between transactivation domains among filozoan KLF/SPs. Blue arrows represent connectivity upstream of the KLF-DBD; the gold arrow represents connectivity downstream of the KLF-DBD. (B–I) KLF/SP co-occurrence networks from different taxonomic groups. Circle size indicates the relative frequency of occurrence in the network, with the KLF-DBD always representing 100%. Circle color follows the same convention as seen in figure 3. Repeated domains were counted as occurring only once. Lines connecting circles indicate the presence of that specific domain pair co-occurrence in at least one KLF/SP. Line width indicates the frequency of domain pair co-occurrence. Only LCR domains which are found N-terminal of the KLF-DBD are represented in these networks (supplementary fig. S4, Supplementary Material online). (B) Complete filozoan KLF/SP network. (C) Representative unicellular KLF/SP network. (D) KLF/SP network from nonbilaterian metazoans. (E) Invertebrate bilaterian KLF/SP network. (F) Vertebrate KLF/SP network. (G, H) Representative ctenophoran and poriferan KLF/SP networks for comparison with each other and with the network in D. (I) Ciona KLF/SP network for comparison with the networks in E and F. (J, K) Co-occurrence network maps for the KLF subfamily and SP subfamily mapped onto the filozoan phylogeny (Dunn et al. 2008; Ryan et al. 2013) for evolutionary comparison. Each network represents a composite for the taxonomic group indicated. (J) Co-occurrence maps for domains found in the KLF subfamily. (K) Co-occurrence maps for domains found in the SP subfamily. The unicellular filozoan genomes and ctenophore genomes do not contain SP genes.
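The counting rules stated in this caption (repeated domains counted once, circle size as relative frequency, line width as pair frequency) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline. The function name and the toy domain lists are hypothetical, and dividing pair counts by the total number of proteins is an assumption, since the caption does not say how line widths were normalized.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(proteins):
    """Compute node and edge frequencies for a domain co-occurrence network.

    proteins: list of domain-name lists, one list per protein.
    Returns (node_freq, edge_freq), both as fractions of all proteins.
    """
    n = len(proteins)
    node_counts = Counter()
    edge_counts = Counter()
    for domains in proteins:
        # Repeated domains within one protein count only once.
        present = sorted(set(domains))
        node_counts.update(present)
        # Every unordered pair of distinct domains in the protein is one edge.
        edge_counts.update(combinations(present, 2))
    # Assumption: both frequencies are normalized by the number of proteins.
    node_freq = {d: c / n for d, c in node_counts.items()}
    edge_freq = {pair: c / n for pair, c in edge_counts.items()}
    return node_freq, edge_freq


# Hypothetical toy input; domain names come from the captions, the
# combinations are invented for illustration only.
toy = [
    ["KLF-DBD", "PVDLS"],
    ["KLF-DBD", "SID", "SID"],
    ["KLF-DBD", "Btd box", "SP box"],
    ["KLF-DBD"],
]

node_freq, edge_freq = cooccurrence(toy)
```

Because every toy protein carries the KLF-DBD, its node frequency is 1.0, matching the caption's convention that the KLF-DBD always represents 100%. Edge keys are alphabetically sorted tuples, so each pair is stored once regardless of domain order in the sequence.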
Fig. 5.
Phylogenetic distribution of explicit domain architectures represented among KLF/SP proteins. The key at lower left identifies LCRs and transactivation/repression domains used to determine domain architectures. The protein schematics along lower right represent the particular combinations of domains and LCRs with the KLF-DBD that define each specific KLF/SP protein architecture. All groups, except for the ancient unicellular KLF architecture recovered, are named according to established human KLF/SP paralogy groups that conform to each specific architecture. The three C-terminal zinc fingers of the KLF-DBD are indicated with grey boxes labeled zf1, zf2, and zf3. Architecture schematics are not to scale.
Fig. 6.
Inferred relationships between key events during the evolution and expansion of the KLF/SP gene family. Symbol key is at upper left. Colored rectangles represent the origin of particular transactivation/repression domains or LCRs co-occurring with the KLF-DBD (fig. 4). Yellow hexagons represent the origin of specific KLF/SP domain architectures (fig. 5). A black X over a hexagon represents the loss of a specific domain architecture. Colored triangles represent the presence of specific transactivation domain motifs within whole eukaryote genomes to the exclusion of the KLF/SP gene family. (A) We infer the origin of the KLF-DBD in the opisthokont stem lineage prior to the divergence of the Holomycota. However, bona fide KLF gene architectures do not appear until the divergence of the filozoan lineage (KLF origin). The ancient unicellular KLF domain architecture is not recovered in metazoan lineages. The ancient PVDLS, SID, Btd box, and R3 domains were recovered, to the exclusion of KLF/SPs, in all eukaryote genomes searched. Notably, the Btd box was not recovered in Saccharomyces and Encephalitozoon fungal genomes. Our analysis suggests that the origin of the SP subfamily is in the metazoan stem lineage prior to the divergence of the poriferans; it is not present in the ctenophorans. The SP box motif only appears in SP genes in poriferans and is not found in additional genes until the divergence of Trichoplax. The R2 repressor domain appears to be a de novo innovation restricted to KLF genes in the vertebrate stem lineage, contributing to the KLF10/11 architecture class. Composite domain co-occurrence maps for each taxonomic group are shown to the right of the tree. (B–D) Representative examples of putative domain shuffling events during the evolution and expansion of the KLF/SP gene family. (B) An ancient Btd box and a metazoan KLF gene may have contributed to the origin of the SP gene subfamily early in metazoan evolution. (C) An ancient SID likely combined with a pre-existing ancestral KLF gene to form the KLF9/13 group, also early in metazoan evolution. (D) An ancient PVDLS domain combined with a pre-existing ancestral KLF gene to form the KLF3/8/12 group. We infer an independent convergent acquisition of the PVDLS domain within a KLF gene in the Protostomia lineage (see Discussion). Domain icon colors are the same as figure 5.

References

    Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104–2105.
    Adams MD, et al. 2000. The genome sequence of Drosophila melanogaster. Science 287:2185–2195.
    Adl SM, et al. 2012. The revised classification of eukaryotes. J Eukaryot Microbiol. 59:429–514.
    Amemiya CT, et al. 2013. The African coelacanth genome provides insights into tetrapod evolution. Nature 496:311–316.
    Aparicio S, et al. 2002. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science 297:1301–1310.
