Eukaryot Cell. 2013 Jun;12(6):794-803.
doi: 10.1128/EC.00001-13. Epub 2013 Mar 29.

Detection and characterization of megasatellites in orthologous and nonorthologous genes of 21 fungal genomes

Fredj Tekaia et al. Eukaryot Cell. 2013 Jun.

Abstract

Megasatellites are large DNA tandem repeats, originally described in Candida glabrata, in protein-coding genes. Most of the genes in which megasatellites are found are of unknown function. In this work, we extended the search for megasatellites to 20 additional completely sequenced fungal genomes and extracted 216 megasatellites in 203 out of 142,121 genes, corresponding to the most exhaustive description of such genetic elements available today. We show that half of the megasatellites detected encode threonine-rich peptides predicted to be intrinsically disordered, suggesting that they may interact with several partners or serve as flexible linkers. Megasatellite motifs were clustered into several families. Their distribution in fungal genes shows that different motifs are found in orthologous genes and similar motifs are found in unrelated genes, suggesting that megasatellite formation or spreading does not necessarily track the evolution of their host genes. Altogether, these results suggest that megasatellites are created and lost during evolution of fungal genomes, probably sharing similar functions, although their primary sequences are not necessarily conserved.
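
To make the notion of a tandem repeat concrete, the following minimal sketch scans a sequence for runs of an exactly repeated motif. It is an illustration only and not the detection procedure used in the paper, which is not described in this abstract. The function name `find_tandem_repeats`, its parameters, and the toy peptide are all hypothetical, and real megasatellite copies are typically degenerate rather than identical, so an exact-match scan like this one would miss most of them.

```python
def find_tandem_repeats(seq, min_period, min_copies):
    """Return (start, period, copies) for runs of exact tandem repeats.

    Hypothetical helper for illustration: only perfect copies of a motif
    are counted, and thresholds are supplied by the caller.
    """
    if min_period < 1 or min_copies < 2:
        raise ValueError("min_period must be >= 1 and min_copies >= 2")
    hits = []
    n = len(seq)
    # A motif of length `period` repeated `min_copies` times must fit in seq.
    for period in range(min_period, n // min_copies + 1):
        i = 0
        while i + period * min_copies <= n:
            motif = seq[i:i + period]
            copies = 1
            # Count how many consecutive windows equal the first one.
            while seq[i + copies * period:i + (copies + 1) * period] == motif:
                copies += 1
            if copies >= min_copies:
                hits.append((i, period, copies))
                i += copies * period  # jump past the reported run
            else:
                i += 1
    return hits


# Toy threonine-rich peptide (made up): 4 copies of "TSAPT" with short flanks.
toy = "MK" + "TSAPT" * 4 + "GG"
print(find_tandem_repeats(toy, min_period=5, min_copies=3))
```

A tool for real data would need to tolerate substitutions and indels between copies, for example by scoring each window against a consensus motif instead of testing for equality.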

Figures

Fig 1
Distribution of megasatellites in the 21 genomes studied. Left, tree topology (68, 69). Branch lengths are arbitrary. Motif families are represented by a color code. Motifs drawn on the tree indicate their proposed time of appearance during evolution, under a parsimony hypothesis. Right, protein clusters containing two or more proteins are represented by vertical columns. Nonunique motifs are indicated by their number in a black box (see Table S1 in the supplemental material). Unique motifs are shown in gray. P2.n, all clusters containing only two proteins.
Fig 2
Distribution of megasatellites according to motif size. Upper panel, total number of megasatellites for each motif size. Lower panel, total number of megasatellites in each species, classified by families. The color code is the same in both panels.
Fig 3
Correspondence analysis showing the distribution of megasatellites (blue dots) according to the 20 amino acids on the first factorial plane. F1 and F2 are the first and second factorial axes and represent, respectively, 27% and 11% of the total information included in the analyzed data table (observed megasatellites versus their amino acid composition).
Fig 4
Example of similar megasatellites in two nonhomologous genes. Alignment of KLTH0C00440g and KLLA0A11935g translation products, two proteins belonging to two different clusters (P6.3 and P8.1, respectively) and containing the same peptidic motif (FLO, motif cluster M17.1 [see Table S1 in the supplemental material]), is shown. The peptidic motif is shown in red, along with the number of repeats in each protein. The N-terminal and C-terminal parts of both proteins exhibit little identity (12.9% and 14.2%, respectively), with most of the identical amino acids being serine and threonine residues, due to the compositional bias of both proteins. In comparison, both FLO motifs are very similar, despite a comparable compositional bias.
Fig 5
Example of different megasatellites in two homologous genes. Alignment of PODANSg8665 and AN1071 translation products, two homologous proteins (P18.1) containing different megasatellite motifs, is shown. Both proteins show very similar N-terminal parts (42.6% identity) followed by less conserved regions (12.3% identity) containing the repeated peptides. There is no homology between the peptidic motifs.
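
The identity percentages quoted in the captions of Fig 4 and Fig 5 compare aligned regions of two proteins. As a rough sketch of how such a figure can be computed from a pairwise alignment, the function below counts identical residues over the columns where neither sequence has a gap. The function name `percent_identity`, the gap character, and the choice of denominator are assumptions; the captions do not state which convention the authors used, and other conventions (for example dividing by the full alignment length) give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns with a gap in either sequence are ignored, so the
    denominator is the number of residue-to-residue columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Made-up aligned fragments: 4 identical residues over 5 ungapped columns.
print(percent_identity("MKT-SA", "MRTTSA"))  # 80.0
```

With strongly biased compositions such as the serine- and threonine-rich regions described in Fig 4, some identical columns are expected by chance, which is why low values like those reported for the flanking regions are not read as evidence of homology.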

References

    1. Richard GF, Kerrest A, Dujon B. 2008. Comparative genomics and molecular dynamics of DNA repeats in eukaryotes. Microbiol. Mol. Biol. Rev. 72:686–727
    2. Thierry A, Bouchier C, Dujon B, Richard GF. 2008. Megasatellites: a peculiar class of giant minisatellites in genes involved in cell adhesion and pathogenicity in Candida glabrata. Nucleic Acids Res. 36:5970–5982
    3. Vergnaud G, Denoeud F. 2000. Minisatellites: mutability and genome architecture. Genome Res. 10:899–907
    4. Thierry A, Dujon B, Richard GF. 2010. Megasatellites: a new class of large tandem repeats discovered in the pathogenic yeast Candida glabrata. Cell. Mol. Life Sci. 67:671–676
    5. Verstrepen KJ, Jansen A, Lewitter F, Fink GR. 2005. Intragenic tandem repeats generate functional variability. Nat. Genet. 37:986–990
