Front Microbiol. 2018 Oct 4:9:2367.
doi: 10.3389/fmicb.2018.02367. eCollection 2018.

The Landscape of Repetitive Elements in the Refined Genome of Chilli Anthracnose Fungus Colletotrichum truncatum

Soumya Rao et al. Front Microbiol. 2018.

Abstract

The ascomycete fungus Colletotrichum truncatum is a major phytopathogen with a broad host range that causes anthracnose disease of chilli. Genome sequencing of this fungus led to the discovery of functional categories of genes that may play important roles in fungal pathogenicity. However, gaps in the C. truncatum draft assembly prevented accurate prediction of repetitive elements, which are key players in determining genome architecture and in driving evolution and host adaptation. We re-sequenced its genome using single-molecule real-time (SMRT) sequencing technology to obtain a refined assembly with fewer and smaller gaps and ambiguities. This enabled us to study its genome architecture by characterising repetitive sequences such as transposable elements (TEs) and simple sequence repeats (SSRs), which constituted 4.9% and 0.38% of the assembled genome, respectively. Comparative analysis among different Colletotrichum species revealed extensive repeat-rich regions, dominated by the Gypsy superfamily of long terminal repeats (LTRs), and differences in the SSR composition of their genomes. Our study revealed a recent burst of LTR amplification in C. truncatum, C. higginsianum, and C. scovillei. TEs in C. truncatum were significantly associated with the secretome, effectors, and genes in secondary metabolism clusters. Some TE families in C. truncatum showed cytosine-to-thymine transitions indicative of repeat-induced point mutation (RIP). C. orbiculare and C. graminicola showed strong signatures of RIP across their genomes and "two-speed" genomes with extensive AT-rich, gene-sparse regions. Comparative genomic analyses of Colletotrichum species provided insight into species-specific SSR profiles. Analysis of SSRs in the coding and non-coding regions of the genome revealed the composition of trinucleotide repeat motifs in exons, which have the potential to alter the structure of the translated protein through amino acid repeats.
This is the first genome-wide study of TEs and SSRs in C. truncatum, together with their comparative analysis in six other Colletotrichum species. It should serve as a useful resource for future research into the potential role of TEs in genome expansion and evolution of Colletotrichum fungi, and for the development of SSR-based molecular markers for population genomic studies.

Keywords: Colletotrichum truncatum; comparative genomics; repetitive DNA sequences; simple sequence repeats (SSRs); transposable elements (TEs); whole genome sequence.

Figures

FIGURE 1
The percentage of repetitive elements and TE families in the total repeat component of the Colletotrichum truncatum genome, as identified by the RepeatMasker software. LTRs occupied the largest fraction of TEs, while non-LTR and DNA elements formed the smallest. Other repetitive elements included SSRs, satellites, rDNA repeats, etc.
FIGURE 2
The estimated time of insertion of LTRs in the genomes of Colletotrichum species. It was calculated based on the sequence divergence between 5′ and 3′ LTRs of the complete elements. MYA, million years ago.
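The caption above says only that insertion times were derived from the divergence between the 5′ and 3′ LTRs of complete elements. A minimal sketch of one common way to do this follows, assuming the Kimura 2-parameter distance K and the relation T = K / (2r). The distance model, the function names, and the substitution rate r are all assumptions for illustration; none of them is stated in this record.

```python
import math


def k2p_distance(ltr5: str, ltr3: str) -> float:
    """Kimura 2-parameter distance between two aligned, equal-length LTRs."""
    purines = {"A", "G"}
    transitions = transversions = compared = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip alignment gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if (a in purines) == (b in purines):
            transitions += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def insertion_time_mya(distance: float, rate_per_site_per_year: float) -> float:
    """T = K / (2r), in million years. The rate r is a user-supplied assumption."""
    return distance / (2 * rate_per_site_per_year) / 1e6
```

The factor of 2 reflects that both LTRs of an element are identical at insertion and then accumulate substitutions independently. The resulting ages scale inversely with the chosen rate, so they are only as reliable as that assumption.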
FIGURE 3
The association between TEs and three functionally relevant gene categories, based on a permutation test using the RegioneR R package. Putative effectors (A), secreted proteins (B), and genes in secondary metabolite clusters (C) were significantly closer to TEs than random genes in the genome. A set of 1,000 other genes (D) served as a negative control for the test and was not significantly associated with TEs.
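The idea behind such a permutation test can be sketched in a few lines. The example below is not the RegioneR implementation: it is a simplified illustration that assumes a single contig, represents each gene and TE by one coordinate, and uses hypothetical function names. It compares the mean distance from a gene set to its nearest TE with the same statistic for random gene sets of equal size.

```python
import bisect
import random


def nearest_distance(pos, sorted_te_positions):
    """Distance from one position to the closest TE (TE list must be sorted, non-empty)."""
    i = bisect.bisect_left(sorted_te_positions, pos)
    neighbours = sorted_te_positions[max(i - 1, 0): i + 1]
    return min(abs(pos - t) for t in neighbours)


def permutation_p_value(test_genes, all_genes, te_positions, n_perm=1000, seed=0):
    """Return (observed mean distance, one-sided empirical p-value)."""
    tes = sorted(te_positions)

    def mean_distance(genes):
        return sum(nearest_distance(g, tes) for g in genes) / len(genes)

    observed = mean_distance(test_genes)
    rng = random.Random(seed)
    # Count random gene sets that are at least as close to TEs as the test set.
    as_close = sum(
        mean_distance(rng.sample(all_genes, len(test_genes))) <= observed
        for _ in range(n_perm)
    )
    return observed, (as_close + 1) / (n_perm + 1)
```

A small p-value means that few random gene sets lie as close to TEs as the set of interest. A real analysis would work with genomic intervals across many contigs rather than single coordinates.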
FIGURE 4
The GC-content plots of seven Colletotrichum species (A–G), showing diverse peak heights, shapes, and spacing. The GC cut-offs chosen by OcculterCut and used to classify genome segments into distinct AT-rich and GC-equilibrated region types are shown by vertical blue lines. C. orbiculare, C. graminicola, and C. chlorophyti had bimodal genomes with strong GC-bias, while the rest of the species showed very subtle signatures of bimodality.
FIGURE 5
Composition of SSRs in seven Colletotrichum species, showing the distribution of SSRs in genomes in terms of (A) their number, (B) relative density (size of each SSR type in bp/Mb of genome), (C) relative abundance (number of SSRs/Mb of genome), and (D) the number of SSRs in coding (exons) and non-coding regions (introns, intergenic regions). Ascochyta rabiei was taken as a reference for accuracy of SSR detection. In all the species, mononucleotide repeats were the most abundant SSR type, except for C. orbiculare, in which tri- and di-nucleotide repeats were more predominant. SSRs were mainly concentrated in intergenic regions. Exons had a high proportion of trinucleotide repeats in four species, whereas mononucleotide repeats outnumbered trinucleotide repeats in the coding regions of C. truncatum.
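The two normalised measures defined in panels (B) and (C) can be illustrated with a small sketch. The regex-based SSR finder, the minimum repeat thresholds in `MIN_REPEATS`, and the function names below are assumptions chosen for illustration; the detection tool and thresholds used in the study are not given in this record.

```python
import re
from collections import defaultdict

# Assumed minimum number of repeat units per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return (unit_length, motif, start, length_bp) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((unit_len, motif, m.start(), m.end() - m.start()))
    return hits


def summarise(hits, genome_bp):
    """Per motif length: count, relative abundance (SSRs/Mb), relative density (bp/Mb)."""
    mb = genome_bp / 1e6
    count, total_bp = defaultdict(int), defaultdict(int)
    for unit_len, _motif, _start, length in hits:
        count[unit_len] += 1
        total_bp[unit_len] += length
    return {u: {"count": count[u],
                "abundance_per_mb": count[u] / mb,
                "density_bp_per_mb": total_bp[u] / mb}
            for u in count}
```

Normalising by genome size in Mb is what makes the counts comparable across species with different assembly sizes.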
FIGURE 6
The common repeat motifs occurring at high frequencies in the genomes of the seven Colletotrichum species analysed. The mononucleotide repeats A/T and C/G were the most frequent motifs in all the species except C. orbiculare, in which the dinucleotide motif AT was the most frequent.
FIGURE 7
The abundance of each amino acid encoded by trinucleotide repeat motifs in exons of five Colletotrichum species. The most abundant motifs encoded alanine or arginine in all the species. Ala, alanine; Arg, arginine; Asn, asparagine; Asp, aspartate; Cys, cysteine; Gln, glutamine; Glu, glutamate; Gly, glycine; His, histidine; Ile, isoleucine; Leu, leucine; Lys, lysine; Met, methionine; Phe, phenylalanine; Pro, proline; Ser, serine; Thr, threonine; Trp, tryptophan; Tyr, tyrosine; Val, valine.
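As a rough illustration of how trinucleotide motifs map to amino acids, the sketch below builds the standard genetic code and tallies the residue encoded by each motif. It assumes the motifs are already given in the reading frame of the exon, which matters because a repeat such as (CAG)n encodes a different residue in each frame. The function name is hypothetical, and the output uses one-letter codes rather than the three-letter abbreviations of the caption.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def encoded_amino_acids(in_frame_motifs):
    """Count the amino acid encoded by each in-frame trinucleotide repeat motif."""
    return Counter(CODON_TABLE[motif.upper()] for motif in in_frame_motifs)
```

For example, the in-frame motifs GCC and GCT both count towards alanine (A), and CGC counts towards arginine (R).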
