BMC Genomics. 2017 Aug 29;18(1):667.
doi: 10.1186/s12864-017-4083-x.

Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters

Jean-Félix Dallery et al. BMC Genomics. 2017.

Abstract

Background: The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture.

Results: Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology which, in combination with optical map data, provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications.

Conclusion: The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.

Keywords: Colletotrichum higginsianum; Fungal genome; SMRT sequencing; accessory chromosomes; optical map; secondary metabolism genes; segmental duplication; subtelomeres; transposable elements.
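
The Results state that TEs occupy 7% of the genome by length. As a rough illustration of how such a coverage-by-length figure can be computed, the sketch below merges overlapping annotation intervals per chromosome and divides the covered bases by the total genome size. It is not the authors' pipeline: the function names, the 0-based half-open interval convention, and the toy coordinates are all assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical interval type: (chromosome, start, end), 0-based, half-open.
Interval = Tuple[str, int, int]


def merged_length(intervals: Iterable[Interval]) -> int:
    """Return the number of bases covered, counting overlaps only once."""
    total = 0
    cur_chrom, cur_start, cur_end = None, 0, 0
    # Sorting by (chromosome, start) lets us merge in a single pass.
    for chrom, start, end in sorted(intervals):
        if chrom == cur_chrom and start <= cur_end:
            # Overlapping or adjacent: extend the current merged interval.
            cur_end = max(cur_end, end)
        else:
            # Close the previous merged interval and open a new one.
            total += cur_end - cur_start
            cur_chrom, cur_start, cur_end = chrom, start, end
    total += cur_end - cur_start
    return total


def te_fraction(te_intervals: Iterable[Interval],
                chrom_lengths: Dict[str, int]) -> float:
    """Fraction of the genome (by length) covered by TE annotations."""
    genome_size = sum(chrom_lengths.values())
    if genome_size == 0:
        raise ValueError("genome size must be positive")
    return merged_length(te_intervals) / genome_size


# Toy data (invented for illustration, not taken from the paper).
toy_lengths = {"chr1": 1000, "chr2": 500}
toy_tes = [("chr1", 100, 200), ("chr1", 150, 250), ("chr2", 0, 50)]
toy_fraction = te_fraction(toy_tes, toy_lengths)
```

Merging before summing matters because nested or overlapping TE annotations would otherwise be counted twice and inflate the percentage.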

Conflict of interest statement

Authors' information

The first two authors (J-FD and NL) contributed equally to this work.

Ethics approval and consent to participate

Seeds of the A. thaliana accessions used in this study were purchased from the Nottingham Arabidopsis Stock Centre (Nottingham University, Nottingham, UK).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Validation of the C. higginsianum genome assembly by alignment of unitig sequences (orange) against chromosome optical maps (blue). MluI restriction sites are represented in optical maps and unitigs by vertical bars. Chromosomes 7 and 9 show discrepancies between unitigs and optical maps. These optical maps are colour-coded to highlight the break-points
Fig. 2
Schematic representation of selected C. higginsianum secondary metabolism (SM) gene clusters. (a) Resolution of a formerly split SM cluster by the PacBio assembly. The new cluster 16 encompasses four contigs from the old assembly [13], two of which contain former clusters 18 and TRC3. Arrowheads: transposable elements. (b) Comparison of cluster 19 and the depudecin cluster of Alternaria brassicicola. Protein identity is high (> 70%) and gene order and orientation are conserved except for the gene DEP6/CH63R_06317. (c) Comparison of cluster 46 and the fusicoccin cluster from Diaporthe amygdali [74]. In D. amygdali, the genes are dispersed across two distinct loci, in contrast to C. higginsianum. Protein identity is moderate to high and the genes were extensively rearranged. Shading indicates syntenic blocks and gene pairs. Yellow: acetyl-transferase
Fig. 3
Schematic representation of the distribution of secondary metabolism gene clusters and transposable elements across the 12 C. higginsianum chromosomes. The 5' end of unitig_7 containing the ribosomal repeats is fragmented between 13 unitigs that are too small to align with the optical map. Putative locations of the centromeres are indicated where possible
Fig. 4
Waves of expression of secondary metabolism (SM) genes of C. higginsianum during infection of Arabidopsis thaliana. (a) Heatmap showing the expression profiles of SM key genes. Under-represented transcripts (dark green to bright green) and over-represented transcripts (dark red to bright red) are depicted as log2 relative expression index. The log2 expression levels are presented in the adjoining heatmap, colour-coded from white (not expressed) to dark blue (strongly expressed). Red arrowhead: ChPKS38. (b) Schematic representation of the stage-specific expression of SM gene clusters. The expression of all genes within each cluster was evaluated using the Transcripts Per Million (TPM) normalisation method. A cluster was considered expressed if more than 50% of genes had a TPM greater than 1% of the actin gene TPM, and |log2FC| ≥ 2, q-value ≤ 0.01. (c) Time-course of the expression of the pChPKS38::RFP reporter gene in planta and in vitro (cellophane) using confocal microscopy. All images are overlays of bright field and RFP channels captured with the same settings. RFP channels are projections of 15 to 25 optical sections of 0.2 μm each. Co: conidium, arrowhead: appressorium, BH: biotrophic hypha, NH: necrotrophic hypha. Bars = 10 μm
Fig. 5
Schematic representation of the predicted domain structure of three families of conserved repeat elements present in the subtelomeric regions of all C. higginsianum chromosomes. DTX-chim_G199 was likely derived from DHX-G198 by the insertion of a DNA transposon, whereas DHX-chim-G203 was derived from DHX-G198 by the insertion of a non-LTR retrotransposon
Fig. 6
Violin plot depicting the frequency distribution of the distance (bp) between genes and the nearest transposable element (TE). The inner box plots represent the median and interquartile range of the distance for each of three gene classes. Genes located within secondary metabolism clusters (SM genes) and genes encoding candidate secreted effector proteins were located significantly closer (p < 0.001) to TEs than a random sample of genes taken from the genome as a whole
Fig. 7
Circos plot showing segmental duplications (SDs). Genes are represented in green and transposable elements in red. Gene IDs in each duplicated block (grey sectors) are given without the prefix "CH63R_". An entire secondary metabolism gene cluster (shaded blue) is duplicated in SD2. cP450: Cytochrome P450; Eff: Effector protein; Sec: Secreted; TF: Transcription Factor; TS: Terpene Synthase
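
The Fig. 4 caption gives a threshold rule for calling an SM gene cluster expressed: more than 50% of its genes must have a TPM above 1% of the actin gene TPM, with |log2FC| ≥ 2 and q-value ≤ 0.01. The sketch below is one possible reading of that rule, not the authors' code. The caption does not say whether the fold-change and q-value filters apply per gene or per cluster; applying all three conditions to each gene is an assumption here, and the `GeneCall` record, function names, and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GeneCall:
    """Hypothetical per-gene record for one infection stage."""
    tpm: float      # Transcripts Per Million at this stage
    log2fc: float   # log2 fold change versus a reference condition
    q_value: float  # multiple-testing-adjusted p-value


def gene_passes(gene: GeneCall, actin_tpm: float) -> bool:
    """ASSUMPTION: all three caption thresholds are applied per gene."""
    return (
        gene.tpm > 0.01 * actin_tpm    # TPM above 1% of the actin TPM
        and abs(gene.log2fc) >= 2.0    # |log2FC| >= 2
        and gene.q_value <= 0.01       # q-value <= 0.01
    )


def cluster_expressed(genes: Sequence[GeneCall], actin_tpm: float) -> bool:
    """True if strictly more than 50% of the cluster's genes pass."""
    if not genes:
        return False
    passing = sum(gene_passes(g, actin_tpm) for g in genes)
    return passing / len(genes) > 0.5


# Invented example values; with actin_tpm = 1000 the TPM cut-off is 10.
example_cluster = [
    GeneCall(tpm=50.0, log2fc=3.0, q_value=0.001),   # passes
    GeneCall(tpm=20.0, log2fc=-2.5, q_value=0.005),  # passes (repressed)
    GeneCall(tpm=15.0, log2fc=2.0, q_value=0.01),    # passes (boundaries)
    GeneCall(tpm=5.0, log2fc=4.0, q_value=0.001),    # fails the TPM cut-off
]
example_result = cluster_expressed(example_cluster, actin_tpm=1000.0)
```

Scaling the TPM cut-off to actin ties the threshold to a housekeeping gene measured in the same sample rather than to a fixed absolute value. The strict "more than 50%" comparison means a cluster with exactly half of its genes passing is not called expressed.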

References

    1. Faino L, Thomma BPHJ. Get your high-quality low-cost genome sequence. Trends Plant Sci. 2014;19(5):288–291. doi: 10.1016/j.tplants.2014.02.003.
    2. Thomma BPHJ, Seidl MF, Shi-Kunne X, Cook DE, Bolton MD, van Kan JAL, Faino L. Mind the gap; seven reasons to close fragmented genome assemblies. Fungal Genet Biol. 2016;90:24–30. doi: 10.1016/j.fgb.2015.08.010.
    3. Treangen TJ, Salzberg SL. Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet. 2012;13(1):36–46.
    4. Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotech. 2012;30(7):693–700. doi: 10.1038/nbt.2280.
    5. Seidl MF, Faino L, Shi-Kunne X, van den Berg GC, Bolton MD, Thomma BP. The Genome of the Saprophytic Fungus Verticillium tricorpus Reveals a Complex Effector Repertoire Resembling That of Its Pathogenic Relatives. Mol Plant Microbe Interact. 2015;28(3):362–373. doi: 10.1094/MPMI-06-14-0173-R.