BMC Biol. 2020 Feb 11;18(1):12.
doi: 10.1186/s12915-020-0744-3.

A 19-isolate reference-quality global pangenome for the fungal wheat pathogen Zymoseptoria tritici

Thomas Badet et al. BMC Biol. 2020.

Abstract

Background: The gene content of a species largely governs its ecological interactions and adaptive potential. A species is therefore defined by both core genes shared between all individuals and accessory genes segregating presence-absence variation. There is growing evidence that eukaryotes, similar to bacteria, show intra-specific variability in gene content. However, it remains largely unknown how functionally relevant such a pangenome structure is for eukaryotes and what mechanisms underlie the emergence of highly polymorphic genome structures.

Results: Here, we establish a reference-quality pangenome of a fungal pathogen of wheat based on 19 complete genomes from isolates sampled across six continents. Zymoseptoria tritici causes substantial worldwide losses to wheat production due to rapidly evolved tolerance to fungicides and evasion of host resistance. We performed transcriptome-assisted annotations of each genome to construct a global pangenome. Major chromosomal rearrangements are segregating within the species and underlie extensive gene presence-absence variation. Conserved orthogroups account for only ~60% of the species pangenome. Investigating gene functions, we find that the accessory genome is enriched for pathogenesis-related functions and encodes genes involved in metabolite production, host tissue degradation and manipulation of the immune system. De novo transposon annotation of the 19 complete genomes shows that the highly diverse chromosomal structure is tightly associated with transposable element content. Furthermore, transposable element expansions likely underlie recent genome expansions within the species.

Conclusions: Taken together, our work establishes a highly complex eukaryotic pangenome providing an unprecedented toolbox to study how pangenome structure impacts crop-pathogen interactions.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Assembly of 19 complete genomes from a worldwide collection. a World map indicating the isolate names and country of origin. b Phylogenomic tree based on 50 single-copy orthologs showing reticulation using SplitsTree. c Summary of genome assembly characteristics for all isolates. The bars represent the range of minimum (shortest bar) to maximum values (longest bar) for each reported statistic. Chromosomes 14–21 are accessory chromosomes. The presence or absence of accessory chromosomes in each genome is shown by green dots (present) and empty circles (missing). The linked dots for isolate YEQ92 indicate the chromosomal fusion event (see also Fig. 2).
Fig. 2
Large segregating chromosomal rearrangements within the species. a Chromosome length variation expressed as the percentage of the maximum observed length for each chromosome. b Two large chromosomal rearrangements identified in isolate YEQ92 from Yemen. The upper part shows the local chromosomal synteny at the fusion locus between accessory chromosomes 15 and 16 identified in YEQ92 compared to the reference genome IPO323. Transposons are shown in red, genes from chromosome 15 in purple, genes from chromosome 16 in green and genes specific to the fusion in grey boxes. Synteny shared between chromosomes is shown in red for colinear blocks or blue for inversions. The lower part shows the whole-chromosome synteny of chromosome 7, contrasting YEQ92 with the reference genome IPO323. YEQ92 lacks a subtelomeric region. Transposons are shown in red and genes in grey.
Fig. 3
Construction and analysis of the Zymoseptoria tritici pangenome. a Proportions of core orthogroups (present in all isolates), accessory orthogroups (present in ≥ 2 isolates but not all) and singletons (present in one isolate only) across the pangenome (upper-left). The proportions of core, accessory and singleton categories are shown for orthogroups coding for secreted proteins (upper-right), carbohydrate-active enzymes (CAZymes; lower-left) and effectors (lower-right). b Gene copy number variation in core orthogroups across the 19 genomes. c Pangenome gene count across six CAZyme families. Families are divided into glycoside hydrolase (GH), glycosyl transferase (GT), auxiliary activity (AA), carbohydrate esterase (CE), carbohydrate-binding module (CBM) and polysaccharide lyase (PL) categories. d Pangenome categories of secondary metabolite gene clusters. e Synteny plot of succinate dehydrogenase (SDH) paralogs mediating fungicide resistance. The SDHC3 locus on chromosome 3 is shown for isolates 3D7 and Aus01, both of which carry the paralog. IPO323 and 1A5 lack SDHC3. The position of the SDHC3 paralog is shown using dark arrows. Genes are coloured in grey and transposable elements in red.
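The core, accessory and singleton categories of panel a reduce to a rule on how many isolates carry an orthogroup. The following is a minimal Python sketch of that rule, not the authors' pipeline: the function names and the toy presence table are hypothetical, and only the category definitions are taken from the caption.

```python
from collections import Counter


def classify_orthogroup(n_present: int, n_total: int = 19) -> str:
    """Assign a pangenome category from the number of isolates carrying the orthogroup."""
    if not 1 <= n_present <= n_total:
        raise ValueError("n_present must be between 1 and n_total")
    if n_present == n_total:
        return "core"        # present in all isolates
    if n_present == 1:
        return "singleton"   # present in one isolate only
    return "accessory"       # present in >= 2 isolates but not all


def pangenome_proportions(presence: dict, n_total: int = 19) -> dict:
    """Return the fraction of orthogroups falling into each category.

    `presence` maps an orthogroup ID to the set of isolates carrying it
    (a hypothetical input layout chosen for this illustration).
    """
    counts = Counter(
        classify_orthogroup(len(isolates), n_total) for isolates in presence.values()
    )
    total = sum(counts.values())
    return {cat: counts[cat] / total for cat in ("core", "accessory", "singleton")}


# Toy example with three isolates; the orthogroup IDs and memberships are made up.
toy_presence = {
    "OG1": {"IPO323", "1A5", "3D7"},
    "OG2": {"IPO323", "3D7"},
    "OG3": {"1A5"},
    "OG4": {"IPO323", "1A5", "3D7"},
}
toy_proportions = pangenome_proportions(toy_presence, n_total=3)
```

In this toy table two of the four orthogroups are found in all three isolates, so the core fraction is 0.5, with one accessory and one singleton orthogroup making up the rest.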
Fig. 4
Expression polymorphism across the pangenome. a Proportion of genes showing expression > 10 counts per million (CPM) across gene categories. The frequencies are shown for orthogroups encoding putative effectors, secondary metabolite cluster genes (gene cluster), carbohydrate-active enzymes (CAZymes) and secreted proteins. The frequencies are also shown for singleton, accessory and core orthogroup categories in the pangenome. b Proportion of orthogroups for which the expression coefficient of variation is > 50% [cov = sd (CPM)/mean (CPM)] among different gene and pangenome categories as in a. c Correlation of gene expression and the number of paralogs detected for the same gene per genome. The grey line shows the logarithmic regression based on the linear model log10 (CPM + 1) ~ log10 (number of paralogs). d Number of orthogroups with ≥ 10 paralogs per genome. Isolates are coloured by continent of origin.
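The variability measure of panel b, cov = sd (CPM)/mean (CPM) with a 50% cutoff, can be sketched as below. This is an illustrative Python sketch rather than the authors' code. It assumes the sample standard deviation (n - 1 denominator), which the caption does not specify, and the function names and CPM values are hypothetical.

```python
import math
import statistics


def expression_cov(cpm: list) -> float:
    """Coefficient of variation of CPM values: sd(CPM) / mean(CPM).

    Assumption: sd is the sample standard deviation (n - 1 denominator).
    Returns NaN when the mean is zero, because the ratio is undefined there.
    """
    if len(cpm) < 2:
        raise ValueError("at least two CPM values are needed")
    mean = statistics.fmean(cpm)
    if mean == 0:
        return math.nan
    return statistics.stdev(cpm) / mean


def fraction_variable(cpm_by_orthogroup: dict, threshold: float = 0.5) -> float:
    """Fraction of orthogroups whose coefficient of variation exceeds the threshold.

    Orthogroups with an undefined (NaN) coefficient are counted as not variable.
    """
    covs = [expression_cov(values) for values in cpm_by_orthogroup.values()]
    return sum(cov > threshold for cov in covs) / len(covs)


# Made-up CPM values for two hypothetical orthogroups across isolates.
toy_cpm = {
    "OG_stable": [10.0, 10.0, 10.0],   # no variation, cov = 0
    "OG_variable": [5.0, 15.0],        # mean 10, sample sd about 7.07
}
toy_fraction = fraction_variable(toy_cpm)
```

With these values only the second orthogroup exceeds the 50% cutoff, so half of the toy orthogroups count as variably expressed.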
Fig. 5
Transposable elements (TEs) and genome size variation. a Contribution of TEs (%) to total genome size across the 19 isolates. b Relative frequency of the 23 TE superfamilies across all genomes with 100% referring to the total TE content of the respective genome. c Contribution of TE superfamilies to core and accessory genome size across the 19 isolates. d Expression of genes affected by TE insertions (grouped by TE superfamilies; left panel) and the mean TE length in the genome (grouped by TE superfamilies; right panel)
Fig. 6
Transcriptional activity of transposable elements (TEs). a TE family transcription levels across all 19 genomes expressed as log10 (CPM + 1). b Average transcription levels of TE superfamilies across all genomes expressed as average log10 (CPM + 1). c Spearman correlation matrix of four TE metrics including counts, relative frequencies, average length and transcription both at the level of TE families and superfamilies. d Variation of TE transcription (average log10 (CPM + 1)) as a function of TE counts (left panel) or average TE length (right panel). Curves in the left panel show the logarithmic linear regression given by the linear model log10 (CPM + 1) ~ log10 (TE count). The highly expressed LARD_Thrym family (RLX) is highlighted using arrows (panels a, b and d)
