mBio. 2018 Feb 20;9(1):e02275-17.
doi: 10.1128/mBio.02275-17.

A Near-Complete Haplotype-Phased Genome of the Dikaryotic Wheat Stripe Rust Fungus Puccinia striiformis f. sp. tritici Reveals High Interhaplotype Diversity


Benjamin Schwessinger et al. mBio. 2018.

Abstract

A long-standing biological question is how evolution has shaped the genomic architecture of dikaryotic fungi. To answer this, high-quality genomic resources that enable haplotype comparisons are essential. Short-read genome assemblies for dikaryotic fungi are highly fragmented and lack haplotype-specific information due to the high heterozygosity and repeat content of these genomes. Here, we present a diploid-aware assembly of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici based on long reads using the FALCON-Unzip assembler. Transcriptome sequencing data sets were used to infer high-quality gene models and to identify virulence genes involved in plant infection, referred to as effectors. This represents the most complete Puccinia striiformis f. sp. tritici genome assembly to date (83 Mb, 156 contigs, N50 of 1.5 Mb) and provides phased haplotype information for over 92% of the genome. Comparisons of the phase blocks revealed high interhaplotype diversity of over 6%. More than 25% of all genes lack a clear allelic counterpart. When we investigated genome features that potentially promote the rapid evolution of virulence, we found that candidate effector genes are spatially associated with conserved genes commonly found in basidiomycetes. Yet, candidate effectors that lack an allelic counterpart are more distant from conserved genes than allelic candidate effectors and are less likely to be evolutionarily conserved within the P. striiformis species complex and Pucciniales. In summary, this haplotype-phased assembly enabled us to discover novel genome features of a dikaryotic plant-pathogenic fungus previously hidden in collapsed and fragmented genome assemblies.

IMPORTANCE Current representations of eukaryotic microbial genomes are haploid, hiding the genomic diversity intrinsic to diploid and polyploid life forms. This hidden diversity contributes to the organism's evolutionary potential and ability to adapt to stress conditions. Yet, it is challenging to provide haplotype-specific information at a whole-genome level. Here, we take advantage of long-read DNA sequencing technology and a tailored assembly algorithm to disentangle the two haploid genomes of a dikaryotic pathogenic wheat rust fungus. The two genomes display high levels of nucleotide and structural variation, which lead to allelic variation and the presence of genes lacking allelic counterparts. Nonallelic candidate effector genes, which likely encode important pathogenicity factors, display distinct genome localization patterns and are less likely to be evolutionarily conserved than those that are present as allelic pairs. This genomic diversity may promote rapid host adaptation and/or be related to the age of the sequenced isolate since last meiosis.

Keywords: Dikaryon; basidiomycetes; genomics; plant pathogens.


Figures

FIG 1
The Pst-104E genome assembly is highly contiguous and complete. (A) Comparison of the Pst-104E primary and haplotig assemblies with the two most complete publicly available P. striiformis f. sp. tritici genome assemblies, Pst-78 and Pst-130. The histograms and the left y axis show log10 counts of contigs within each size bin. The dots and the right y axis show the cumulative sizes of small to large sorted contig lengths. Each dot represents a single contig of the given size, shown on the x axis. Each plot also shows the number of contigs or scaffolds, total assembly size, N50 of the assembly, and NG50 assuming a genome size of 85 Mb. NG50 is the N50 of an assembly considering the estimated genome size instead of the actual assembly size. This enables comparisons between different-sized assemblies. (B) Genome completeness was assessed using benchmarking universal single-copy orthologs (BUSCOs) for Basidiomycota (odb9) as proxy. The graph shows BUSCO results for Pst-104E primary (p), haplotig (h), and nonredundantly combined (ph) assemblies, in comparison to all publicly available P. striiformis f. sp. tritici genome assemblies with gene models, including Pst-78, Pst-130, Pst-21, Pst-43, Pst-0821, and Pst-887. The analysis was performed on the protein level, using publicly available gene models. An asterisk indicates the actual number of identified BUSCOs for the complete Pst-104E ph assembly before filtering gene models for similarity with genes related to transposable elements.
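The N50 and NG50 statistics described in the caption above can be sketched in a few lines. This is a minimal illustration of the standard definitions, not the authors' pipeline; the function name `n50` and the toy contig lengths are assumptions introduced here.

```python
def n50(lengths, genome_size=None):
    """Return N50 of the contig lengths, or NG50 when genome_size is given.

    N50: length of the contig at which the running sum of contigs, sorted
    from largest to smallest, first reaches half of the assembly size.
    NG50: the same, but half of the estimated genome size is the target.
    """
    target = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= target:
            return length
    # The assembly covers less than half of the estimated genome size.
    return None


# Toy contig lengths (hypothetical values, not taken from the paper).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))                  # N50 against the assembly size (54)
print(n50(toy_contigs, genome_size=60))  # NG50 against an assumed genome size
```

Because NG50 uses a fixed genome size as the denominator, two assemblies of different total length are scored against the same target, which is what makes the cross-assembly comparison in panel A meaningful.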
FIG 2
The Pst-104E genome is characterized by high levels of interhaplotype variation. (A) Summary of interhaplotype variation between primary contigs and their respective haplotigs, analyzed using Assemblytics. Each plot indicates the number of bases that are spanned by the specific variation category, which is illustrated by a cartoon. The number labeling each histogram represents the percentage of the total size of primary contigs with haplotigs that are contained within this variation type and size bin. (B and C) Two representative whole-genome alignments of primary contigs 019 and 028 with their respective haplotigs. This illustrates the large-scale variations summarized in panel A.
FIG 3
Allele transposition in the Pst-104E genome. (A to C) Dot plots of whole-genome alignments generated using the mummer toolset, where the x axis represents primary contig and the y axis shows the haplotig sequence. (A) The whole-genome alignments of haplotigs_027_xxx to primary contig 014. (B) The whole-genome alignment of haplotigs_027_xxx to primary contig 027. (C) The whole-genome alignment of haplotigs_014_xxx to primary contig 014. Black lines indicate alignments in the forward direction, and red lines indicate alignments in the reverse direction in the haplotig sequence. The black rectangles highlight an ~40-kb region in haplotig_027_006 that does not align to primary contig 027 yet aligns to a region in primary contig 014, which is not covered by an associated haplotig of 014. (D) Microsynteny analysis of this extended region, with primary contig 014 on top and haplotig_027_006 on the bottom. Gene models identified as alleles are labeled with their locus tag and shaded with a light blue background. Vertical gray shading illustrates the blastn identity between sequences on both contigs, according to the scale shown in the right bottom corner next to the sequence scale bar. Start and stop positions for each contig sequence are given at the start and the end of each contig.
FIG 4
Identification of candidate effectors based on detailed expression analysis of secreted proteins of both Pst-104E assemblies. (A) Clustering of Pst-104E secretome expression profiles for genes located on primary contigs. Blue color intensity indicates the relative expression level based on rlog-transformed read counts in spores, germinated spores, haustoria, and in wheat tissue at 6 and 9 days postinfection. For example, cluster 8 shows the lowest relative expression in spores and the highest in haustoria, compared to the other clusters. (B) Clustering of Pst-104E secretome expression profiles for genes located on haplotigs.
FIG 5
Candidate effector genes are spatially associated with conserved genes and with each other. (A) Nearest-neighbor gene distance density hexplots for three gene categories, including all genes, BUSCOs, and candidate effectors. Each subplot represents a distance density hexplot with the log10 3′-flanking and 5′-flanking distance to the nearest-neighboring gene plotted along the x axis and y axis, respectively. (B) Violin plots for the log10 distance to the most proximal transposable element for genes in each category without allowing for overlap. (C) Violin plots for the log10 distance to the most proximal gene in the same category for subsamples of each category equivalent to the smallest category size (n = 1,444). (D) Violin plots for the minimum distance (log10) of candidate effectors and BUSCOs to each other or a random subset of genes (n = 1,444). The P values for panels B, C, and D were calculated using the Wilcoxon rank-sum test after correction for multiple testing (Bonferroni; alpha = 0.05) on the linear distance in bases.
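The 5′- and 3′-flanking distances plotted in panel A can be computed along the following lines. This is a hedged sketch that assumes non-overlapping genes on a single contig, given as hypothetical `(name, start, end, strand)` tuples; the function name and the toy coordinates are illustrative and are not taken from the paper.

```python
import math


def flanking_distances(genes):
    """Map each gene name to (five_prime, three_prime) neighbor distances.

    genes: iterable of (name, start, end, strand) tuples on one contig,
    assumed not to overlap. A side with no neighbor is reported as None.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    result = {}
    for i, (name, start, end, strand) in enumerate(ordered):
        left = start - ordered[i - 1][2] if i > 0 else None
        right = ordered[i + 1][1] - end if i < len(ordered) - 1 else None
        # On the minus strand the 5' end faces the right-hand neighbor.
        five, three = (left, right) if strand == "+" else (right, left)
        result[name] = (five, three)
    return result


# Toy gene models (hypothetical coordinates).
toy_genes = [
    ("a", 100, 200, "+"),
    ("b", 1200, 1500, "-"),
    ("c", 11500, 12000, "+"),
]
for name, (five, three) in flanking_distances(toy_genes).items():
    logs = [None if d is None else round(math.log10(d), 2) for d in (five, three)]
    print(name, five, three, logs)
```

Taking log10 of each distance, as in the loop above, gives the coordinates used for a density hexplot of the kind the caption describes.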
FIG 6
The candidate effector allele status influences association with conserved genes. (A) Violin plots for the log10 distance to the most proximal BUSCO for candidate effectors in each category. The Kruskal-Wallis one-way analysis of variance of all three categories showed a significant difference between the three samples (P, ~2.36e−06). (B) Violin plots for the log10 distance to the most proximal gene for candidate effectors in each category. The Kruskal-Wallis one-way analysis of variance of all three categories showed no significant difference between the three samples (P, ~0.08). The P values in panels A and B were calculated using the Wilcoxon rank-sum test after correction for multiple testing (Bonferroni; alpha = 0.05) on the linear distance in bases. *, Wilcoxon rank-sum test comparisons with interhaploid genome paralogs lacked statistical power due to the small sample size (n = 28).
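The Bonferroni correction named in the captions for Fig. 5 and 6 multiplies each raw P value by the number of comparisons and caps the result at 1. The sketch below shows only that adjustment step, with made-up P values; the function name `bonferroni` is an assumption, and the rank-sum tests themselves are not reproduced here.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted_p, is_significant) for each raw P value.

    Each P value is multiplied by the number of tests m and capped at 1.0,
    then compared against alpha.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical raw P values from three pairwise comparisons.
for p_adj, significant in bonferroni([0.01, 0.02, 0.5]):
    print(f"{p_adj:.3f}", significant)
```

Comparing the adjusted value against alpha is equivalent to comparing the raw value against alpha divided by the number of tests.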

