Nat Genet. 2015 Mar;47(3):235-41.
doi: 10.1038/ng.3215. Epub 2015 Feb 9.

The genomic and phenotypic diversity of Schizosaccharomyces pombe

Daniel C Jeffares et al. Nat Genet. 2015 Mar.

Abstract

Natural variation within species reveals aspects of genome evolution and function. The fission yeast Schizosaccharomyces pombe is an important model for eukaryotic biology, but researchers typically use one standard laboratory strain. To extend the usefulness of this model, we surveyed the genomic and phenotypic variation in 161 natural isolates. We sequenced the genomes of all strains, finding moderate genetic diversity (π = 3 × 10⁻³ substitutions/site) and weak global population structure. We estimate that dispersal of S. pombe began during human antiquity (∼340 BCE), and ancestors of these strains reached the Americas at ∼1623 CE. We quantified 74 traits, finding substantial heritable phenotypic diversity. We conducted 223 genome-wide association studies, with 89 traits showing at least one association. The most significant variant for each trait explained 22% of the phenotypic variance on average, with indels having larger effects than SNPs. This analysis represents a rich resource to examine genotype-phenotype relationships in a tractable model.

Figures

Figure 1. An overview of the strain collection
a, Geographic origins of all 161 strains analyzed. Colored circles indicate the original sources of strains used in this study, with circle sizes indicating the number of strains obtained from each site (as in scale of black circles, top left). Strains for which only an approximate source is known (e.g. Africa) lack the black border. b, principal components projection of ‘drift distance’ between strains determined using the 752 unlinked SNPs (see Methods). The color scheme is as in (a). Leupold’s 972 reference strain is indicated with an open black square; strains that are members of the non-redundant group of 57 strains have a black border; strains known to contain large structural inversions are indicated with an orange cross.
Figure 2. Recent dispersal of S. pombe
a, Calibration of tree nodes using dated tips. With a collection of sequences sampled at various times (blue dots) up to the present day (P), we can jointly estimate the phylogenetic tree topology (in black), the rate of evolution and the age of any node in the tree, including the root, the most recent common ancestor of all strains (R, green dot). b, Root-to-tip distances (mutations/site × 10⁻³) correlate with collection date (P < 10⁻¹⁶), showing that the data have reasonable predictive power. Distances were estimated using BEAST from mitochondrial data of the 81 strains for which collection dates were available; statistical details are provided in Methods. The grey line shows the linear model. c, Historical context of dispersal. The posterior probability distribution for time to most recent common ancestor (TMRCA) of the 81 collection-dated strains, estimated using BEAST. The mean estimate was 340 BCE (95% confidence interval: 1875 BCE-1088 CE). Approximate historical periods are shown for context: ECP, European Colonial Period (~1500-1940 CE); HAN, Han Dynasty in China (206 BCE-220 CE); GRE, Classical Greece (400 BCE-500 BCE); EGY, First Dynasty of ancient Egypt (2890 BCE-3100 BCE); NEOLITHIC, Neolithic Era (4,500 BCE-10,000 BCE).
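The dated-tip calibration described in panels a and b can be illustrated with a simple root-to-tip regression. This is only a minimal sketch, not the BEAST analysis used for the figure: it assumes a strict linear relationship between collection date and root-to-tip distance, and the function name `root_to_tip_regression` and its arguments are hypothetical. Under that assumption the slope approximates the substitution rate, and the date at which the fitted line crosses zero distance approximates the age of the root.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of distance = rate * (date - root_date).

    dates: collection dates in calendar years (negative values for BCE).
    distances: root-to-tip distances in substitutions per site.
    Returns (rate, root_date). Illustrative only; not the BEAST model.
    """
    n = len(dates)
    if n < 2 or n != len(distances):
        raise ValueError("need at least two (date, distance) pairs")

    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    if sxx == 0:
        raise ValueError("all collection dates are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))

    rate = sxy / sxx                      # slope: substitutions/site/year
    intercept = mean_y - rate * mean_x    # fitted distance at year 0
    if rate <= 0:
        raise ValueError("no positive temporal signal in the data")

    root_date = -intercept / rate         # year at which fitted distance is 0
    return rate, root_date
```

A regression of this kind treats tips as independent points, which they are not on a shared tree, so it is better suited to checking for temporal signal than to producing the interval estimates shown in panel c.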
Figure 3. Relationships between genetic diversity and genome function
a, Main features of diversity in the genome, with chromosome scale in Mb on the x-axis, and the mitochondrial genome on the right edge. Top panel, diversity (Watterson’s θ) calculated using SNPs (scale: θ × 10⁻²). Middle panel, recombination rate (scale: LDU/Mb × 10⁻³ above the x-axis and log(1+LDU/Mb) below the x-axis). The six major recombination hotspots are indicated with red dots. Bottom panel, sites of Tf-family LTR insertions (scale: number of strains containing each insertion, with insertions present in all strains shown in light blue) in the group of 57 strains. b, Diversity described by genome annotation. Distribution of Watterson’s θ values for each 100th of the genome, using only sites annotated as: exons (EXO), 5’- and 3’-UTRs (5UT, 3UT), introns (INT), long non-coding RNAs (RNA), un-annotated regions (NIL), LTRs of Tf2-family transposons (LTR), one-fold (1FD) and four-fold (4FD) degenerate sites of exons. Protein-coding categories have red borders. The horizontal red lines indicate the median and interquartile range for 4FD sites; annotation classes significantly lower than this neutral proxy are shaded grey. One-sided paired Mann-Whitney test P-values vs the 4FD site neutral proxy were: exons, UTRs and one-fold degenerate sites all P < 2 × 10⁻¹⁶; introns P = 1 × 10⁻⁶; lncRNAs, un-annotated regions and LTRs P > 0.05. c, Diversity is negatively correlated with exon density. Diversity (θ) and the proportion of each window annotated as protein-coding exons were determined for 10 kb genomic windows. The Spearman rank correlation and significance are shown on top. Filled red circles: centromeric regions; filled black circles: telomeric regions (terminal 100 kb).
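For readers unfamiliar with the diversity measure used in this figure, the standard per-site Watterson estimator is θ = S / (a_n × L), where S is the number of segregating sites, L is the number of sites considered, and a_n is the sum of 1/i for i from 1 to n − 1 over n sampled sequences. The sketch below computes it from summary counts; the function name and argument names are hypothetical, and it does not reproduce the filtering or windowing used for the figure.

```python
def wattersons_theta(segregating_sites, n_sequences, n_sites):
    """Per-site Watterson's theta from summary counts.

    segregating_sites: number of polymorphic sites (S).
    n_sequences: number of sampled sequences (n), at least 2.
    n_sites: number of sites surveyed (L), at least 1.
    """
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    if n_sites < 1:
        raise ValueError("need at least one site")

    # Harmonic correction a_n = 1 + 1/2 + ... + 1/(n - 1)
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / (a_n * n_sites)
```

Applied per window (for example, per 10 kb window as in panel c), this gives one θ value per window that can then be compared with annotation density.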
Figure 4. Phenotypes and genome-wide associations
a, Phenotypic variation of all 57 non-clonal strains, with strains in rows and phenotypes in columns. Phenotype values are normalized according to the scale at right; missing data are colored grey. The colored panel above each row indicates the category of phenotype measurement. Categories are amino-acid concentrations (AA, red), growth on liquid media from this study (LIQ/M1, green), growth on liquid media (LIQ/M2, black), manual (SHAPE/M, blue) and automated (SHAPE/A, cyan) shape phenotypes, and growth on solid media (SOL/M, magenta). Phenotypes are hierarchically clustered using phenotype values, and strains are clustered according to their genetic relatedness using the tree at right, inferred by fineSTRUCTURE. Strain names are colored according to their geographic origin, as in Fig. 1a. All phenotypes were measured for at least two biological replicates; values shown are generally medians from biological and technical repeats (see Methods). b, Top panel shows variants that were associated with one or more traits using the mixed model GWAS. Variants are shown as crosses (SNPs) or triangles (indels), colored by phenotype category (as above). The horizontal scale shows the physical distance in Mb. The middle panel shows, for variants significant in our primary GWAS, the meta-P-values from linear regression within populations. The lower panel shows the total number of passing variants in each 10,000 nt window of the genome. Six hotspots (≥30 variants/10 kb) are indicated with green vertical bars. The orange bar shows the location of a hotspot discovered in an independent eQTL study. P-value thresholds for the mixed model are derived from permutations of traits (Methods).
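The abstract reports the share of phenotypic variance explained by the top variant for each trait. As a rough illustration of what such a figure means, the sketch below computes R² from a simple linear regression of a phenotype on a single variant coded 0/1 (reference or alternate allele in a haploid strain). This is an assumption-laden simplification: the study used a mixed model that accounts for relatedness, which this sketch does not, and the function name `variance_explained` is hypothetical.

```python
def variance_explained(genotypes, phenotypes):
    """R^2 from regressing a phenotype on one variant's genotype.

    genotypes: allele codes per strain (e.g. 0 = reference, 1 = alternate).
    phenotypes: trait values for the same strains, in the same order.
    Ignores population structure; illustrative only.
    """
    n = len(genotypes)
    if n < 2 or n != len(phenotypes):
        raise ValueError("need matching genotype and phenotype lists")

    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sgg = sum((g - mean_g) ** 2 for g in genotypes)
    spp = sum((p - mean_p) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:
        raise ValueError("genotype or phenotype has no variation")
    sgp = sum((g - mean_g) * (p - mean_p)
              for g, p in zip(genotypes, phenotypes))

    # For a single predictor, R^2 equals the squared Pearson correlation.
    return (sgp * sgp) / (sgg * spp)
```

Because related strains share both alleles and trait values, an uncorrected R² like this one will tend to overstate a variant's effect in a structured sample.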
