PLoS Pathog. 2021 Mar 31;17(3):e1009138.
doi: 10.1371/journal.ppat.1009138. eCollection 2021 Mar.

Population genomics of the pathogenic yeast Candida tropicalis identifies hybrid isolates in environmental samples

Caoimhe E O'Brien et al. PLoS Pathog.

Abstract

Candida tropicalis is a human pathogen that primarily infects the immunocompromised. Whereas the genome of one isolate, C. tropicalis MYA-3404, was originally sequenced in 2009, there have been no large-scale, multi-isolate studies of the genetic and phenotypic diversity of this species. Here, we used whole genome sequencing and phenotyping to characterize 77 isolates of C. tropicalis from clinical and environmental sources from a variety of locations. We show that most C. tropicalis isolates are diploids with approximately 2–6 heterozygous variants per kilobase. The genomes are relatively stable, with few aneuploidies. However, we identified one highly homozygous isolate and six isolates of C. tropicalis with much higher heterozygosity levels ranging from 36–49 heterozygous variants per kilobase. Our analyses show that the heterozygous isolates represent two different hybrid lineages, where the hybrids share one parent (A) with most other C. tropicalis isolates, but the second parent (B or C) differs by at least 4% at the genome level. Four of the sequenced isolates descend from an AB hybridization, and two from an AC hybridization. The hybrids are MTLa/α heterozygotes. Hybridization, or mating, between different parents is therefore common in the evolutionary history of C. tropicalis. The new hybrids were predominantly found in environmental niches, including soil. Hybridization is therefore unlikely to be associated with virulence. In addition, we used genotype-phenotype correlation and CRISPR-Cas9 editing to identify a genome variant that results in the inability of one isolate to utilize certain branched-chain amino acids as a sole nitrogen source.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Identification of novel isolates of C. tropicalis.
(A) Genome variation among C. tropicalis isolates. Variants were identified using the Genome Analysis Toolkit HaplotypeCaller and filtered based on genotype quality (GQ) scores and read depth (DP). Variants for all 77 isolates are shown according to variant type. Isolates are labelled on the X-axis by strain ID. One isolate (C. tropicalis ct20) has mostly homozygous variants, and six isolates have very high levels of heterozygous variants. (B) Six isolates of C. tropicalis are highly divergent. Variants were called as in (A). For heterozygous SNPs, a single allele was randomly chosen using RRHS [93] and for homozygous SNPs, the alternate allele to the reference was chosen by default. This process was repeated 100 times and 100 SNP trees were drawn with RAxML using the GTRGAMMA model [94]. The best-scoring maximum likelihood tree was chosen as a reference tree and the remaining 99 trees were used as pseudo-bootstrap trees to generate a supertree. Pseudo-bootstrap values are shown as branch labels. The six divergent isolates (Cluster B) are labelled according to their country of origin (see 1C). (C) SNP phylogeny of isolates from Cluster A indicates that clade structure is not associated with geography. The phylogeny of cluster A is shown in detail. Pseudo-bootstrap values are shown as branch labels. Isolates are labelled according to their country of origin, and environmental isolates are indicated with an asterisk. The reference strain, C. tropicalis MYA-3404, is labelled. Five putative clades are highlighted with colored bubbles. These clades are supported by principal component analysis (PCA) (S3 Fig). A sixth group was also identified by PCA, encompassing the remainder of isolates in the tree (S3 Fig).
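The allele-sampling step described in (B) can be illustrated with a minimal sketch. This is not the RRHS tool itself; the function names, the `(ref, alt, zygosity)` tuple layout, and the fixed seed are assumptions made for illustration, and only variant sites of a single isolate are modelled.

```python
import random


def sample_haploid_sequence(genotypes, rng):
    """Collapse diploid SNP calls into one haploid string.

    genotypes: list of (ref, alt, zygosity) tuples, where zygosity is
    'het' or 'hom' (hypothetical input layout, not the RRHS file format).
    """
    alleles = []
    for ref, alt, zygosity in genotypes:
        if zygosity == "het":
            # Heterozygous site: pick one of the two alleles at random.
            alleles.append(rng.choice((ref, alt)))
        elif zygosity == "hom":
            # Homozygous variant site: take the alternate allele by default.
            alleles.append(alt)
        else:
            raise ValueError(f"unknown zygosity: {zygosity!r}")
    return "".join(alleles)


def make_replicates(genotypes, n_replicates=100, seed=0):
    """Repeat the random sampling to obtain one sequence per replicate."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    return [sample_haploid_sequence(genotypes, rng) for _ in range(n_replicates)]


example_genotypes = [("A", "G", "het"), ("C", "T", "hom"), ("G", "A", "het")]
replicates = make_replicates(example_genotypes)
```

Each replicate would then be used to build one SNP tree, so that the spread across the 100 trees reflects the uncertainty introduced by the random choice at heterozygous sites.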
Fig 2. Novel C. tropicalis isolates result from hybridization.
(A) Analysis of k-mer distribution profiles reveals hybrid genomes. K-mer analysis of sequencing readsets was performed with the k-mer Analysis Toolkit (KAT [82]). For each of four divergent isolates, the number of distinct k-mers of length 27 bases (27-mers) is displayed on the Y-axis and k-mer multiplicity (depth of coverage) is displayed on the X-axis. K-mers that are present in the reference genome are shown in red, and k-mers that are absent from the reference genome are shown in black. There are two distinct peaks of k-mer coverage at approximately 50X and 100X. This pattern implies that most of the genomes are heterozygous (k-mers at 50X coverage) with few homozygous regions (k-mers at 100X coverage). Approximately half of the heterozygous k-mers in the readsets are not represented in the reference sequence. This pattern has been observed in hybrid isolates from other yeast species [25]. (B) Analysis of phased variants identifies two distinct haplotypes in divergent isolates of C. tropicalis. Variants were phased using HapCUT2 [44] into blocks covering 10–13 Mb of the genome. For each phased block, percentage difference from the reference strain in each haplotype was calculated as the number of variants divided by the length of the block. For 84–87% of the blocks, one haplotype is <0.3% different to the reference sequence and one haplotype is >1% different to the reference sequence. All phased blocks for each of the six hybrid isolates are shown as haplotype pairs, with the member of the pair more similar to the reference (haplotype A) shown in blue and the member of the pair less similar to the reference shown in orange (haplotype B) or purple (haplotype C). Percentage difference to the reference sequence is displayed on the Y-axis and position in the genome (chromosome, position (bp)) is displayed on the X-axis.
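The per-block calculation in (B) is simple arithmetic and can be sketched as follows. The `PhasedBlock` class and function names are hypothetical and do not reflect HapCUT2 output; the 0.3% and 1% thresholds are the ones quoted in the caption.

```python
from dataclasses import dataclass


@dataclass
class PhasedBlock:
    """One phased block (hypothetical container, not a HapCUT2 record)."""
    length_bp: int
    variants_hap1: int
    variants_hap2: int


def percent_difference(n_variants, length_bp):
    """Percentage difference from the reference: variants / block length."""
    return 100.0 * n_variants / length_bp


def is_hybrid_like(block, low=0.3, high=1.0):
    """True if one haplotype is <low% and the other >high% from the reference."""
    d1 = percent_difference(block.variants_hap1, block.length_bp)
    d2 = percent_difference(block.variants_hap2, block.length_bp)
    return min(d1, d2) < low and max(d1, d2) > high


def fraction_hybrid_like(blocks):
    """Fraction of blocks that show the reference-like / divergent pattern."""
    if not blocks:
        raise ValueError("no phased blocks supplied")
    return sum(is_hybrid_like(b) for b in blocks) / len(blocks)


# Made-up numbers for illustration only.
example_blocks = [
    PhasedBlock(length_bp=10_000, variants_hap1=10, variants_hap2=400),
    PhasedBlock(length_bp=10_000, variants_hap1=50, variants_hap2=60),
]
```

In the first example block the haplotypes are 0.1% and 4% different from the reference, so it matches the pattern; in the second both haplotypes fall between the thresholds, so it does not.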
Fig 3. Loss of heterozygosity in C. tropicalis isolates.
(A) Hybrid and non-hybrid isolates differ in the extent of LOH across the genome. The eight largest scaffolds in the reference genome are displayed horizontally from left to right and labelled from 1 to 8. LOH blocks are shown in pink and heterozygous (“HET”) blocks are shown in green. Centromere positions are indicated with “C”, telomere positions are indicated with “T” and the rDNA locus is indicated with “R”. Isolates are labelled on the left-hand side. The re-sequenced reference strain C. tropicalis MYA-3404 (labelled as “Ref”) is shown as a representative of the non-hybrid (AA) isolates. The genomes of the AA isolates consist mostly of LOH blocks. The AA isolate C. tropicalis ct20 has undergone extensive LOH, covering >99% of the genome. In contrast, in the AB/AC isolates, the majority of the genome consists of heterozygous blocks. (B) LOH is limited to short tracts of the genome in hybrid isolates. The histograms show the frequency of LOH blocks of different lengths in the six hybrid isolates and two AA (non-hybrid) isolates: the re-sequenced reference strain C. tropicalis MYA-3404 (labelled as “Ref”) and C. tropicalis ct20. Frequency is shown on a log scale on the Y-axis while length in kilobases (kb) is shown on the X-axis, with a bin width of 1000 bp. The average length of LOH blocks in the hybrid isolates ranges from 286 to 416 bp. A similar pattern is observed in all six hybrid isolates, i.e. a predominance of short LOH blocks, with very few long tracts of LOH. In the non-hybrid isolates (e.g. C. tropicalis MYA-3404), LOH blocks are generally longer. C. tropicalis ct20 has the longest average LOH block length (~10 kb).
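The binning behind the histograms in (B) can be sketched as below, assuming LOH block lengths are already available as a list of integers in base pairs. The function names and example lengths are hypothetical; only the 1000 bp bin width comes from the caption.

```python
from collections import Counter


def bin_loh_lengths(lengths_bp, bin_width=1000):
    """Count LOH blocks per length bin; keys are the lower bin edge in bp."""
    counts = Counter((length // bin_width) * bin_width for length in lengths_bp)
    return dict(sorted(counts.items()))


def mean_length(lengths_bp):
    """Average LOH block length in bp."""
    if not lengths_bp:
        raise ValueError("no LOH blocks supplied")
    return sum(lengths_bp) / len(lengths_bp)


# Made-up block lengths for illustration only.
example_lengths = [120, 300, 450, 999, 1000, 2500, 12000]
histogram = bin_loh_lengths(example_lengths)
```

Because most blocks fall into the first bin while a few long tracts sit far to the right, plotting such counts on a log-scaled Y-axis, as in the figure, keeps the sparse long-tract bins visible.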
Fig 4. Disrupting BAT22 prevents growth of C. tropicalis on branched-chain amino acids as a sole nitrogen source.
(A) Phenotype analysis of C. tropicalis isolates. Growth of C. tropicalis ct04 is shown on solid media. Strains were grown in 2x2 arrays; two biological replicates (top and bottom rows), with two technical replicates each (left and right columns), of each strain were tested. C. tropicalis ct04 replicates are outlined with red boxes. C. tropicalis ct04 cannot utilize valine or isoleucine as a sole nitrogen source and also exhibits a growth defect on solid media with 2% starch or 2% sodium acetate as the sole carbon source, or on solid media without a carbon source provided. (B) Editing of BAT22. Plasmid pCT-tRNA-BAT22 was generated to edit the wild type sequence of BAT22 (CTRG_06204) using CRISPR-Cas9. The sequences of the reference C. tropicalis BAT22 (CtBAT22 (wt)), BAT22 from C. tropicalis ct04 (CtBAT22 (ct04)) and edited BAT22 (CtBAT22*) are shown. The guide sequence is highlighted with a black box, the PAM sequence is shown in bold, and the Cas9 cut site is indicated with red scissors. C. tropicalis isolates ct44, ct09 and ct53 were transformed with pCT-tRNA-BAT22 and a repair template (RT_BAT22_2bpDel_SNP) generated by overlapping PCR using RT_BAT22_2bpDel_SNP-TOP/BOT oligonucleotides. The repair template contains two 60 bp homology arms and deletes two bases in BAT22, resulting in the same frameshift observed in C. tropicalis ct04. (C) Edited strains have defects in branched-chain amino acid metabolism. 5-fold serial dilutions of C. tropicalis ct04, ct09 (wt; bat22*), ct44 (wt; bat22*) and ct53 (wt; bat22*) are shown under the same conditions tested in (A). The edited strains cannot use valine or isoleucine as sole nitrogen sources.
