Genome Res. 2020 May;30(5):684-696.
doi: 10.1101/gr.257816.119. Epub 2020 May 18.

Polymorphic centromere locations in the pathogenic yeast Candida parapsilosis

Mihaela Ola et al. Genome Res. 2020 May.

Abstract

Centromeres pose an evolutionary paradox: strongly conserved in function but rapidly changing in sequence and structure. However, in the absence of damage, centromere locations are usually conserved within a species. We report here that isolates of the pathogenic yeast species Candida parapsilosis show within-species polymorphism for the location of centromeres on two of its eight chromosomes. Its old centromeres have an inverted-repeat (IR) structure, whereas its new centromeres have no obvious structural features but are located within 30 kb of the old site. Centromeres can therefore move naturally from one chromosomal site to another, apparently spontaneously and in the absence of any significant changes in DNA sequence. Our observations are consistent with a model in which all centromeres are genetically determined, such as by the presence of short or long IRs or by the ability to form cruciforms. We also find that centromeres have been hotspots for genomic rearrangements in the C. parapsilosis clade.

Figures

Figure 1.
C. parapsilosis centromeres consist of unique mid regions surrounded by partially conserved inverted repeats (IRs). (A) Dot matrix plot comparing the putative centromere sequences in C. parapsilosis. Centromere regions (see Supplemental Table S2) were concatenated and are delineated by dark blue lines. IRs (right, IRR; left, IRL) are separated with cyan lines. Each dot represents a 25-bp window. Inverted sequences are shown in red; direct repeats, in black. (B) Diagrammatic representation of the information in A. Regions that are conserved among chromosomes are shown in black. Locations of IRs (>75% DNA sequence identity) are shown with white arrows. The mid regions are illustrated in different colors that indicate that each of them has a unique sequence. Adjacent genes are shown in gray. Each region shown is ∼10 kb in length. (C) Three copies of an HA tag were introduced into both alleles of the endogenous CSE4 gene in C. parapsilosis CLIB214 and 90-137 using CRISPR-Cas9 editing. The gene was cut between glycine 69 and glycine 70, and a repair template containing the HA tags was inserted by homologous recombination. The construct was confirmed by sequencing. (D) Visualization of the ChIP-seq signal across all chromosomes (Chr) in Cse4-tagged derivatives of C. parapsilosis CLIB214 and 90-137. (In) Input (before immunoprecipitation); IP1 and IP2 show two independent immunoprecipitation replicates from each strain. Strains derived from C. parapsilosis CLIB214 are shown in blue; from 90-137, in purple. There is one signal per chromosome in the IP samples, identifying the centromere, except for Chromosome 7, in which the rDNA locus (black asterisk) also generates a signal. The x-axis in each plot is the chromosome coordinates, and the y-axis is the number of reads mapping to a position. The maximum scale for C. parapsilosis CLIB214 is restricted to reduce the signal from the rDNA. Data are visualized using Integrative Genomics Viewer (IGV) (Thorvaldsdóttir et al. 2013).
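The dot matrix comparison described for panel A can be illustrated with a minimal Python sketch. It assumes exact matching of 25-bp windows, which is a simplification: the caption does not state the matching rule or the tool used to draw the plot, and the function names here are hypothetical. Direct matches correspond to the black dots and reverse-complement matches to the red dots.

```python
from collections import defaultdict

# Translation table for complementing DNA (upper-case A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dot_matrix(seq_x, seq_y, window=25):
    """Find window-sized matches between two sequences.

    Returns two lists of (i, j) window start positions:
      direct   -- seq_x[i:i+window] equals seq_y[j:j+window]
      inverted -- seq_x[i:i+window] equals the reverse complement
                  of seq_y[j:j+window]
    Exact matching is an assumption made for this sketch.
    """
    # Index every window of seq_y by its sequence.
    index = defaultdict(list)
    for j in range(len(seq_y) - window + 1):
        index[seq_y[j:j + window]].append(j)

    direct, inverted = [], []
    for i in range(len(seq_x) - window + 1):
        word = seq_x[i:i + window]
        direct.extend((i, j) for j in index.get(word, ()))
        inverted.extend((i, j) for j in index.get(reverse_complement(word), ()))
    return direct, inverted


if __name__ == "__main__":
    # Toy centromere-like layout: left arm, unique mid region, inverted right arm.
    left_arm = "ACGTTGCAAGGCTTAACCGGTATCGATTGC"
    mid = "TTTTTTTTTTGGGGGGGGGG"
    toy = left_arm + mid + reverse_complement(left_arm)
    direct, inverted = dot_matrix(toy, toy)
    print(len(direct), "direct dots;", len(inverted), "inverted dots")
```

In a self-comparison such as the toy example, the main diagonal appears among the direct matches, and an inverted repeat flanking a unique mid region shows up as an anti-diagonal run of inverted matches.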
Figure 2.
Natural polymorphisms for centromere location in C. parapsilosis. The ChIP-seq data from Figure 1D are shown in more detail, and the neocentromeres are highlighted with black boxes. The order of the tracks is the same in each panel but is labeled for CEN1 only. The top track shows the location of C. parapsilosis protein-coding genes. The second track shows the IR sequences only (red), with an arrow indicating the direction of the repeat. The extent of the regions conserved between chromosomes is not shown. ChIP-seq read coverage is plotted in blue for C. parapsilosis CLIB214 and in purple for C. parapsilosis 90-137. Two independent immunoprecipitation experiments were performed per strain (IP1 and IP2). Only one control is shown: the total chromatin from C. parapsilosis CLIB214 (input). The equivalent data for C. parapsilosis 90-137, and for an experiment with no tagged Cse4, are available at GEO, accession number GSE136854. The bottom track (gray) shows gene expression measured by RNA-seq during growth in YPD (taken from SRR6458364 from Turner et al. 2018). The read depth scale is indicated in brackets; the total number of reads varied in each experiment. The maximum scale for C. parapsilosis CLIB214 is restricted to 500 to reduce the signal from the rDNA. The RNA expression data are plotted on a log scale. The apparent dips in coverage at the centromeres in the input data are likely to be an artifact of the mapping procedure because reads that map to more than one site in the genome were discarded. Some reads are also incorrectly mapped to nonidentical repeat sequences, resulting in a small Cse4 signal at CEN5 in 90-137. All data are visualized using IGV.
Figure 3.
Lack of rearrangements at CEN1 and CEN5 in C. parapsilosis 90-137/Cse4-HA. The Circos plot compares the eight chromosomes of the reference strain C. parapsilosis CDC317 (gray; left) to the 16 largest MinION scaffolds from the Canu assembly of C. parapsilosis 90-137/Cse4-HA (white; right). Centromeres are marked by black bands. Most chromosomes are collinear, including Chromosome 1 (assembled in two contigs in 90-137, contig 2 and contig 30) and Chromosome 5 (contig 20457). There is an apparent translocation between Chromosomes 3 and 4 (contig 5 and contig 20455) at a repetitive gene that is near (but not at) the centromere. This may represent an error in the reference assembly or a natural structural polymorphism. Some zeros have been removed from the contig (tig) names for clarity.
Figure 4.
Identification of centromeres and centromere-proximal rearrangements in C. orthopsilosis and C. metapsilosis. (A) Cartoon of centromere structure in C. orthopsilosis 90-125 (Lombardi et al. 2019b). All mid regions are unique and are shown in different colors. Sequences in black are conserved among chromosomes. IRs are shown with white arrows, and adjacent genes are shown with gray boxes. Putative transposases with DDE domains are indicated. More detail is provided in Supplemental Figure S2 and Supplemental Table S2. (B,C) Synteny relationship between C. parapsilosis and C. orthopsilosis. SynChro (Drillon et al. 2014) was used (delta value of two) to identify potential orthologs (reciprocal best hits [RBHs]), represented by colored lines in the two species, and to generate synteny maps. (B) Location of RBHs on C. orthopsilosis chromosomes. The approximate location of each putative centromere is indicated with a gray polygon. (C) C. parapsilosis chromosomes, colored with respect to the RBH from C. orthopsilosis. The locations of the C. parapsilosis centromeres are indicated with offset white circles. The location of syntenic C. orthopsilosis centromeres is shown in more detail in Figure 5. (D) Cartoon of centromere structure in C. metapsilosis. Sequences in black are conserved among chromosomes. IRs are shown with white arrows, which are sometimes fragmented and overlapping. Mid-core regions from some CENs are similar in sequence (>60%) and are shown in the same color. Adjacent genes are shown with gray boxes. More detail is provided in Supplemental Figure S2 and Supplemental Table S2. (E,F) Synteny relationship between C. parapsilosis and C. metapsilosis. (E) Location of RBHs on C. metapsilosis chromosomes. The approximate locations of the putative C. metapsilosis centromeres are indicated with gray stars (centromeres were not identified on scaffolds 4 and 8). (F) C. parapsilosis chromosomes, colored with respect to the RBH from C. metapsilosis. The locations of the C. parapsilosis centromeres are indicated with white circles. The approximate locations of syntenic C. metapsilosis centromeres are shown by name and with gray stars. The same colors are used for C. orthopsilosis (B) and C. metapsilosis (E). This does not indicate that synteny is completely conserved between these species; it is a feature of SynChro, which carries out pairwise comparisons.
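The reciprocal best hit (RBH) criterion mentioned in this caption can be sketched in a few lines of Python. This is only an illustration of the general idea, not SynChro's implementation: the similarity scores, the gene identifiers, and the function names are all hypothetical, and SynChro applies its own similarity thresholds and tie handling.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target gene.

    `scores` maps (query, target) pairs to a similarity score;
    on ties, the first pair encountered is kept.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical scores between made-up genes of two species.
    a_vs_b = {
        ("cpar_g1", "cort_g1"): 900,
        ("cpar_g1", "cort_g2"): 300,
        ("cpar_g2", "cort_g2"): 500,
        ("cpar_g3", "cort_g2"): 450,
    }
    b_vs_a = {
        ("cort_g1", "cpar_g1"): 880,
        ("cort_g2", "cpar_g2"): 510,
        ("cort_g3", "cpar_g3"): 100,
    }
    for pair in reciprocal_best_hits(a_vs_b, b_vs_a):
        print(pair)
```

In the toy data, the made-up gene cpar_g3 has a best hit in the other species, but that hit prefers cpar_g2, so cpar_g3 is not reported as an RBH. Genes of this kind are the ones the Figure 5 caption describes as lacking RBH orthologs.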
Figure 5.
Interspecies synteny breakpoints occur at centromeres. Synteny between C. parapsilosis and C. orthopsilosis was visualized using SynChro (Drillon et al. 2014), with a delta value of two. Changing delta values had minor effects on predicted synteny. A diagrammatic representation of each C. parapsilosis chromosome, colored as in Figure 4C, is shown to scale at the top of each panel. The lower sections of each panel show the gene order around the centromere. (A–H) Gene order around the eight centromeres in C. parapsilosis compared with C. orthopsilosis. The bottom row in each panel shows gene order on the C. parapsilosis chromosome, and the eight C. orthopsilosis chromosomes are shown above. Each gene is indicated by a colored dot, and RBHs are joined by lines. Syntenic blocks are surrounded with a box. Centromeres are shown by large black circles. The chromosome number is indicated at the side of each panel. The names of some genes are shown for orientation purposes. We removed the prefix “CORT0” from C. orthopsilosis genes and “CPAR2_” from C. parapsilosis genes for brevity. The color of the dots indicates the similarity of the proteins. Noninverted RBHs are shown in green, ranging from darkest (>90% similarity) to lightest (<30% similarity), and inverted orthologs are shown in red. Genes without RBH orthologs are shown in blue. Genes in gray were not identified as RBHs by SynChro but were identified using CGOB (Fitzpatrick et al. 2010; Maguire et al. 2013).
Figure 6.
Organization of centromeres in Saccharomycotina species. The phylogeny is adapted from Shen et al. (2016). The size indicated on the centromeres refers to the region bound by Cse4 when known, or else when predicted bioinformatically, except for the Saccharomycetaceae, for which the size of the point centromere is shown. Solid color indicates conservation of sequence across centromeres in the same species, whereas a color gradient indicates unique sequences. IRs are shown with arrows; Ty clusters, as red and green boxes. Black circles show known (solid) or predicted (open) early-firing origins of replication (for details, see text). Point centromeres are conserved across the Saccharomycetaceae except for the Naumovozyma lineage, which has different sequences. Question marks indicate that localization of Cse4 nucleosomes has not been determined.
