Curr Biol. 2023 Oct 9;33(19):4098-4110.e3.
doi: 10.1016/j.cub.2023.08.039. Epub 2023 Sep 11.

Extending the reach of homology by using successive computational filters to find yeast pheromone genes

Sriram Srikant et al. Curr Biol.

Abstract

The mating of fungi depends on pheromones that mediate communication between two mating types. Most species use short peptides as pheromones, which are either unmodified (e.g., α-factor in Saccharomyces cerevisiae) or C-terminally farnesylated (e.g., a-factor in S. cerevisiae). Peptide pheromones have been found by genetics or biochemistry in a small number of fungi, but their short sequences and modest conservation make it impossible to detect homologous sequences in most species. To overcome this problem, we used a four-step computational pipeline to identify candidate a-factor genes in sequenced genomes of the Saccharomycotina, the fungal clade that contains most of the yeasts. We require that candidate genes (1) have a C-terminal prenylation motif, (2) encode proteins shorter than 100 amino acids, and (3) contain a proteolytic-processing motif upstream of the potential mature pheromone sequence, and (4) that closely related species contain highly conserved homologs of the potential mature pheromone sequence. Additional manual curation exploits the observation that many species carry more than one a-factor gene, encoding identical or nearly identical pheromones. From 332 Saccharomycotina genomes, we identified strong candidate pheromone genes in 241 genomes, covering 13 clades that are separated from one another by at least 100 million years, the time required for evolution to remove detectable sequence homology among small pheromone genes. For one small clade, the Yarrowia, we demonstrated that our algorithm found the a-factor genes: deleting all four related genes in the a-mating type of Yarrowia lipolytica prevents mating.
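The three per-sequence filters named in the abstract can be sketched as a single predicate. This is a minimal illustration, not the authors' code: the function name `is_candidate` is hypothetical, the prenylation motif is approximated as a cysteine four residues from the C-terminus, and the proteolytic-processing motif is approximated as any asparagine (N) upstream of that cysteine. The fourth step, conservation among closely related species, needs multiple genomes and is not covered here.

```python
def is_candidate(protein: str, max_len: int = 100) -> bool:
    """Apply three per-sequence filters to a translated ORF.

    Assumptions (not taken from the paper's code):
      * the prenylation motif is a cysteine at the fourth-from-last residue;
      * the processing motif is any asparagine upstream of that cysteine.
    """
    seq = protein.upper().rstrip("*")  # drop a trailing stop symbol if present

    # Filter 1: shorter than max_len amino acids.
    if len(seq) >= max_len:
        return False

    # Filter 2: C-terminal C-x-x-x motif (relaxed CAAX).
    if len(seq) < 4 or seq[-4] != "C":
        return False

    # Filter 3: an asparagine somewhere upstream of the CAAX cysteine.
    return "N" in seq[:-4]
```

The sequences used to exercise this sketch should be treated as toy inputs; for example, `is_candidate("MKTNYIIKGCVIA")` passes all three filters, while the same sequence without the cysteine or without the asparagine does not.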

Keywords: Yarrowia; gene annotation; pheromones; small peptides; yeast mating.

Conflict of interest statement

Declaration of interests The authors declare no competing interests.

Figures

Figure 1. S. cerevisiae pheromones are produced by cleaving and modifying precursor peptides.
(A) Haploid S. cerevisiae have two mating types, a and α. Their mating with each other is initiated by the secretion of diffusible peptide pheromones that are recognized by G protein-coupled receptors (GPCRs): a-cells (magenta) secrete the lipidated peptide pheromone a-factor, which is recognized by the a-factor receptor expressed on α-cells, while α-cells (cyan) secrete the peptide pheromone α-factor, which is recognized by the α-factor receptor expressed by a-cells. (B) Mating pheromones (underlined) are encoded within precursor peptides by the MFA1 and MFA2 genes (for a-factor) and MFα1 and MFα2 genes (for α-factor). These peptides require several maturation steps before their secretion as biologically active molecules. (C) The modifications of initial products of the MFA1 and MFA2 genes that produce a-factor. Broadly, there are two stages of maturation: C-terminal modifications (S-thiol farnesylation, -AAX proteolysis (white bars), and carboxymethylation), followed by two steps of N-terminal proteolysis (two grey bars). The mature pheromone (black bar) is then exported from the cytosol through a dedicated ABC transporter, Ste6. See also Figure S1.
Figure 2. Fungal pheromone candidates can be identified by a progressive filter for small open reading frames that are farnesylated and cleaved by a protease associated with mating.
(A) Our algorithm begins by looking for all possible short open reading frames with a C-terminal farnesylation motif (CAAX-stop) in the 332 available fungal genomes, resulting in 284,073 candidates. We filter again for an in-frame proteolytic-motif asparagine (N) important for the final step of maturation to produce bioactive pheromone, which leaves 125,711 candidates. Collapsing sets of candidate sequences that result from multiple start codons upstream of a single CAAX to one sequence from the most upstream start codon results in 87,326 unique farnesylated loci (Figure S2E). (B) Phylogenetic tree of all 332 sequenced yeasts across Saccharomycotina, covering clades like Debaryomycetaceae/Metschnikowiaceae, Pichiaceae, Saccharomycodaceae, Saccharomycetaceae, Phaffomycetaceae and Yarrowia that contain species important for both basic biology and industrial production. The tree is based on those yeast genome sequences with estimated relative divergence times. Selected clades are labeled. (C) Scatter plot showing that the number of unique pheromone candidate loci is linearly correlated with genome size, ranging from 100 to 650 candidates, as expected of a search for all possible pheromone candidates. (D) Scatter plot showing strong correlation between the number of annotated protein-coding genes and genome size, which ranges between 8 and 27 Mbp for the 332 yeast genomes. See also Figure S2.
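The scan described in panel (A) can be sketched on raw nucleotide sequence. This is a hedged illustration rather than the published implementation: the function name `caax_loci` is hypothetical, only the forward strand is scanned, the CAAX motif is relaxed to a cysteine codon four codons before the stop, and the 100-codon cap is borrowed from the abstract's length criterion. Multiple start codons upstream of one CAAX-stop are collapsed by reporting only the most upstream start that still passes the filters.

```python
STOPS = {"TAA", "TAG", "TGA"}
CYS = {"TGT", "TGC"}   # cysteine codons
ASN = {"AAT", "AAC"}   # asparagine codons

def caax_loci(dna: str, max_aa: int = 100):
    """Return one (start, end) nucleotide span per CAAX-stop locus.

    Forward strand only; run again on the reverse complement to cover both.
    """
    dna = dna.upper()
    loci = []
    for frame in range(3):
        codons = [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]
        starts = []  # codon indices of ATGs seen since the last stop
        for idx, codon in enumerate(codons):
            if codon == "ATG":
                starts.append(idx)
            elif codon in STOPS:
                cys = idx - 4  # position of the C in C-x-x-x-stop
                if cys >= 0 and codons[cys] in CYS:
                    for s in starts:  # starts are ordered most upstream first
                        short = (idx - s) < max_aa
                        has_asn = any(c in ASN for c in codons[s + 1:cys])
                        if short and has_asn:
                            # Collapse: keep only the most upstream valid start.
                            loci.append((frame + 3 * s, frame + 3 * (idx + 1)))
                            break
                starts = []  # a stop codon closes every open reading frame
    return loci
```

Because any asparagine found downstream of a later start is also downstream of an earlier one, checking starts from the most upstream inward gives the same locus set as filtering each start separately and then collapsing.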
Figure 3. Fungal pheromones are conserved within closely related species and are often encoded in multiple copies in a genome.
(A) Phylogenetic tree describing the evolutionary relationship of 332 yeasts with two horizons indicated. The solid grey line indicates the assumed maximum evolutionary distance at which mature pheromone sequences show detectable sequence homology, operationally defined by the divergence time of S. cerevisiae (Sc) and Kluyveromyces lactis (Kl), whose pheromones show detectable sequence homology. The leaves corresponding to Sc, Kl and Vanderwaltozyma polyspora (Vp) are indicated on the tree. The leaves of the tree are also annotated in green for species with pheromones known prior to our work (first column) and species with pheromones identified in our work (second column). (B) Based on the pheromone homology time horizon (solid grey line in A), we separated 332 yeast genomes into 23 phylogroups of at least 2 species; there are also 32 singleton species. The most populated phylogroups correspond to the listed well-known clades where closely related species have been densely sequenced. These clades are represented in the tree by the corresponding colors. (C) Number of pheromone candidates per species plotted for each of the 23 conserved-pheromone phylogroups, where each circle corresponds to the number of candidates in a species, and each copy of a group of closely homologous sequences within a genome is counted separately. Selecting for candidates that have at least two homologous copies within the clade reduces the number of viable candidates per genome to 10–300 (center panel compared to left). Manual curation of candidates similar to known pheromones identifies the most likely pheromone(s) in each genome (right panel) for experimental validation. The curated candidates include both multiple copies of a single best candidate and multiple distinct candidates if a single best pheromone cannot be uniquely identified. There are 1–19 candidates encoded in each species for experimental testing. Some phylogroups contain too few species, and thus no candidates rose above the rest through curation. See also Figures S3, S6, and S7.
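The copy-number selection in panel (C) can be illustrated with a small sketch. It assumes, for simplicity, that "homologous" means an identical mature sequence; the actual pipeline presumably tolerates near-identical sequences, and the function name `conserved_candidates` and the input layout are hypothetical.

```python
from collections import Counter

def conserved_candidates(candidates, min_copies=2):
    """Keep candidates whose mature sequence recurs within one phylogroup.

    `candidates` is a list of (species, mature_sequence) pairs, one entry per
    genomic copy, so repeated copies in a single genome count separately.
    Exact identity stands in for homology here (a simplifying assumption).
    """
    counts = Counter(mature for _species, mature in candidates)
    return [(sp, m) for sp, m in candidates if counts[m] >= min_copies]
```

With toy input in which one mature sequence appears in two species and another appears only once, only the two copies of the shared sequence are returned.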
Figure 4. Mature pheromones are peptides 6–20 amino acids long and are encoded in multiple copies in a genome.
(A) Box-and-whisker plot of the number of pheromone copies encoded per genome, categorized by conserved-pheromone phylogroup. The box represents the inner quartiles, the whiskers the outer quartiles, and outliers are highlighted as diamonds. Thirteen phylogroups with at least two species contain a total of 812 candidates, with the number of candidates in each phylogroup listed in parentheses (also see Table 2). The taxonomic clades corresponding to the phylogroups are provided for reference, with the most populated clades containing the majority of the candidates (colored as in Figures 2 and 3). Where a species has multiple distinct candidate pheromones, each is treated separately and its copy number in the genome is included in the distribution. (B) Box-and-whisker plot of the length of the mature region of pheromone candidates, categorized by conserved-pheromone phylogroup. The box represents the inner quartiles, the whiskers the outer quartiles, and outliers are highlighted as diamonds. The mature sequences of all 812 candidates are defined as the region between the upstream proteolysis site (N) and the C-terminal farnesylated cysteine. See also Figures S4 and S6.
Figure 5. All species in the Yarrowia clade of yeasts have a homologous farnesylated pheromone that is encoded in multiple copies per genome.
(A) Phylogenetic tree of Yarrowia species (and outgroup in grey) from the set of 332 sequenced yeast genomes, along with Y. sp. 30695, which is a sister species of Y. keelungensis. Analysis of the genome of Y. sp. 30695 also produced copies of an identical pheromone candidate. The translated ORFs of curated candidates from each species are aligned by the nucleotide sequence of the coding region and ordered according to the phylogenetic relationship between the species. The red shaded region represents the candidate mature pheromone sequence, showing no non-synonymous variation across Yarrowia, except for a conservative change of phenylalanine (F) to tyrosine (Y) in Y. lipolytica. (B) The mating efficiency of MATA haploid derivatives of Y. lipolytica with combinations of the four pheromone loci deleted was evaluated using a semi-quantitative mating protocol. Measurements for each genotype are represented as a group of at least 2 biological replicates, each with 2 technical replicate measurements. Single, double, and triple mutants of YlMFA genes show reduced mating, but only the quadruple deletion of all pheromone loci is deficient in mating to a degree comparable to the strains lacking the receptor (Ylste2Δ) or the pheromone exporter (Ylste6Δ). See also Figure S5 and Tables S3 and S4.

