Genome Res. 2007 Apr;17(4):536-43.
doi: 10.1101/gr.6037607. Epub 2007 Feb 23.

Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae

Yanhui Hu et al. Genome Res. 2007 Apr.

Abstract

The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein-protein interactions. These studies rely on availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a "gold standard" (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in and reproducibility of experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies from purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.

Figures

Figure 1.
Workflow diagram of clone production. The entire process from the design of primers to production of clone stocks is shown for the four production phases. The process started by designing primers for every ORF in the genome. The primers were used to amplify the ORFs from the genome. Subsequent amplifications with universal primers generated ORF sequences flanked by recombinational cloning sites at either end and were monitored by a diagnostic gel. The product was cloned into a recombinational cloning vector via a BP clonase reaction or In-Fusion reaction. Competent bacterial strains were transformed with the reaction mix to yield colonies that were isolated robotically, cultured in liquid medium, and stored as 15% glycerol stocks.
Figure 2.
Size distribution of the yeast gene clone collection as compared with the sizes of the predicted protein-coding sequences as defined by the current annotation at SGD. The cloning success rate was more than 90% for genes smaller than 1 kb and about 85% for medium-sized ORFs (1–4 kb). For large ORFs (>4 kb), only 36% were cloned and accepted after sequence analysis.
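The per-size success rates quoted in this caption amount to a simple binned ratio of accepted clones to attempted ORFs. A minimal sketch of that tabulation follows, assuming each ORF is recorded as a (length in bp, accepted) pair; the function names and the toy records are hypothetical and are not data from the study.

```python
from collections import defaultdict


def size_bin(length_bp):
    """Assign an ORF to the size classes used in the caption."""
    if length_bp < 1000:
        return "<1 kb"
    if length_bp <= 4000:
        return "1-4 kb"
    return ">4 kb"


def success_rate_by_size(orfs):
    """Return the fraction of accepted clones per size bin.

    `orfs` is an iterable of (length_bp, accepted) pairs, where `accepted`
    is True if the clone passed sequence analysis (hypothetical layout).
    """
    attempted = defaultdict(int)
    accepted = defaultdict(int)
    for length_bp, ok in orfs:
        label = size_bin(length_bp)
        attempted[label] += 1
        accepted[label] += int(ok)
    return {label: accepted[label] / attempted[label] for label in attempted}


# Toy records for illustration only; not taken from the paper.
toy_orfs = [(450, True), (900, True), (1500, True),
            (3200, False), (5200, False), (6100, True)]
rates = success_rate_by_size(toy_orfs)
```

With real records in place of the toy list, the resulting dictionary would correspond to the three percentages reported in the caption.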
Figure 3.
Representative Western blots of 40 known or predicted yeast transcription factors (TFs). N-terminally GST-tagged proteins were overexpressed in and purified from E. coli in high-throughput, as described in Methods. Five microliters of the 60 µL total of each purified protein was analyzed by Western blot using an anti-GST antibody. Serial dilutions of recombinant GST (Sigma) were included for estimation of protein concentrations. Bands of the correct size, or the positions where a band of the expected size would appear, are indicated by a dot on the right side of the band.
Figure 4.
Whole-genome yeast intergenic microarray bound by S. cerevisiae Rap1. (A) Close-up view of a portion of a microarray spotted with all yeast intergenic regions, bound by Rap1 overexpressed in and purified from E. coli in high-throughput. Fluorescence intensities are shown in false color, with white indicating saturated signal intensity, yellow indicating high signal intensity, green indicating moderate signal intensity, and blue indicating low signal intensity. (B) Sequence logos for Rap1 DNA-binding site motifs determined from genomic DNA-binding site identification experiments. We previously performed a set of triplicate PBM experiments using Rap1 expressed in and purified from S. cerevisiae, resulting in 293 intergenic regions bound with a Bonferroni-corrected P value of 0.001 (Mukherjee et al. 2004). Here, as the data on Rap1 overexpressed in and purified from E. coli were generated by a single PBM experiment, fewer spots (77) met our significance threshold for binding. The top two motifs were derived from the 77 and 293 most significantly bound spots in the PBM shown in A. The third motif from the top was derived from our previous set of triplicate PBMs using Rap1 purified from S. cerevisiae (Mukherjee et al. 2004). The motif at the bottom was derived from all intergenic regions bound in vivo in ChIP-chip (Lee et al. 2002). Motifs were generated using BioProspector (Liu et al. 2001) and exhibited the following group specificity scores (top to bottom): 1.3 × 10^−97, 2.4 × 10^−207, 1.1 × 10^−222, and 8.7 × 10^−92.
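The Bonferroni-corrected threshold mentioned in this caption can be illustrated with a short sketch. It assumes each spot has a raw P value, multiplies it by the number of spots tested, and keeps spots whose corrected value is at or below 0.001. The function name and the toy P values are hypothetical, and the caption does not describe the authors' actual scoring pipeline, so this shows the general correction only.

```python
def bonferroni_significant(p_values, alpha=0.001):
    """Return indices of tests that pass a Bonferroni-corrected threshold.

    Each raw P value is multiplied by the number of tests (capped at 1.0)
    and compared against `alpha`.
    """
    n_tests = len(p_values)
    return [i for i, p in enumerate(p_values)
            if min(p * n_tests, 1.0) <= alpha]


# Toy raw P values for five spots; not data from the study.
toy_p_values = [1e-8, 5e-4, 5e-5, 0.3, 1e-4]
bound_spots = bonferroni_significant(toy_p_values)
```

Because the correction scales with the number of spots, a raw P value that looks small on its own (here 5e-4) can still fail once all tests on the array are accounted for.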

References

    1. Bateman A., Coin L., Durbin R., Finn R.D., Hollich V., Griffiths-Jones S., Khanna A., Marshall M., Moxon S., Sonnhammer E.L., et al. The Pfam protein families database. Nucleic Acids Res. 2004;32:D138–D141.
    2. Berger M.F., Bulyk M.L. Protein binding microarrays (PBMs) for the rapid, high-throughput characterization of the sequence specificities of DNA binding proteins. In: Bina M., editor. Gene mapping, discovery, and expression. The Humana Press, Inc.; Totowa, NJ: 2006. pp. 245–260.
    3. Bulyk M.L., Gentalen E., Lockhart D.J., Church G.M. Quantifying DNA-protein interactions by double-stranded DNA arrays. Nat. Biotechnol. 1999;17:573.
    4. Bulyk M.L., Huang X., Choo Y., Church G.M. Exploring the DNA-binding specificities of zinc fingers with DNA microarrays. Proc. Natl. Acad. Sci. 2001;98:7158–7163.
    5. Butcher R.A., Bhullar B.S., Perlstein E.O., Marsischky G., LaBaer J., Schreiber S.L. Microarray-based method for monitoring yeast overexpression strains reveals small-molecule targets in TOR pathway. Nat. Chem. Biol. 2006;2:103–109.
