Approaching a complete repository of sequence-verified protein-encoding clones for Saccharomyces cerevisiae

PMID: 17322287
PMCID: PMC1832101
DOI: 10.1101/gr.6037607

Yanhui Hu et al. Genome Res. 2007 Apr.

Genome Res. 2007 Apr;17(4):536-43.

doi: 10.1101/gr.6037607. Epub 2007 Feb 23.

Affiliation

¹ Harvard Institute of Proteomics, Harvard Medical School, Cambridge, MA 02141, USA.

Abstract

The availability of an annotated genome sequence for the yeast Saccharomyces cerevisiae has made possible the proteome-scale study of protein function and protein-protein interactions. These studies rely on availability of cloned open reading frame (ORF) collections that can be used for cell-free or cell-based protein expression. Several yeast ORF collections are available, but their use and data interpretation can be hindered by reliance on now out-of-date annotations, the inflexible presence of N- or C-terminal tags, and/or the unknown presence of mutations introduced during the cloning process. High-throughput biochemical and genetic analyses would benefit from a "gold standard" (fully sequence-verified, high-quality) ORF collection, which allows for high confidence in and reproducibility of experimental results. Here, we describe Yeast FLEXGene, a S. cerevisiae protein-coding clone collection that covers over 5000 predicted protein-coding sequences. The clone set covers 87% of the current S. cerevisiae genome annotation and includes full sequencing of each ORF insert. Availability of this collection makes possible a wide variety of studies from purified proteins to mutation suppression analysis, which should contribute to a global understanding of yeast protein function.

Figures

**Figure 1.**
Workflow diagram of clone production. The entire process from the design of primers to production of clone stocks is shown for the four production phases. The process started by designing primers for every ORF in the genome. The primers were used to amplify the ORFs from the genome. Subsequent amplifications with universal primers generated ORF sequences flanked by recombinational cloning sites at either end and were monitored by a diagnostic gel. The product was cloned into a recombinational cloning vector via a BP clonase reaction or In-Fusion reaction. Competent bacterial strains were transformed with the reaction mix to yield colonies that were isolated robotically, cultured in liquid medium, and stored as 15% glycerol stocks.

**Figure 2.**
Size distribution of the yeast gene clone collection as compared with the sizes of the predicted protein-coding sequences as defined by the current annotation at SGD. The cloning success rate was more than 90% for genes smaller than 1 kb and about 85% for medium-sized ORFs (1–4 kb). For large ORFs (>4 kb), only 36% were cloned and accepted after sequence analysis.

**Figure 3.**
Western blots of 40 known or predicted yeast transcription factors. Representative Western blot analysis of 40 known or candidate yeast TFs. N-terminally GST-tagged proteins were overexpressed in and purified from *E. coli* in high-throughput, as described in Methods. Five microliters out of 60 µL total of each purified protein were analyzed by Western blots using anti-GST antibody. Serial dilutions of recombinant GST (Sigma) were included for estimation of protein concentrations. Bands of the correct size or the positions of the expected size are indicated by a dot on the *right* side of the band.

**Figure 4.**
Whole-genome yeast intergenic microarray bound by *S. cerevisiae* Rap1. (A) Close-up view of a portion of a microarray spotted with all yeast intergenic regions, bound by Rap1 overexpressed in and purified from *E. coli* in high-throughput. Fluorescence intensities are shown in false color, with white indicating saturated signal intensity, yellow indicating high signal intensity, green indicating moderate signal intensity, and blue indicating low signal intensity. (B) Sequence logos for Rap1 DNA-binding site motifs determined from genomic DNA-binding site identification experiments. We previously performed a set of triplicate PBM experiments using Rap1 expressed in and purified from *S. cerevisiae*, resulting in 293 intergenic regions bound with a Bonferroni-corrected P value of 0.001 (Mukherjee et al. 2004). Here, as the data on Rap1 overexpressed in and purified from *E. coli* were generated by a single PBM experiment, fewer spots (77) met our significance threshold for binding. The *top* two motifs were derived from the 77 and 293 most significantly bound spots in the PBM shown in A. The third motif from the *top* was derived from our previous set of triplicate PBMs using Rap1 purified from *S. cerevisiae* (Mukherjee et al. 2004). The motif at the *bottom* was derived from all intergenic regions bound in vivo in ChIP-chip (Lee et al. 2002). Motifs were generated using BioProspector (Liu et al. 2001) and exhibited the following group specificity scores (*top* to *bottom*): 1.3 × 10⁻⁹⁷, 2.4 × 10⁻²⁰⁷, 1.1 × 10⁻²²², and 8.7 × 10⁻⁹².
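
The Figure 4 caption refers to spots passing a Bonferroni-corrected P value threshold of 0.001. As a rough illustration of what such a threshold means, the sketch below applies a plain Bonferroni correction to a list of per-spot p-values. It is not the authors' analysis pipeline, which is not described in this record: the function name `bonferroni_significant` and the example p-values are hypothetical, and only the 0.001 cutoff comes from the caption.

```python
def bonferroni_significant(p_values, alpha=0.001):
    """Return indices of tests whose Bonferroni-corrected p-value is <= alpha.

    Bonferroni correction multiplies each raw p-value by the number of
    tests performed (capped at 1.0) before comparing it to alpha.
    """
    n_tests = len(p_values)
    hits = []
    for i, p in enumerate(p_values):
        corrected = min(1.0, p * n_tests)
        if corrected <= alpha:
            hits.append(i)
    return hits


# Hypothetical raw p-values for five microarray spots (illustration only).
spot_p_values = [1e-9, 5e-4, 3e-5, 0.04, 1e-6]
bound_spots = bonferroni_significant(spot_p_values)
print(bound_spots)  # [0, 2, 4]
```

Because the correction scales with the number of tests, an array carrying every yeast intergenic region imposes a strict per-spot cutoff, which fits the caption's remark that a single experiment yielded fewer significant spots (77) than the earlier triplicate set (293).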

References

1. Bateman A., Coin L., Durbin R., Finn R.D., Hollich V., Griffiths-Jones S., Khanna A., Marshall M., Moxon S., Sonnhammer E.L., et al. The Pfam protein families database. Nucleic Acids Res. 2004;32:D138–D141.
2. Berger M.F., Bulyk M.L. Protein binding microarrays (PBMs) for the rapid, high-throughput characterization of the sequence specificities of DNA binding proteins. In: Bina M., editor. Gene mapping, discovery, and expression. The Humana Press, Inc.; Totowa, NJ: 2006. pp. 245–260.
3. Bulyk M.L., Gentalen E., Lockhart D.J., Church G.M. Quantifying DNA-protein interactions by double-stranded DNA arrays. Nat. Biotechnol. 1999;17:573.
4. Bulyk M.L., Huang X., Choo Y., Church G.M. Exploring the DNA-binding specificities of zinc fingers with DNA microarrays. Proc. Natl. Acad. Sci. 2001;98:7158–7163.
5. Butcher R.A., Bhullar B.S., Perlstein E.O., Marsischky G., LaBaer J., Schreiber S.L. Microarray-based method for monitoring yeast overexpression strains reveals small-molecule targets in TOR pathway. Nat. Chem. Biol. 2006;2:103–109.
