Systematic recovery and analysis of full-ORF human cDNA clones
- PMID: 15489330
- PMCID: PMC528924
- DOI: 10.1101/gr.2473704
Abstract
The Mammalian Gene Collection (MGC) consortium (http://mgc.nci.nih.gov) seeks to establish publicly available collections of full-ORF cDNAs for several organisms of significance to biomedical research, including human. To date over 15,200 human cDNA clones containing full-length open reading frames (ORFs) have been identified via systematic expressed sequence tag (EST) analysis of a diverse set of cDNA libraries; however, further systematic EST analysis is no longer an efficient method for identifying new cDNAs. As part of our involvement in the MGC program, we have developed a scalable method for targeted recovery of cDNA clones to facilitate recovery of genes absent from the MGC collection. First, cDNA is synthesized from various RNAs, followed by polymerase chain reaction (PCR) amplification of transcripts in 96-well plates using gene-specific primer pairs flanking the ORFs. Amplicons are cloned into a sequencing vector, and full-length sequences are obtained. Sequences are processed and assembled using Phred and Phrap, and analyzed using Consed and a number of bioinformatics methods we have developed. Sequences are compared with the Reference Sequence (RefSeq) database, and validation of sequence discrepancies is attempted using other sequence databases including dbEST and dbSNP. Clones with identical sequence to RefSeq or containing only validated changes will become part of the MGC human gene collection. Clones containing novel splice variants or polymorphisms have also been identified. Our approach to clone recovery, applied at large scale, has the potential to recover many and possibly most of the genes absent from the MGC collection.
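The acceptance rule described in the abstract (a clone is kept if it is identical to RefSeq or contains only validated changes) can be illustrated with a small sketch. This is not the consortium's pipeline, which relies on Phred, Phrap, Consed, and in-house tools. It is a simplified illustration that assumes the clone and reference ORF sequences are already aligned to equal length, that only single-base substitutions occur, and that the set of validated changes (for example, ones supported by dbSNP or dbEST) is supplied by the caller. The function names and category labels are hypothetical.

```python
from typing import List, Set, Tuple

# A discrepancy is (1-based position, reference base, clone base).
# This representation is an assumption made for illustration only.
Discrepancy = Tuple[int, str, str]


def find_discrepancies(reference: str, clone: str) -> List[Discrepancy]:
    """List substitution differences between two pre-aligned, equal-length sequences."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be pre-aligned to equal length")
    ref, cln = reference.upper(), clone.upper()
    return [(i + 1, r, c) for i, (r, c) in enumerate(zip(ref, cln)) if r != c]


def classify_clone(reference: str, clone: str, validated: Set[Discrepancy]) -> str:
    """Assign a clone to one of three hypothetical categories.

    'identical'              : no differences from the reference
    'validated_changes_only' : every difference appears in `validated`
    'unvalidated_changes'    : at least one difference lacks support
    """
    diffs = find_discrepancies(reference, clone)
    if not diffs:
        return "identical"
    if all(d in validated for d in diffs):
        return "validated_changes_only"
    return "unvalidated_changes"


if __name__ == "__main__":
    refseq_orf = "ATGGCCTAA"       # toy reference ORF
    clone_orf = "ATGGCTTAA"        # toy clone with one substitution
    known = {(6, "C", "T")}        # toy set of validated changes
    print(find_discrepancies(refseq_orf, clone_orf))
    print(classify_clone(refseq_orf, clone_orf, known))
```

Under this sketch, clones in the first two categories would correspond to those the abstract says become part of the collection. Clones in the third category would need further review, for instance as candidate novel polymorphisms or splice variants.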