Proc Natl Acad Sci U S A. 2001 Oct 9;98(21):12103-8.
doi: 10.1073/pnas.201182798.

The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

A A Camargo et al. Proc Natl Acad Sci U S A.

Erratum in

  • Proc Natl Acad Sci U S A. 2004 Jan 6;101(1):414. Melo, M [corrected to Melo, MB]

Abstract

Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein-coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data corresponding to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, offers a direct route to transcript finishing that may be a useful alternative to full-length cDNA cloning.

Figures

Figure 1
Percentage of full-length transcripts with at least one sequence match (A) between the ORESTES sequences derived from 24 different tissues against 15,095 full-length mRNA sequences and (B) between the ORESTES sequences derived from breast tissue against full-length transcripts expressed in breast. (C) Comparison between the percentage of matches for ORESTES sequences and 5′ and 3′ ESTs derived from breast tissue.
Figure 2
Positional distribution of ORESTES sequences within full-length transcripts.
Figure 3
Percentage of coverage of full-length transcripts by ORESTES sequences derived from 24 different tissues (A) and by ORESTES sequences derived from breast tissue (B). (C) Comparison between the percentage of coverage by ORESTES sequences and 5′ and 3′ ESTs derived from breast tissue.
Figure 4
Schematic representation of the transcript finishing approach. The sequences of four full-length transcripts corresponding to human orthologues of the mouse enhancer of polycomb 1 (EPC1) and 2 (EPC2), Notch 2, and proliferation potential-related protein were obtained by using ORESTES data in combination with genomic sequences available through the Human Genome Project (HGP). Coding regions for each of the four transcripts are represented as hatched bars and ORESTES contigs as solid bars below the genes.
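
The two summary measures named in the captions of Figures 1 and 3 (percentage of transcripts with at least one match, and percentage of a transcript covered by matching sequences) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the interval representation (half-open start/end positions on the transcript), and the assumption that alignments have already been computed by some external tool are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: one alignment = half-open [start, end)
# interval in transcript coordinates, with start < end.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def percent_coverage(length: int, hits: List[Interval]) -> float:
    """Percentage of a transcript of the given length covered by any hit."""
    # Clip hits to the transcript and drop those lying entirely outside it.
    clipped = [(max(0, s), min(length, e)) for s, e in hits if e > 0 and s < length]
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return 100.0 * covered / length


def percent_matched(lengths: Dict[str, int], hits: Dict[str, List[Interval]]) -> float:
    """Percentage of transcripts that have at least one hit."""
    matched = sum(1 for name in lengths if hits.get(name))
    return 100.0 * matched / len(lengths)
```

Merging intervals before summing matters because overlapping tags would otherwise be counted twice, which would overstate how much of a transcript is represented by the scaffold of partial sequences.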

Comment in

  • Navigating the human transcriptome.
    Strausberg RL, Riggins GJ. Proc Natl Acad Sci U S A. 2001 Oct 9;98(21):11837-8. doi: 10.1073/pnas.221463598. PMID: 11592992. Review.
