Proc Natl Acad Sci U S A. 2001 Oct 9;98(21):12103-8.
doi: 10.1073/pnas.201182798.

The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

A A Camargo et al. Proc Natl Acad Sci U S A. 2001.

Erratum in

  • Proc Natl Acad Sci U S A. 2004 Jan 6;101(1):414. Melo, M [corrected to Melo, MB]

Abstract

Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription-PCR, offers a direct route to transcript finishing that may be a useful alternative to full-length cDNA cloning.

Figures

Figure 1
Percentage of full-length transcripts with at least one sequence match (A) between the ORESTES sequences derived from 24 different tissues and 15,095 full-length mRNA sequences and (B) between the ORESTES sequences derived from breast tissue and full-length transcripts expressed in breast. (C) Comparison of the percentage of matches for ORESTES sequences and for 5′ and 3′ ESTs derived from breast tissue.
Figure 2
Positional distribution of ORESTES sequences within full-length transcripts.
Figure 3
Percentage of coverage of full-length transcripts by ORESTES sequences derived from 24 different tissues (A) and by ORESTES sequences derived from breast tissue (B). (C) Comparison between the percentage of coverage by ORESTES sequences and 5′ and 3′ ESTs derived from breast tissue.
Figure 4
Schematic representation of the transcript finishing approach. The sequences of four full-length transcripts corresponding to human orthologues of the mouse enhancer of polycomb 1 (EPC1) and 2 (EPC2), Notch 2, and proliferation potential-related protein were obtained by using ORESTES data in combination with genomic sequences available through the Human Genome Project (HGP). Coding regions for each of the four transcripts are represented as hatched bars and ORESTES contigs as solid bars below the genes.
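
The two quantities plotted in Figures 1 and 3 (the percentage of full-length transcripts with at least one match, and the percentage of each transcript covered by tag sequences) can be illustrated with a short sketch. This is not the authors' pipeline: it assumes alignments have already been reduced to half-open (start, end) intervals in transcript coordinates, and the function names `percent_coverage` and `percent_with_match` are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: each alignment is a half-open interval [start, end)
# in transcript coordinates, produced by some upstream alignment step.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint ones."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def percent_coverage(transcript_length: int, alignments: Iterable[Interval]) -> float:
    """Percentage of transcript bases covered by at least one alignment."""
    if transcript_length <= 0:
        raise ValueError("transcript_length must be positive")
    # Clip alignments to the transcript and drop empty intervals.
    clipped = [(max(0, s), min(transcript_length, e)) for s, e in alignments]
    clipped = [(s, e) for s, e in clipped if s < e]
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return 100.0 * covered / transcript_length


def percent_with_match(alignments_by_transcript: Dict[str, List[Interval]]) -> float:
    """Percentage of transcripts that have at least one alignment."""
    if not alignments_by_transcript:
        raise ValueError("no transcripts given")
    matched = sum(1 for hits in alignments_by_transcript.values() if hits)
    return 100.0 * matched / len(alignments_by_transcript)
```

For example, a 1,000-base transcript with tags aligned at (100, 300), (250, 400), and (800, 900) has 400 covered bases after merging the two overlapping tags, so `percent_coverage` gives 40%. Merging before summing keeps bases covered by several tags from being counted more than once.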

Comment in

  • Navigating the human transcriptome.
    Strausberg RL, Riggins GJ. Proc Natl Acad Sci U S A. 2001 Oct 9;98(21):11837-8. doi: 10.1073/pnas.221463598. PMID: 11592992. Review.
