Proc Natl Acad Sci U S A. 2004 Aug 10;101(32):11701-6.
doi: 10.1073/pnas.0403514101. Epub 2004 Jul 22.

5' Long serial analysis of gene expression (LongSAGE) and 3' LongSAGE for transcriptome characterization and genome annotation

Chia-Lin Wei et al. Proc Natl Acad Sci U S A.

Abstract

Complete genome annotation relies on precise identification of transcription units bounded by a transcription initiation site (TIS) and a polyadenylation site (PAS). To facilitate this process, we developed a set of two complementary methods, 5' Long serial analysis of gene expression (5'LS) and 3'LS. These analyses are based on the original SAGE and LS methods coupled with full-length cDNA cloning, and enable the high-throughput extraction of the first and the last 20 bp of each transcript. We demonstrate that the mapping of 5'LS and 3'LS tags to the genome allows the localization of TIS and PAS. By using 537 tag pairs mapping to the region of known genes, we confirmed that >90% of the tag pairs were appropriately assigned to the first and last exons. Moreover, by using tag sequences as primers for RT-PCRs, we were able to recover putative full-length transcripts in 81% of the attempts. This large-scale generation of transcript terminal tags is at least 20-40 times more efficient than full-length cDNA cloning and sequencing in the identification of complete transcription units. The apparent precision and deep coverage make 5'LS and 3'LS an advanced approach for genome annotation through whole-transcriptome characterization.
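
As an illustration only, the idea of taking the first and last 20 bp of each transcript as terminal tags can be sketched computationally. This is a minimal in silico sketch, not the authors' wet-lab protocol: the function name `terminal_tags` and the constant `TAG_LENGTH` are hypothetical, and the input is assumed to be a full-length transcript sequence with the poly(A) tail already removed.

```python
from typing import Tuple

# Tag length taken from the abstract (first and last 20 bp of each transcript).
TAG_LENGTH = 20


def terminal_tags(transcript: str, tag_length: int = TAG_LENGTH) -> Tuple[str, str]:
    """Return (5' tag, 3' tag) for one transcript sequence.

    Hypothetical helper: assumes `transcript` is a full-length sequence
    written 5' to 3' with any poly(A) tail already trimmed.
    """
    seq = transcript.strip().upper()
    if len(seq) < tag_length:
        raise ValueError("transcript is shorter than the requested tag length")
    five_prime_tag = seq[:tag_length]    # marks the transcription initiation site
    three_prime_tag = seq[-tag_length:]  # marks the polyadenylation site
    return five_prime_tag, three_prime_tag


if __name__ == "__main__":
    example = "ACGT" * 10 + "TTTT"  # made-up 44-nt sequence, not real data
    print(terminal_tags(example))
```

In a real analysis the two tags would then be aligned to the genome assembly; the alignment step is outside the scope of this sketch.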

Figures

Fig. 1.
Schematic overview of the 5′LS and 3′LS methods for mapping TISs and PASs. (A) The first and last 20-bp nucleotides of full-length transcripts were extracted as 5′LS and 3′LS tags, respectively (see the detailed protocols in Supporting Text). (B) The 5′LS and 3′LS tags were concatenated and cloned as separate 5′LS and 3′LS libraries for sequencing analysis. (C) The 5′ and 3′ tags were concurrently mapped to the assembled genome sequences to define the TIS and PAS of transcripts and determine expression levels.
Fig. 2.
Mapping positions of 5′LS tags relative to TISs and 3′LS tags relative to PASs of RefSeq mRNA on genome sequences. The position of each 5′LS and 3′LS tag is indicated by the number of base pairs relative to the corresponding known RefSeq sequence. Negative numbers on the horizontal axis indicate that tags are either downstream of known TISs (for 5′LS tags), or upstream of known PASs (for 3′LS tags). Positive numbers indicate that tags are either upstream of known TISs (for 5′LS tags), or downstream of known PASs (for 3′LS tags). Values above each bar represent the number of tags within that particular range (in bp) in relation to known TISs and PASs.
Fig. 3.
Transcription units identified by paired-tag analysis. (A) Tag pair 5′LS822/3′LS7959 mapped closely to a predicted gene (ENSMUST62006.1) on chromosome 11. (B) Tag pair 5′LS2834/3′LS9655 identified a possible splice variant of eukaryotic translation initiation factor 3 subunit (AK076165) on chromosome 7. (C) Tag pair 5′LS2594 and 3′LS7006 identified a transcript of Tdgf1 teratocarcinoma-derived growth factor on chromosome 9. (Insets) RT-PCR validations of these putative transcript units; primary PCR products are to the left of secondary PCR products.
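
The sign convention described in the Fig. 2 caption can be made concrete with a short sketch. It assumes single-base genomic coordinates and a known strand for each RefSeq transcript; the function name `signed_offset` and the labels `"5LS"` and `"3LS"` are hypothetical and not taken from the paper.

```python
def signed_offset(tag_pos: int, ref_pos: int, strand: str, tag_type: str) -> int:
    """Signed distance (bp) of a tag from a known TIS or PAS, per the Fig. 2 caption.

    5'LS tags: positive = upstream of the known TIS, negative = downstream.
    3'LS tags: positive = downstream of the known PAS, negative = upstream.
    Coordinates and labels are illustrative assumptions.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if tag_type not in ("5LS", "3LS"):
        raise ValueError("tag_type must be '5LS' or '3LS'")
    # Distance measured in the direction of transcription:
    # positive means the tag lies downstream of the reference site.
    downstream = tag_pos - ref_pos if strand == "+" else ref_pos - tag_pos
    # For 5'LS tags the caption counts upstream as positive, so flip the sign.
    return -downstream if tag_type == "5LS" else downstream


if __name__ == "__main__":
    # A 5'LS tag 10 bp upstream of a plus-strand TIS at coordinate 100.
    print(signed_offset(90, 100, "+", "5LS"))
```

Offsets computed this way could be binned into ranges to produce a histogram of the kind the caption describes.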
