Genome Res. 2001 May;11(5):677-84.
doi: 10.1101/gr.gr-1640r.

Identification and characterization of the potential promoter regions of 1031 kinds of human genes

Y Suzuki et al. Genome Res. 2001 May.

Abstract

To understand the mechanism of transcriptional regulation, it is essential to identify and characterize the promoter, which is located proximal to the mRNA start site. To identify promoters in large volumes of genomic sequence, we used mRNA start sites determined by large-scale sequencing of cDNA libraries constructed by the "oligo-capping" method. We aligned the mRNA start sites with the genomic sequences and retrieved the adjacent sequences as potential promoter regions (PPRs) for 1031 genes. The PPR sequences were searched to determine the frequencies of major promoter elements. Among the 1031 PPRs, 329 (32%) contained TATA boxes, 872 (85%) contained initiators (Inr), 999 (97%) contained GC boxes, and 663 (64%) contained CAAT boxes. Furthermore, 493 (48%) PPRs were located in CpG islands. The frequency of CpG islands was reduced in TATA(+)/Inr(+) PPRs and in the PPRs of ubiquitously expressed genes. In the PPRs of the CGM2 gene, the DRA gene, and the TM30pl gene, which showed highly colon-specific expression patterns, the consensus sequences of E boxes were commonly observed. The PPRs were also useful for exploring promoter SNPs.
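The percentages quoted in the abstract can be checked against the raw counts with a few lines of arithmetic. The sketch below is only an illustration: the counts and the total of 1031 PPRs come from the abstract, while the names `counts`, `percent`, and `percentages` are hypothetical and rounding to the nearest whole percent is an assumption about how the figures were reported.

```python
# Counts reported in the abstract, each out of 1031 PPRs.
TOTAL_PPRS = 1031
counts = {
    "TATA box": 329,
    "Initiator (Inr)": 872,
    "GC box": 999,
    "CAAT box": 663,
    "CpG island": 493,
}


def percent(count, total=TOTAL_PPRS):
    """Return count/total as a percentage rounded to the nearest integer.

    Rounding to whole percent is an assumption, not stated in the source.
    """
    return round(100 * count / total)


# Hypothetical helper mapping: element name -> rounded percentage.
percentages = {name: percent(n) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {counts[name]}/{TOTAL_PPRS} = {pct}%")
```

Under that rounding assumption, the computed values agree with the 32%, 85%, 97%, 64%, and 48% given in the abstract.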

Figures

Figure 1
Schematic representation of the construction of oligo-capped cDNA libraries. The cap structure of the mRNA was replaced with the 5′ oligonucleotide by the oligo-capping method, which consists of three enzymatic reaction steps. Bacterial alkaline phosphatase (BAP) hydrolyzes the phosphate of the 5′ ends of truncated mRNAs whose cap structures have been degraded. Tobacco acid pyrophosphatase (TAP) removes the cap structure, leaving a phosphate at the 5′ end. T4 RNA ligase, which requires a phosphate at the 5′ end as its substrate, selectively ligates the 5′ oligonucleotide to the 5′ end that originally had the cap structure. Using oligo-capped mRNA, first-strand cDNA was synthesized with a dT adapter primer. After alkaline degradation of the RNA, first-strand cDNA was amplified by PCR, digested with the restriction enzyme SfiI, and cloned into a plasmid vector. For further details of the procedure, see Suzuki et al. (1997, 2000). RNA and DNA molecules are represented by dark gray bars, the 5′ oligonucleotide by light gray boxes, and PCR primers by broken bars. (Gppp) Cap structure, (p) phosphate, (OH) hydroxyl.
Figure 2
Functional classification of the potential promoter regions (PPRs). The promoters were classified according to which functional category the corresponding genes had been assigned in Human Info Base (HIB; http://www.mips.biochem.mpg.de/proj/human/selec_view.html). The functional categories and population of the genes belonging to each category are shown.
Figure 3
Population of the TATA+/Inr+, TATA+/Inr−, TATA−/Inr+, and TATA−/Inr− potential promoter regions (PPRs) located in/outside of CpG islands. Solid bars represent population of PPRs located in CpG islands; shaded bars represent those outside of CpG islands in each category.
Figure 4
(A) Expression profiles of chitinase 3-like 1 (GeneRank no. 1), lipoamide beta (GeneRank no. 35), SUPT4H1 (GeneRank no. 316), and mitochondrial-processing peptidase beta (GeneRank no. 350) observed by iAFLP. Vertical axes represent the relative expression level; horizontal axes represent the tissue distributions. Expression levels were scaled so that the values total 30. (B) Populations of tissue-specific, ubiquitous, and middle genes located in/outside of CpG islands. Solid bars represent population of potential promoter regions (PPRs) located in CpG islands; shaded bars represent PPRs outside of CpG islands.
Figure 5
Potential promoter structures and expression profiles of the CGM2 gene, the DRA gene, and the TM30pl gene are shown. The promoter structures were predicted from corresponding potential promoter region (PPR) sequences using TFBIND. Previously reported promoter structures of the CEA gene and the BGP gene are shown at top. Consensus sequence of E box and the sequences of predicted E boxes are shown to the right of the promoter structures. The nucleotides that match the consensus sequence are underlined. Each position of the predicted E box is also shown. The expression profile observed by iAFLP is shown at right for each gene (for more details, see http://bodymap.ims.u-tokyo.ac.jp).
Figure 6
Identification of a SNP in the potential promoter region (PPR) of CYB5RP. The SNP position is shown by a bold letter Y (C or T) and a box. The DDBJ/EMBL/GenBank accession number of the corresponding SNP in dbSNP is shown at left. Consensus sequences of TF-binding sites predicted by TFBIND are also shown by boxes.
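The selection logic of the oligo-capping method described in the Figure 1 caption can be illustrated with a toy model. This is a minimal sketch, not the authors' protocol: it only tracks the chemical group at the 5′ end of each molecule (cap, phosphate, or hydroxyl) through the three enzymatic steps in the order the caption gives them. The class `Transcript`, the functions `bap`, `tap`, `ligate`, and `oligo_cap`, and the example molecule names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Toy mRNA: only the state of its 5' end is modeled."""
    name: str
    five_prime: str  # one of "cap", "p" (phosphate), "OH" (hydroxyl)
    oligo_ligated: bool = False


def bap(t):
    # BAP step: an exposed 5' phosphate (truncated mRNA) becomes a hydroxyl.
    # A capped 5' end is left unchanged.
    if t.five_prime == "p":
        t.five_prime = "OH"
    return t


def tap(t):
    # TAP step: the cap is removed, leaving a phosphate at the 5' end.
    if t.five_prime == "cap":
        t.five_prime = "p"
    return t


def ligate(t):
    # T4 RNA ligase step: the 5' oligonucleotide is joined only to
    # molecules that carry a 5' phosphate.
    if t.five_prime == "p":
        t.five_prime = "oligo"
        t.oligo_ligated = True
    return t


def oligo_cap(transcripts):
    # Apply the three steps in the order given in the caption.
    return [ligate(tap(bap(t))) for t in transcripts]


pool = [
    Transcript("full_length", "cap"),
    Transcript("truncated_phosphate", "p"),
    Transcript("truncated_hydroxyl", "OH"),
]
tagged = [t.name for t in oligo_cap(pool) if t.oligo_ligated]
print(tagged)  # ['full_length']
```

In this model only the molecule that started with a cap ends up carrying the 5′ oligonucleotide, which is why the 5′ ends of the resulting cDNA clones can be read as mRNA start sites. The order of the steps matters: if the TAP step ran before the BAP step, the phosphate exposed by cap removal would also be hydrolyzed.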
