Genome Res. 2004 Oct;14(10A):1821-31.
doi: 10.1101/gr.2730004. Epub 2004 Sep 13.

Pattern of sequence variation across 213 environmental response genes

Robert J Livingston et al. Genome Res. 2004 Oct.

Abstract

To promote the clinical and epidemiological studies that improve our understanding of human genetic susceptibility to environmental exposure, the Environmental Genome Project (EGP) has scanned 213 environmental response genes involved in DNA repair, cell cycle regulation, apoptosis, and metabolism for single nucleotide polymorphisms (SNPs). Many of these genes have been implicated by loss-of-function mutations associated with severe diseases attributable to decreased protection of genomic integrity. Therefore, the hypothesis for these studies is that individuals with functionally significant polymorphisms within these genes may be particularly susceptible to genotoxic environmental agents. On average, 20.4 kb of baseline genomic sequence, or 86% of each gene, including a substantial fraction of intronic sequence, all exons, and 1.3 kb upstream and downstream, was scanned for variations in the 90 samples of the Polymorphism Discovery Resource panel. The average nucleotide diversity across the 4.2 Mb of these 213 genes is 6.7 × 10⁻⁴, or one SNP every 1500 bp, when two random chromosomes are compared. The average candidate environmental response gene contains 26 PHASE-inferred haplotypes, 34 common SNPs, 6.2 coding SNPs (cSNPs), and 2.5 nonsynonymous cSNPs. SIFT and PolyPhen analysis of 541 nonsynonymous cSNPs identified 57 potentially deleterious SNPs. An additional eight polymorphisms predict altered protein translation. Because these genes represent 1% of all known human genes, extrapolation from these data predicts the total genomic set of cSNPs, nonsynonymous cSNPs, and potentially deleterious nonsynonymous cSNPs. The implications for the use of these data in direct and indirect association studies of environmentally induced diseases are discussed.
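The arithmetic behind two statements in the abstract can be sketched as follows. This is only an illustration built from the figures quoted above: the function names are hypothetical, the extrapolated totals are not values reported by the authors, and the scaling assumes the 213 genes are representative of all genes.

```python
# Figures quoted in the abstract.
PI = 6.7e-4              # average nucleotide diversity per site
GENES_SCANNED = 213
GENE_FRACTION = 0.01     # "these genes represent 1% of all known human genes"
CSNPS_PER_GENE = 6.2
NONSYN_PER_GENE = 2.5
NONSYN_ANALYZED = 541
POTENTIALLY_DELETERIOUS = 57


def bp_per_difference(pi: float) -> float:
    """Expected spacing in bp between differences when two chromosomes are compared."""
    return 1.0 / pi


def scale_to_all_genes(per_gene: float, genes: int, fraction: float) -> float:
    """Scale a per-gene average to all genes (assumes a representative sample)."""
    return per_gene * (genes / fraction)


spacing = bp_per_difference(PI)
total_genes = GENES_SCANNED / GENE_FRACTION
est_csnps = scale_to_all_genes(CSNPS_PER_GENE, GENES_SCANNED, GENE_FRACTION)
est_nonsyn = scale_to_all_genes(NONSYN_PER_GENE, GENES_SCANNED, GENE_FRACTION)
deleterious_fraction = POTENTIALLY_DELETERIOUS / NONSYN_ANALYZED
est_deleterious = est_nonsyn * deleterious_fraction

print(f"one difference every {spacing:.0f} bp")
print(f"implied gene count: {total_genes:.0f}")
print(f"estimated cSNPs: {est_csnps:.0f}")
print(f"estimated nonsynonymous cSNPs: {est_nonsyn:.0f}")
print(f"estimated potentially deleterious cSNPs: {est_deleterious:.0f}")
```

The reciprocal of π gives a spacing just under 1500 bp, which matches the rounded figure in the abstract. The genome-wide totals depend entirely on the 1% assumption and should be read as order-of-magnitude estimates.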

Figures

Figure 1
Rank order of nucleotide diversity (π) in conserved noncoding, exon, intron, CDS, and 5′ flanking regions of the EGP genes. (A) Values are calculated for all 213 EGP genes for regions containing 1 kb 5′ of the transcription initiation site (green, 174 kb), exons (red, 483 kb), coding sequence (blue, 312 kb), and introns (purple, 3.26 Mb). (B) π values are calculated for 115 EGP genes that had significant amounts of intronic conserved noncoding sequence (as described in Methods) for regions containing 1 kb 5′ of the transcription initiation site (green, 91 kb), coding sequence (blue, 189 kb) and intronic conserved noncoding (orange, 170 kb), and randomly sampled intron (purple, 2.39 Mb). All categories were sorted independently before plotting. The randomly sampled intron category corresponds to the mean π value per gene obtained by 10,000 independent samplings of the same size range as the intronic conserved noncoding region for that gene. The average SEM for this sampling was 9.3E-06 with a minimum and maximum SEM of 3.12E-07 and 3.07E-05, respectively.
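The randomly sampled intron category described in this caption can be illustrated with a small resampling sketch. It is not the authors' code: the function name, the use of contiguous windows, the per-site diversity input, and the synthetic intron are all assumptions made for illustration. Only the idea of 10,000 independent samplings summarized by a mean and an SEM comes from the caption.

```python
import math
import random
import statistics


def sampled_intron_pi(site_pi, window, n_samples=10_000, seed=0):
    """Mean and SEM of per-site diversity over randomly placed intron windows.

    site_pi: per-site diversity contributions along one gene's intron sequence.
    window:  window length in bp (e.g. the size of the conserved noncoding region).
    """
    if not 0 < window <= len(site_pi):
        raise ValueError("window must fit inside the intron")
    rng = random.Random(seed)
    # Prefix sums make each window mean a constant-time lookup.
    prefix = [0.0]
    for value in site_pi:
        prefix.append(prefix[-1] + value)
    last_start = len(site_pi) - window
    means = []
    for _ in range(n_samples):
        start = rng.randint(0, last_start)  # inclusive on both ends
        means.append((prefix[start + window] - prefix[start]) / window)
    mean = statistics.fmean(means)
    sem = statistics.stdev(means) / math.sqrt(n_samples)
    return mean, sem


# Synthetic 20 kb intron (hypothetical data): about 0.2% of sites are polymorphic,
# each contributing a heterozygosity of 0.3.
data_rng = random.Random(1)
intron = [0.3 if data_rng.random() < 0.002 else 0.0 for _ in range(20_000)]
mean_pi, sem_pi = sampled_intron_pi(intron, window=1_500)
print(f"mean pi = {mean_pi:.2e}, SEM = {sem_pi:.2e}")
```

Here the SEM is taken as the sample standard deviation of the window means divided by the square root of the number of samplings. Whether the original analysis used contiguous windows or this SEM definition is not stated in the caption.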
Figure 2
A GeneSNPs view of E2F2 (E2F transcription factor 2). E2F2 is representative of an average gene scanned for polymorphism discovery in terms of its size and nucleotide diversity. E2F2 has seven exons (depicted by light blue rectangles for coding and green for untranslated [UTR] sequences in the mRNA). For this gene, 24 kb was scanned for polymorphisms, including sequences 5′ to the first exon (∼1.7 kb) and 3′ of the last exon (∼1 kb), by amplifying 90 DNA samples using 34 overlapping amplicons (horizontal yellow bars above the gene structure). Vertical descending lines indicate the positions of the SNPs identified in this sequence. The length of each vertical line represents the frequency of the minor allele, and the color indicates whether the SNP is located in flanking (black), intronic (brown), synonymous (yellow), nonsynonymous (red), or UTR (green) sequence.
Figure 3
Conserved noncoding regions identified by Trafac for Cyclin D1 (CCND1). (A) The regulogram depicts shared cis-elements between human and mouse sequences in the context of their sequence similarity. By identifying conserved mouse-human regions with consensus cis-regulatory elements and mapping noncoding SNPs, Trafac can be used to predict the potential adverse effects of polymorphisms on the regulation of gene expression. Mouse and human sequences are represented as horizontal bars at the top and bottom of the upper pane. The red-colored segments on these bars represent exons 1 through 3. The green-colored bars represent repeat elements. The frequencies of individual binding sites occurring in each of the sequences separately are shown as two running graphs in the top half of the pane. The percentage of sequence similarity, as determined by the BLASTZ algorithm, and the number of transcription factor-binding sites (TF BS) are represented as two separate line graphs in the lower pane. Two cis-element dense regions within the highly conserved promoter and first intronic regions are depicted as two peaks with a high hit count (shared cis-elements), indicated by the arrows. The promoter (B) and first intronic regions (C) of human and mouse CCND1 reveal a strong conservation of consensus TF-binding sites in approximately the same order of occurrence. The two gray vertical bars represent the CCND1 orthologs. The TF BS occurring in both genes are highlighted as various colored bars drawn across the two genes. The SNPs identified in the promoter (rs3212862) and first intron (rs3212863) are indicated.
Figure 4
The relationship between nucleotide diversity and the number of common SNPs (A) or the number of inferred haplotypes (B) for 213 environmental response genes.
Figure 5
Examples of linkage disequilibrium, as measured by r², in candidate environmental response genes. (A) BNIP1 exhibits average LD, (B) CCND2 exhibits low LD, and (C) BRCA1 exhibits strong LD. The top portion of each graphic illustrates the visual genotypes for each gene, in which each column represents a site (blue indicates common homozygote; yellow, rare homozygote; red, heterozygote; and gray, missing data) and each row represents an individual from the PDR. The bottom portion of each graphic is the LD plot for each gene, measured by r², and depicted on a rainbow scale (white indicates weak LD; red, strong LD).
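For readers unfamiliar with the r² statistic plotted in this figure, a minimal sketch of the standard definition for two biallelic sites is given here. It assumes phased haplotypes coded as 0/1 alleles, which is an assumption of this example. The function name and toy data are hypothetical, and the caption does not say how the authors computed r².

```python
def r_squared(haplotypes, i, j):
    """Squared allelic correlation between sites i and j.

    haplotypes: sequence of phased haplotypes, each a sequence of 0/1 alleles.
    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n           # frequency of allele 1 at site i
    p_b = sum(h[j] for h in haplotypes) / n           # frequency of allele 1 at site j
    p_ab = sum(h[i] * h[j] for h in haplotypes) / n   # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy data: sites 0 and 1 always travel together; site 2 is independent of site 0.
haps = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
print(r_squared(haps, 0, 1))  # complete LD
print(r_squared(haps, 0, 2))  # no LD
```

An r² near 1 means one site predicts the other almost perfectly, which corresponds to the red end of the rainbow scale in the LD plots. Values near 0 correspond to the white end.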

WEB SITE REFERENCES

    1. http://locus.umdnj.edu/nigms/products/pdr.html; Coriell Institute.
    2. http://www.niehs.nih.gov/envgenom/home.htm; Environmental Genome Project.
    3. http://www.genome.utah.edu/genesnps; GeneSNPs.
    4. http://www.genomatix.de; Genomatix.
    5. http://www.ncbi.nlm.nih.gov/LocusLink; NCBI LocusLink.
