Comparative Study
Mol Biol Evol. 2018 Aug 1;35(8):1958-1967.
doi: 10.1093/molbev/msy099.

Analysis of Genetic Variation Indicates DNA Shape Involvement in Purifying Selection

Xiaofei Wang et al. Mol Biol Evol.

Abstract

Noncoding DNA sequences, which play various roles in gene expression and regulation, are under evolutionary pressure. Gene regulation requires specific protein-DNA binding events, and our previous studies showed that both DNA sequence and shape readout are employed by transcription factors (TFs) to achieve DNA binding specificity. By investigating the shape-disrupting properties of single nucleotide polymorphisms (SNPs) in human regulatory regions, we established a link between disruptive local DNA shape changes and loss of specific TF binding. Furthermore, we described cases where disease-associated SNPs may alter TF binding through DNA shape changes. This link led us to hypothesize that local DNA shape within and around TF binding sites is under selection pressure. To verify this hypothesis, we analyzed SNP data derived from 216 natural strains of Drosophila melanogaster. Comparing SNPs located in functional and nonfunctional regions within experimentally validated cis-regulatory modules (CRMs) from D. melanogaster that are active in the blastoderm stage of development, we found that SNPs within functional regions tended to cause smaller DNA shape variations. Furthermore, SNPs with higher minor allele frequency were more likely to result in smaller DNA shape variations. The same analysis based on a large number of SNPs in putative CRMs of the D. melanogaster genome derived from DNase I accessibility data confirmed these observations. Taken together, our results indicate that common SNPs in functional regions tend to maintain DNA shape, whereas shape-disrupting SNPs are more likely to be eliminated through purifying selection.
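
The minor allele frequency (MAF) grouping mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 0.1 threshold, and the sample values are hypothetical, and the abstract does not state which cutoff separated high-MAF from low-MAF SNPs.

```python
from statistics import median


def minor_allele_frequency(ref_count, alt_count):
    """Return the MAF of a biallelic SNP from per-allele strain counts."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no genotyped strains for this SNP")
    return min(ref_count, alt_count) / total


def split_by_maf(snps, threshold):
    """Split (maf, shape_variation) pairs into high-MAF and low-MAF groups.

    `threshold` is an assumed cutoff; the source does not give one.
    """
    high = [variation for maf, variation in snps if maf >= threshold]
    low = [variation for maf, variation in snps if maf < threshold]
    return high, low


# Made-up example: three SNPs genotyped across 216 strains,
# each paired with a hypothetical shape-variation value.
snps = [
    (minor_allele_frequency(150, 66), 0.15),
    (minor_allele_frequency(200, 16), 0.90),
    (minor_allele_frequency(120, 96), 0.20),
]
high, low = split_by_maf(snps, threshold=0.1)
print("median variation, high MAF:", median(high))
print("median variation, low MAF:", median(low))
```

Comparing summary statistics of the two groups, as in the last two lines, is only a first look; a real analysis would compare the full distributions with an appropriate statistical test.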

Figures

Fig. 1.
Pipeline for evaluation of SNP effects on DNA shape. (A) Human SNPs derived from DNase-seq data were divided into three groups according to their effect on DNA accessibility: 1) strongly imbalanced SNPs, 2) weakly imbalanced SNPs, and 3) SNPs without imbalance. Drosophila SNPs, called from 216 natural strains of D. melanogaster and located within blastoderm stage-active CRMs, were divided into two groups, a) SNPs in functional regions and b) SNPs in nonfunctional regions, based on the criteria shown in the center. The same analysis was repeated for a larger number of Drosophila SNPs in putative CRMs. (B) Example calculation of DNA shape variation for one SNP. A single-nucleotide variant at position 0 results in DNA shape changes at the five nucleotide positions centered on the variant. First, vectors of minor groove width (MGW) were predicted for each allele using DNAshape (Zhou et al. 2013). The Euclidean distance between the MGW vectors of the two alleles was then calculated as the DNA shape variation of the SNP (see Materials and Methods).
Fig. 2.
Local effect of SNPs on MGW profiles. Effects of SNPs on their surrounding MGW patterns varied with allele type and local DNA context. MGW patterns of the local DNA region for the two alternate alleles of each SNP were plotted in blue and red, respectively. At one extreme of the spectrum were SNPs (A and B) that had very small effects on local MGW. At the other extreme were SNPs (C and D) that completely disrupted the local MGW geometry. Between these two extremes were SNPs that led to an intermediate extent of variation in local DNA shape while potentially still affecting TF binding (E and F).
Fig. 3.
Distribution of MGW changes for strongly imbalanced SNPs, weakly imbalanced SNPs, and SNPs without imbalance in human. Distributions of ΔMGW values for imbalanced SNPs (red and green plots) were shifted rightward compared with SNPs without imbalance (blue plot). The more imbalanced the SNPs were, the larger the ΔMGW or change in DNA shape was. Asterisks are color-coded to indicate the SNP distributions being compared. Sample sizes for all groups are listed in the legend.
Fig. 4.
DNA shape variation caused by disease-associated SNPs. (A) DNA shape variation caused by SNP rs339331 in the HOXB13 binding site. HOXB13 prefers binding to a narrower MGW induced by risk allele T. (B) DNA shape variation caused by SNP rs6893009 in the PU.1 binding site. The SNP caused large variance in MGW, which was previously reported to be a predominant structural determinant of PU.1 binding. DNA shape variation caused by (C) SNP rs445 in the c-MYB binding site, (D) a SNP in the GATA3 binding site, (E) SNP rs909116 in the ER-α binding site, and (F) SNP rs6983267 in the TCF7L2 binding site.
Fig. 5.
Distributions of MGW changes for Drosophila SNPs in experimentally validated CRMs at different locations and with different MAFs. (A) Distribution of ΔMGW values for SNPs in functional and nonfunctional regions (see Materials and Methods for definition) using the DNAshape-derived MGW. Compared with the distribution for functional regions (red plot), the distribution for nonfunctional regions (blue plot) was significantly shifted rightward, indicating that SNPs induced greater changes in MGW in nonfunctional than in functional regions. (B) Distribution of ΔMGW values for SNPs in functional and nonfunctional regions, using one of the shuffled MGW predictions. Using arbitrarily shuffled MGW, no signal of purifying selection emerged. (C) Distribution of ΔMGW values for SNPs with high and low MAF in functional regions. The distribution of ΔMGW values for low MAF was significantly shifted rightward. (D) Distribution of ΔMGW values for SNPs with high and low MAF in nonfunctional regions. The distributions of these two groups exhibited no significant difference. Sample sizes for all groups are listed in the legends.
Fig. 6.
Distributions of MGW changes for Drosophila SNPs in putative CRMs at different locations and with different MAFs. (A) Distribution of ΔMGW values for SNPs in functional and nonfunctional regions (see Materials and Methods for definition) using the DNAshape-derived MGW. Compared with the distribution for functional regions (red plot), the distribution for nonfunctional regions (blue plot) was significantly shifted rightward, indicating that SNPs induced greater changes in MGW in nonfunctional than in functional regions. (B) Distribution of ΔMGW values for SNPs in functional and nonfunctional regions, using one of the shuffled MGW predictions. Using arbitrarily shuffled MGW, no signal of purifying selection emerged. (C) Distribution of ΔMGW values for SNPs with high and low MAF in functional regions. The distribution of ΔMGW values for low MAF was significantly shifted rightward. (D) Distribution of ΔMGW values for SNPs with high and low MAF in nonfunctional regions. The distributions of these two groups exhibited no significant difference. Sample sizes for all groups are listed in the legends.
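
The shape-variation calculation described in the Fig. 1B caption (Euclidean distance between the MGW vectors of two alleles over the five positions centered on the SNP) can be sketched as follows. This is an illustration under stated assumptions: the function name `shape_variation` and the MGW values are made up, and the MGW prediction step itself (performed with DNAshape in the paper) is not reproduced here.

```python
import math


def shape_variation(mgw_allele_a, mgw_allele_b):
    """Euclidean distance between two equal-length MGW vectors."""
    if len(mgw_allele_a) != len(mgw_allele_b):
        raise ValueError("MGW vectors must cover the same positions")
    squared_diffs = ((a - b) ** 2 for a, b in zip(mgw_allele_a, mgw_allele_b))
    return math.sqrt(sum(squared_diffs))


# Hypothetical MGW values (in angstroms) at positions -2..+2 around a SNP.
# In practice these would come from a shape-prediction tool.
mgw_a = [5.1, 4.8, 4.2, 4.9, 5.3]
mgw_b = [5.1, 5.0, 5.4, 5.0, 5.3]

delta_mgw = shape_variation(mgw_a, mgw_b)
print(f"shape variation: {delta_mgw:.3f}")
```

A single scalar per SNP, as returned here, is what makes it possible to compare whole groups of SNPs by plotting the distribution of their values.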
