DNA Repair (Amst). 2013 Sep;12(9):733-40.
doi: 10.1016/j.dnarep.2013.06.001. Epub 2013 Jul 5.

Hypothesis driven single nucleotide polymorphism search (HyDn-SNP-S)

Rebecca J Swett et al. DNA Repair (Amst). 2013 Sep.

Erratum in

  • DNA Repair (Amst). 2014 Mar;15:60

Abstract

The advent of complete-genome genotyping across phenotype cohorts has provided a rich source of information for bioinformaticians. However, the search for SNPs in these data is generally performed on a study-by-study basis, without any specific hypothesis about the location of SNPs that are predictive of the phenotype. We have designed a method whereby very large SNP lists (several gigabytes in size), combining several genotyping studies at once, can be sorted and traced back to their ultimate consequence in protein structure. Given a working hypothesis, researchers can easily search whole-genome genotyping data for SNPs that link genetic locations to phenotypes. This allows a targeted search for correlations between phenotypes and potentially relevant systems, rather than relying on statistical methods alone. HyDn-SNP-S returns results that are less data dense, allowing more thorough analysis, including haplotype analysis. We have applied our method to correlate DNA polymerases to cancer phenotypes using four of the available cancer databases in dbGaP. Logistic regression and derived haplotype analysis indicate that ~80 SNPs, previously overlooked, are statistically significant. Derived haplotypes from this work link POLL to breast cancer and POLG to prostate cancer, with an increase in incidence of 3.01- and 9.6-fold, respectively. Molecular dynamics simulations on the wild type and one of the SNP mutants from the POLL haplotype provide insights at the atomic level into the functional impact of this cancer-related SNP. Furthermore, HyDn-SNP-S has been designed to allow application to any system. The program is available upon request from the authors.
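The hypothesis-driven filtering step described above, keeping only SNPs that fall inside genes of interest, can be illustrated with a minimal Python sketch. This is not the authors' program: the record layouts (`Gene`, `Snp`), the function name `filter_snps`, the chromosome labels, the coordinates, and the SNP identifiers are all hypothetical placeholders chosen for illustration. The function works on an iterable so that a multi-gigabyte SNP list could be streamed rather than loaded into memory.

```python
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    study: str


def filter_snps(snps: Iterable[Snp], genes: Iterable[Gene]) -> Iterator[Tuple[Snp, str]]:
    """Yield (snp, gene_name) for each SNP located inside a gene of interest."""
    # Group genes by chromosome so each SNP is only compared to nearby candidates.
    by_chrom: Dict[str, List[Gene]] = {}
    for gene in genes:
        by_chrom.setdefault(gene.chrom, []).append(gene)

    for snp in snps:
        for gene in by_chrom.get(snp.chrom, ()):
            if gene.start <= snp.pos <= gene.end:
                yield snp, gene.name
                break  # report each SNP once


# Placeholder data: gene names come from the abstract, but the chromosome
# labels, coordinates, and SNP identifiers are invented, not real positions.
genes = [Gene("POLL", "chrA", 1000, 2000), Gene("POLG", "chrB", 500, 900)]
snps = [
    Snp("rs_demo1", "chrA", 1500, "study1"),
    Snp("rs_demo2", "chrA", 2500, "study1"),
    Snp("rs_demo3", "chrB", 500, "study2"),
]
hits = list(filter_snps(snps, genes))
```

Because the hypothesis narrows the search to a handful of gene intervals, a linear scan per chromosome is enough in this sketch; an interval tree would be a reasonable substitute for a much larger gene list.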

Keywords: Biomarkers; Cancer; DNA polymerases; Molecular dynamics; SNP search.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Fig. 1
Flowchart of the HyDn-SNPs method. Upon development of a hypothesis, researchers select GWAS studies with relevant phenotypes and obtain the locations of the genes of interest. Following application of the algorithm, SNPs can be separated into intronic and exonic groups. Further analysis can be performed by in vitro validation or computational studies.
Fig. 2
Edge-node network of the HyDn-SNPs results. Phenotypes and polymerases are shown as nodes; edges are weighted by the total number of SNPs connecting each phenotype to each polymerase. This is also available as an interactive map at http://www.chem.wayne.edu/cisnerosgroup/gexf-js2/index2.html.
Fig. 3
(A) Overlay of Polλ in the binary and ternary conformations. DNA is shown in light blue, and the Loop 1 is shown in purple. (B) Differences in Loop 1 orientation between the two conformations. Distance between position 438 and Loop 1 following an interpolation between the two structures at its furthest (Panel D) and closest (Panel C) approaches. (For interpretation of the references to color in this artwork, the reader is referred to the web version of the article.)
Fig. 4
Correlation difference plots for the binary (A) and ternary (B) conformations relative to the wild type. Increases in correlation are shown in orange, while increases in anti-correlated motions are shown in blue. In both cases, alterations in the correlation plots are visible, more notably in the ternary complex. The highest values from the ternary complex correlation plots were mapped back to the residues affected, and are colored orange in Panel C. Notably, many of these residues are on Loop 1. Panel D shows the individual correlation values for each of the residues in Loop 1. While the binary complex shows moderate alteration for several residues, the ternary complex shows considerable differences for several residues, particularly between residues 469 and 472. (For interpretation of the references to color in this artwork, the reader is referred to the web version of the article.)
Fig. 5
GMD plots for the binary and ternary complex simulations in both wild type and mutant form. No drastic differences are apparent between the four simulations indicating that all four are showing the same general level of physical activity. This indicates that the overall motions of the polymerase are not perturbed. In light of the data presented in Fig. 4, this indicates that the significant alterations in conformational space are restricted to the Loop 1 region.
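
The edge weighting described for Fig. 2, where each phenotype-polymerase edge carries the total number of SNPs connecting the two, can be sketched as a simple count over (phenotype, polymerase) pairs. This is an illustrative assumption about how such weights could be computed, not the authors' code; the function name `edge_weights` and the example records are hypothetical, and the counts below are not results from the paper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def edge_weights(links: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Count SNPs per (phenotype, polymerase) pair; each input item is one SNP hit."""
    return dict(Counter(links))


# Hypothetical SNP hits, one tuple per SNP: (phenotype, polymerase).
example_links = [
    ("breast cancer", "POLL"),
    ("breast cancer", "POLL"),
    ("prostate cancer", "POLG"),
]
weights = edge_weights(example_links)
```

Each key in `weights` corresponds to one edge in the network and its value to the edge weight, so the dictionary could be passed on to a graph library or exported for an interactive viewer.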
