Review
Brief Bioinform. 2012 Jul;13(4):495-512.
doi: 10.1093/bib/bbr070. Epub 2012 Jan 13.

Bioinformatics for personal genome interpretation

Emidio Capriotti et al. Brief Bioinform. 2012 Jul.

Abstract

An international consortium released the first draft sequence of the human genome 10 years ago. Although the analysis of this data has suggested the genetic underpinnings of many diseases, we have not yet been able to fully quantify the relationship between genotype and phenotype. Thus, a major current effort of the scientific community focuses on evaluating individual predispositions to specific phenotypic traits given their genetic backgrounds. Many resources aim to identify and annotate the specific genes responsible for the observed phenotypes. Some of these use intra-species genetic variability as a means for better understanding this relationship. In addition, several online resources are now dedicated to collecting single nucleotide variants and other types of variants, and annotating their functional effects and associations with phenotypic traits. This information has enabled researchers to develop bioinformatics tools to analyze the rapidly increasing amount of newly extracted variation data and to predict the effect of uncharacterized variants. In this work, we review the most important developments in the field: the databases and bioinformatics tools that will be of utmost importance in our concerted effort to interpret the human variome.

Figures

Figure 1:
Growth in the number of genetic variations in dbSNP and SwissVar. RefSNPs shows the number of position-based clusters of variants from dbSNP [2]. Disease and Annotated show the numbers of disease-related and total annotated (either disease-related or neutral) nonsynonymous SNVs from the SwissVar database [3].
Figure 2:
Distribution of the frequencies of wild-type (A) and mutant (B) residues, difference between the frequencies of wild-type and mutant residues (C) and Conservation Index [151] (D) for disease-related and neutral nsSNVs. Black and white bars show the distributions for disease-related and neutral nonsynonymous variants, respectively, for a set of 54 347 nsSNVs extracted from the SwissVar database (October 2009). The data set was composed of 20 089 disease-related and 34 258 neutral mutations from 11 657 proteins. Sequence profiles were calculated from one run of the BLAST algorithm [152] over the UniRef90 database [153], selecting only sequences with E-values lower than 10^-9.
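
The quantities in panels A to C of Figure 2 can be illustrated with a minimal Python sketch. It assumes the BLAST hits have already been aligned into equal-length strings with "-" for gaps; the function names and the toy alignment are hypothetical, and the Conservation Index of panel D [151] is not reproduced here.

```python
from collections import Counter


def column_frequencies(aligned_seqs, pos):
    """Relative frequency of each residue at alignment column `pos`, ignoring gaps."""
    column = [seq[pos] for seq in aligned_seqs if seq[pos] != "-"]
    counts = Counter(column)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


def variant_features(aligned_seqs, pos, wild_type, mutant):
    """Return (wild-type frequency, mutant frequency, their difference) at `pos`."""
    freqs = column_frequencies(aligned_seqs, pos)
    f_wt = freqs.get(wild_type, 0.0)
    f_mut = freqs.get(mutant, 0.0)
    return f_wt, f_mut, f_wt - f_mut


# Toy alignment (illustrative data only, not taken from SwissVar or UniRef90).
toy_alignment = ["MKVLA", "MKVLA", "MRVLA", "MKILA", "M-VLA"]

# Hypothetical variant K -> R at column 1 (0-based).
f_wt, f_mut, diff = variant_features(toy_alignment, 1, "K", "R")
print(f_wt, f_mut, diff)
```

In this sketch a residue that never appears in the column gets frequency 0, and gapped sequences are excluded from the denominator; a real profile would typically also apply sequence weighting or pseudocounts, which are omitted here.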

References

    1. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431:931–45.
    2. Sherry ST, Ward MH, Kholodov M, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29:308–11.
    3. Mottaz A, David FP, Veuthey AL, et al. Easy retrieval of single amino-acid polymorphisms and phenotype information using SwissVar. Bioinformatics. 2010;26:851–2.
    4. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–73.
    5. HapMap Consortium. The International HapMap Project. Nature. 2003;426:789–96.
