Review
Annu Rev Med. 2012:63:35-61.
doi: 10.1146/annurev-med-051010-162644.

Human genome sequencing in health and disease

Claudia Gonzaga-Jauregui et al. Annu Rev Med. 2012.

Abstract

Following the "finished," euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges.

Figures

Figure 1
Comparison of single nucleotide polymorphisms (SNPs) in 10 personal genomes. All SNPs in any of 10 sequenced personal genomes were compared with the other 9 genomes. Altogether, the 10 genomes contribute 14,608,404 nonredundant SNPs (first bar). The second bar pictures all SNPs that are unique to each of the personal genomes; the third bar represents all the SNPs that are unique in a given personal genome but also novel; the fourth bar shows the SNPs shared by individuals of the same ethnic group. Abbreviations: AF1, NA18507(1) Illumina; AF2, NA18507(2) SOLiD; KB1, Khoisan genome; ABT, Archbishop Desmond Tutu; YH, Chinese genome; SJK, Korean genome 1; AK1, Korean genome 2; JCV, J. Craig Venter; JDW, James D. Watson; JRL, James R. Lupski.
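The comparison in Figure 1 can be read as a handful of set operations. The following is a minimal sketch, not the authors' method: the sample names, the SNP tuples, and the `known_snps` set (standing in for a database such as dbSNP) are all hypothetical toy data, and a SNP is assumed to be identified by chromosome, position, and alternate allele.

```python
from typing import Dict, Set, Tuple

# Assumption: a SNP is keyed by (chromosome, position, alternate allele).
Snp = Tuple[str, int, str]


def nonredundant_snps(genomes: Dict[str, Set[Snp]]) -> Set[Snp]:
    """All distinct SNPs seen in any genome (the first bar of the figure)."""
    return set().union(*genomes.values())


def private_snps(genomes: Dict[str, Set[Snp]], sample: str) -> Set[Snp]:
    """SNPs found in `sample` and in none of the other genomes."""
    others = set().union(*(s for name, s in genomes.items() if name != sample))
    return genomes[sample] - others


def private_novel_snps(
    genomes: Dict[str, Set[Snp]], sample: str, known: Set[Snp]
) -> Set[Snp]:
    """Private SNPs that are also absent from a set of known variants."""
    return private_snps(genomes, sample) - known


# Hypothetical toy data; real genomes carry millions of SNPs each.
toy_genomes: Dict[str, Set[Snp]] = {
    "A": {("chr1", 100, "T"), ("chr1", 200, "G"), ("chr2", 50, "C")},
    "B": {("chr1", 100, "T"), ("chr2", 75, "A")},
    "C": {("chr1", 200, "G"), ("chr3", 10, "T")},
}
known_snps: Set[Snp] = {("chr1", 100, "T"), ("chr2", 50, "C")}

if __name__ == "__main__":
    print(len(nonredundant_snps(toy_genomes)))
    for name in toy_genomes:
        print(name, private_novel_snps(toy_genomes, name, known_snps))
```

In this toy data, sample A has one private SNP, but it is already in the known set, so it would count toward the second bar and not the third.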
Figure 2
Size distribution of large indels (100 bp–1 kb) and copy-number variants (CNVs) (>1 kb) in sequenced personal human genomes. Distribution of large indels and CNVs in 8 personal genomes is shown by size. We can observe peaks between 300 and 400 bp, consistent with Alu indel polymorphisms, and at ~1–2 kb. Few polymorphic CNVs are larger than 200 kb. Abbreviations: AF1, NA18507(1) Illumina; AF2, NA18507(2) SOLiD; KB1, Khoisan genome; ABT, Archbishop Desmond Tutu; YH, Chinese genome; SJK, Korean genome 1; AK1, Korean genome 2; JCV, J. Craig Venter; JDW, James D. Watson; JRL, James R. Lupski.
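The size classes used in Figure 2 can be expressed as a small binning function. This is an illustrative sketch only: the class boundaries come from the caption (large indels at 100 bp to 1 kb, CNVs above 1 kb), while the label for events under 100 bp, the inclusive handling of the boundaries, and the example lengths are assumptions.

```python
from collections import Counter
from typing import Iterable


def classify_by_size(length_bp: int) -> str:
    """Assign a structural variant to a size class by its length in bp."""
    if length_bp <= 0:
        raise ValueError("variant length must be positive")
    if length_bp < 100:
        # Assumption: the caption does not name this class.
        return "small variant"
    if length_bp <= 1000:
        return "large indel"
    return "CNV"


def size_class_counts(lengths_bp: Iterable[int]) -> Counter:
    """Tally how many variants fall into each size class."""
    return Counter(classify_by_size(n) for n in lengths_bp)


# Hypothetical lengths; ~300 bp events would be consistent with Alu indels.
example_lengths = [310, 325, 350, 1000, 1500, 250_000]

if __name__ == "__main__":
    print(size_class_counts(example_lengths))
```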
Figure 3
A comparison of the weaknesses and strengths of whole-genome sequencing (WGS) and exome sequencing approaches for disease-gene identification. Abbreviations: CNVs, copy-number variants; SNVs, simple nucleotide variants.
Figure 4
Schematic workflow of whole-genome/exome sequencing data analysis. After sequencing, the sequence reads are mapped and aligned against the human reference genome assembly in order to obtain a list of variants at every position that does not match the reference. Quality filters are applied to obtain high-quality variant calls. Various filtering criteria are applied to prioritize the candidate variants. Most variants will be excluded because they are known, meaning that they are already in variation databases, such as the database of single nucleotide polymorphisms (dbSNP), The 1000 Genomes Project database, etc. The focus is mainly on novel variants, which can be tiered in functional classes according to their annotation. For coding variants, priority is given to nonsense, frameshifting, splice-site, and then missense mutations. Computational prediction of the functional impact of these variants can also help prioritize candidate mutations. Based on the characteristics of the trait or disease of interest, variants can be examined under a dominant or recessive model. Additional confirmation through other resources can strengthen the hypotheses of the functional significance of identified variants. Genetic and functional confirmation of the candidate disease-causing variants is the final, most important step.
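The filtering steps in Figure 4 can be sketched as a single pass over a list of variant calls. This is a simplified illustration under stated assumptions, not the pipeline used in the review: the `Variant` fields, the quality threshold of 30, the consequence labels, and the example calls are hypothetical, and the inheritance models are reduced to a genotype check (heterozygous for dominant, homozygous for recessive), which ignores compound heterozygotes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

# Priority order for coding variants, as given in the caption.
PRIORITY = {"nonsense": 0, "frameshift": 1, "splice_site": 2, "missense": 3}

VariantKey = Tuple[str, int, str, str]


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float     # hypothetical call-quality score
    consequence: str   # hypothetical annotation label
    genotype: str      # "het" or "hom"

    @property
    def key(self) -> VariantKey:
        return (self.chrom, self.pos, self.ref, self.alt)


def prioritize(
    variants: Iterable[Variant],
    known: Set[VariantKey],
    min_quality: float = 30.0,   # assumed threshold, not from the source
    model: str = "recessive",
) -> List[Variant]:
    """Quality-filter, drop known variants, apply a model, rank by class."""
    wanted_genotype = {"dominant": "het", "recessive": "hom"}[model]
    candidates = [
        v for v in variants
        if v.quality >= min_quality          # quality filter
        and v.key not in known               # keep novel variants only
        and v.consequence in PRIORITY        # keep prioritized coding classes
        and v.genotype == wanted_genotype    # simplified inheritance model
    ]
    return sorted(candidates, key=lambda v: PRIORITY[v.consequence])


# Hypothetical calls for illustration.
calls = [
    Variant("chr1", 1000, "C", "T", 50.0, "missense", "hom"),
    Variant("chr2", 2000, "G", "A", 45.0, "nonsense", "hom"),
    Variant("chr3", 3000, "A", "G", 10.0, "nonsense", "hom"),    # low quality
    Variant("chr4", 4000, "T", "C", 60.0, "frameshift", "hom"),  # known
    Variant("chr5", 5000, "G", "T", 55.0, "synonymous", "hom"),  # not ranked
    Variant("chr6", 6000, "C", "A", 70.0, "splice_site", "het"),
]
known_variants: Set[VariantKey] = {("chr4", 4000, "T", "C")}

if __name__ == "__main__":
    for v in prioritize(calls, known_variants, model="recessive"):
        print(v.chrom, v.pos, v.consequence)
```

The output of such a filter is only a ranked candidate list; as the caption notes, genetic and functional confirmation remains the final step.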

