Proc Natl Acad Sci U S A. 2009 Mar 10;106(10):3871-6.
doi: 10.1073/pnas.0812824106. Epub 2009 Feb 6.

Power of deep, all-exon resequencing for discovery of human trait genes

Gregory V Kryukov et al. Proc Natl Acad Sci U S A.

Abstract

The ability to sequence cost-effectively all of the coding regions of a given individual genome is rapidly approaching, with the potential for whole-genome resequencing not far behind. Initiatives are currently underway to phenotype hundreds of thousands of individuals for major human traits. Here, we determine the power for de novo discovery of genes related to human traits by resequencing all human exons in a clinical population. We analyze the potential of the gene discovery strategy that combines multiple rare variants from the same gene and treats genes, rather than individual alleles, as the units for the association test. By using computer simulations based on deep resequencing data for the European population, we show that genes meaningfully affecting a human trait can be identified in an unbiased fashion, although large sample sizes would be required to achieve substantial power.
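The gene-level strategy described in the abstract can be illustrated with a minimal sketch. It assumes variant calls arrive as `(gene, individual_id, extreme)` tuples, where `extreme` is `"high"` or `"low"`, and it uses an exact two-sided binomial test as a stand-in statistic. The input layout, the function names, and the choice of test are assumptions made for illustration; the abstract does not specify the authors' test statistic.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n fair coin flips."""
    if n == 0:
        return 1.0
    probs = [comb(n, i) / 2**n for i in range(n + 1)]
    observed = probs[k]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-9)))


def collapse_by_gene(variant_calls):
    """Pool rare variants per gene and compare carrier counts between extremes.

    variant_calls: iterable of (gene, individual_id, extreme) tuples
    (hypothetical layout). An individual carrying several variants in the
    same gene is counted once, so the gene is the unit of the test.
    """
    carriers = {}
    for gene, individual, extreme in variant_calls:
        groups = carriers.setdefault(gene, {"high": set(), "low": set()})
        groups[extreme].add(individual)
    results = {}
    for gene, groups in carriers.items():
        high, low = len(groups["high"]), len(groups["low"])
        results[gene] = (high, low, binomial_two_sided_p(high, high + low))
    return results
```

Under the null hypothesis, a carrier is equally likely to come from either phenotypic extreme, so a strong imbalance in pooled carrier counts for a gene is the signal of interest. In a real study the resulting p-values would also need correction for the number of genes tested.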

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig. 1.
Simulated resequencing study. (A) Design of the simulated resequencing study. Color marks represent various new mutations discovered in sequenced individuals. (B) Modeling the effect of mutations on phenotype. The distribution of the quantitative trait (QT) for noncarriers is shown in gray. The distribution of the same quantitative trait for individuals who carry at least one moderately deleterious mutation is shown in red. The observed distribution of the QT in the whole population is the sum of these distributions. Individuals from the phenotypic extremes, marked in blue, are the subjects for resequencing.
Fig. 2.
Two-dimensional sections of the likelihood surface for the demographic model fitted to the systematic resequencing data. In the population history model, a long-term constant population size is followed by a bottleneck and subsequent exponential population growth. The model has four parameters and is limited to the European population: N1, ancestral population size; Nb, bottleneck population size; Nf, final population size; π, time of the population expansion since the bottleneck.
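The four-parameter population history in this caption can be sketched as a piecewise function of time. The version below assumes an instantaneous bottleneck and reads the time parameter as the number of generations of exponential growth between the bottleneck and the present. Both are interpretations of the caption rather than details it confirms, and the function and argument names are hypothetical.

```python
import math


def population_size(generations_ago, n_ancestral, n_bottleneck, n_final, t_expansion):
    """Population size at a given time before the present.

    Assumed shape: constant n_ancestral until t_expansion generations ago,
    an instantaneous drop to n_bottleneck, then exponential growth that
    reaches n_final at the present (generations_ago == 0).
    """
    if generations_ago >= t_expansion:
        return float(n_ancestral)
    growth_rate = math.log(n_final / n_bottleneck) / t_expansion
    elapsed = t_expansion - generations_ago  # generations since the bottleneck
    return n_bottleneck * math.exp(growth_rate * elapsed)
```

Fitting such a model would mean searching over the four parameters for the values that maximize the likelihood of the observed allele frequency spectrum. That step is not shown here.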
Fig. 3.
Agreement of experimental allele frequency spectra with the modeled spectra for (A) neutral SNPs and (B) missense SNPs.
Fig. 4.
Distribution of selection coefficients for de novo missense mutations. The distribution was modeled by a gamma distribution and fitted to deep resequencing data by the maximum likelihood method. Mutations with a selection coefficient < 10⁻⁴ were assumed to have no effect on the quantitative phenotype in our model and are shown in green. Mutations assumed to be functional are shown in orange.
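As a rough illustration of the classification in this caption, the sketch below draws selection coefficients from a gamma distribution and reports the share at or above the 10⁻⁴ threshold. The shape and scale arguments are placeholders supplied by the caller. The fitted values are not given on this page, so any numbers passed in are assumptions.

```python
import random

FUNCTIONAL_THRESHOLD = 1e-4  # cutoff stated in the caption


def functional_fraction(shape, scale, n_mutations=10_000, seed=0):
    """Fraction of simulated mutations treated as functional.

    shape and scale are hypothetical gamma parameters, not the fitted ones.
    """
    rng = random.Random(seed)
    draws = [rng.gammavariate(shape, scale) for _ in range(n_mutations)]
    functional = sum(s >= FUNCTIONAL_THRESHOLD for s in draws)
    return functional / n_mutations
```

For example, `functional_fraction(0.2, 0.01)` uses purely illustrative parameters. With a shape below 1, a large share of draws falls near zero and is classed as having no effect on the trait.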
Fig. 5.
Excess of missense variants at one of the phenotypic extremes. Data are shown for simulation of 5,000 individuals sequenced at each of two 5% phenotypic extremes. Sums of the alleles (cumulative frequencies of pooled variants) were averaged over 10,000 simulations. Shift of quantitative trait median in mutation carriers is assumed to be equal to 0.5σ.
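The setup in this caption can be approximated with a small simulation: draw a standard normal trait for each individual, shift carriers by 0.5σ, and count carriers in each 5% tail. A population of 100,000 gives 5,000 individuals per extreme, which matches the caption. The carrier frequency of 2% and the use of a mean shift on a normal trait are assumptions for illustration, not values from the paper.

```python
import random


def simulate_extremes(n_individuals=100_000, carrier_freq=0.02, shift_sd=0.5,
                      tail_fraction=0.05, seed=0):
    """Count mutation carriers in the low and high phenotypic extremes.

    carrier_freq is a hypothetical cumulative carrier frequency for one gene.
    Returns (carriers_in_low_tail, carriers_in_high_tail, tail_size).
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n_individuals):
        carrier = rng.random() < carrier_freq
        trait = rng.gauss(0.0, 1.0) + (shift_sd if carrier else 0.0)
        people.append((trait, carrier))
    people.sort(key=lambda person: person[0])  # ascending trait value
    n_tail = round(n_individuals * tail_fraction)
    low = sum(carrier for _, carrier in people[:n_tail])
    high = sum(carrier for _, carrier in people[-n_tail:])
    return low, high, n_tail
```

Because carriers are shifted upward, they should be enriched in the high tail and depleted in the low tail. Averaging the counts over many seeds would mirror the averaging over simulations described in the caption.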
