Am J Hum Genet. 2013 Apr 4;92(4):504-16.
doi: 10.1016/j.ajhg.2013.02.011.

GIGI: an approach to effective imputation of dense genotypes on large pedigrees

Charles Y K Cheung et al. Am J Hum Genet. 2013.

Abstract

Recent emergence of the common-disease-rare-variant hypothesis has renewed interest in the use of large pedigrees for identifying rare causal variants. Genotyping with modern sequencing platforms is increasingly common in the search for such variants but remains expensive and often is limited to only a few subjects per pedigree. In population-based samples, genotype imputation is widely used so that additional genotyping is not needed. We now introduce an analogous approach that enables computationally efficient imputation in large pedigrees. Our approach samples inheritance vectors (IVs) from a Markov Chain Monte Carlo sampler by conditioning on genotypes from a sparse set of framework markers. Missing genotypes are probabilistically inferred from these IVs along with observed dense genotypes that are available on a subset of subjects. We implemented our approach in the Genotype Imputation Given Inheritance (GIGI) program and evaluated the approach on both simulated and real large pedigrees. With a real pedigree, we also compared imputed results obtained from this approach with those from the population-based imputation program BEAGLE. We demonstrated that our pedigree-based approach imputes many alleles with high accuracy. It is much more accurate for calling rare alleles than is population-based imputation and does not require an outside reference sample. We also evaluated the effect of varying other parameters, including the marker type and density of the framework panel, threshold for calling genotypes, and population allele frequencies. By leveraging information from existing genotypes already assayed on large pedigrees, our approach can facilitate cost-effective use of sequence data in the pursuit of rare causal variants.
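The inference step described in the abstract can be illustrated with a toy sketch. This is not the GIGI implementation: it assumes a single biallelic marker, a nuclear family small enough to enumerate exhaustively, and a hypothetical representation in which each sampled inheritance vector has already been converted into a map from each person to the two founder allele labels that person carries. The function names, the single calling threshold, and the data layout are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def genotype_probs(fgl_samples, observed, target, alt_freq):
    """Average P(alt-allele count of `target`) over sampled inheritance patterns.

    fgl_samples: list of dicts, person -> (label, label), where each label
                 indexes one founder allele (hypothetical encoding of an IV).
    observed:    dict, person -> observed alt-allele count (0, 1, or 2).
    alt_freq:    assumed population frequency of the alternate allele.
    """
    totals = [0.0, 0.0, 0.0]
    used = 0
    for sample in fgl_samples:
        n_labels = 1 + max(label for pair in sample.values() for label in pair)
        weights = [0.0, 0.0, 0.0]
        # Enumerate every assignment of ref (0) / alt (1) to the founder alleles.
        for alleles in product((0, 1), repeat=n_labels):
            consistent = all(
                alleles[sample[p][0]] + alleles[sample[p][1]] == g
                for p, g in observed.items()
            )
            if not consistent:
                continue
            n_alt = sum(alleles)
            prior = alt_freq ** n_alt * (1.0 - alt_freq) ** (n_labels - n_alt)
            a, b = sample[target]
            weights[alleles[a] + alleles[b]] += prior
        norm = sum(weights)
        if norm > 0.0:  # skip patterns that contradict the observed genotypes
            used += 1
            for k in range(3):
                totals[k] += weights[k] / norm
    if used == 0:
        raise ValueError("no sampled inheritance pattern fits the observed data")
    return [t / used for t in totals]


def call_genotype(probs, threshold):
    """Return the most probable genotype if it reaches `threshold`, else None."""
    best = max(range(3), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None


# Toy family: founder alleles 0,1 (father) and 2,3 (mother); child2 is untyped.
father, mother = (0, 1), (2, 3)
pattern_a = {"father": father, "mother": mother, "child1": (0, 2), "child2": (0, 3)}
pattern_b = {"father": father, "mother": mother, "child1": (0, 2), "child2": (1, 3)}
samples = [pattern_a] * 9 + [pattern_b]  # stand-in for MCMC output
observed = {"father": 1, "mother": 0, "child1": 1}
probs = genotype_probs(samples, observed, "child2", alt_freq=0.01)
print(probs, call_genotype(probs, 0.8))
```

In this sketch the rare allele is traced through shared founder alleles rather than through population haplotype patterns, which is the intuition behind the abstract's claim that pedigree-based imputation performs well for rare alleles. Exhaustive enumeration is only feasible for a handful of founders; a large pedigree would need a more efficient computation.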

Figures

Figure 1
Call Rate across Classes of Subjects in the Simulated Pedigree of 52 Subjects. (A) Different designs of subjects observed for genotypes are indicated by different shading schemes: all subjects (shaded or not shaded); many subjects (any shaded); and few subjects (black shaded). Classes of subjects are indicated by letters. (B) We used the SfMm8 framework panel from simulation. Classes of subjects are as in (A).
Figure 2
Different Subjects Have Different Levels of Genotypes. Some subjects (n1 of them) had observed genotypes for both framework markers (top ticks) and dense markers (bottom ticks); n2 of the subjects had observed genotypes for framework markers but had missing genotypes (symbol ?) for dense markers; n3 of the subjects were completely unobserved for both framework and dense markers.
Figure 3
Call Rate and Accuracy as a Function of Call Threshold in Simulated and Real Pedigree. Call rate is indicated by a circle and accuracy is indicated by a plus sign. (A) Analysis of simulated data: we used the SfMm8 framework panel. (B) Analysis of real data: see text for the description of the analysis.
Figure 4
Call Rate and Accuracy as a Function of the True Minor Allele Frequency. We used the SfMm8 framework panel from simulation. Different call thresholds were used: (A) t1 = 0.8, t2 = 0.9 and (B) practically deterministic (t1 ≈ 1.0).
Figure 5
Impact of Distance from the Nearest Framework Marker. We used the Mm8 framework panel from simulation. We measured the (A) accuracy, (B) call rate, and (C) consistency.
