Genome Res. 2004 Aug;14(8):1624-32.
doi: 10.1101/gr.2204604. Epub 2004 Jul 15.

Haplotype and missing data inference in nuclear families

Shin Lin et al. Genome Res. 2004 Aug.

Abstract

Determining linkage phase from population samples with statistical methods is accurate only within regions of high linkage disequilibrium (LD). Yet affected individuals in a genetic mapping study, including studies of cases and controls, may share sequences identical-by-descent that stretch tens to hundreds of kilobases, quite possibly over regions of low LD in the population. At the same time, inferring phase from nuclear families may be hampered by missing family members, missing genotypes, and the noninformativity of certain genotype patterns. In this study, we reformulate our previous haplotype reconstruction algorithm, and its associated computer program, to phase parents with information derived from population samples as well as from their offspring. In applications of our algorithm to 100-kb stretches, simulated in accordance with a Wright-Fisher model with typical levels of LD in humans, we find that phase reconstruction for 160 trios with 10% missing data is highly accurate (>90%) over the entire length. Furthermore, our algorithm can estimate allelic status for missing data at high accuracy (>95%). Finally, the input capacity of the program is vast, easily handling thousands of segregating sites in ≥1000 chromosomes.
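
To make "noninformativity of certain genotype patterns" concrete, the following is a minimal sketch of rule-based trio phasing at biallelic sites. It is not the authors' algorithm, which also draws on population samples; it applies Mendelian transmission rules only. The function names, the 0/1/2 genotype coding, and the use of `None` for missing data are assumptions made for illustration, and the sketch does not check genotypes for Mendelian consistency.

```python
from typing import List, Optional, Tuple

# Hypothetical coding: number of copies of allele "1" (0, 1, or 2); None = missing.
Genotype = Optional[int]


def transmitted_alleles(father: Genotype, mother: Genotype,
                        child: Genotype) -> Optional[Tuple[int, int]]:
    """Return the child's (paternal, maternal) alleles at one site,
    or None when the trio genotypes leave the phase ambiguous."""
    if child is None:
        return None
    if child in (0, 2):
        # A homozygous child received the same allele from both parents.
        allele = child // 2
        return (allele, allele)
    # Heterozygous child: a homozygous parent fixes which allele came from whom.
    if father in (0, 2):
        paternal = father // 2
        return (paternal, 1 - paternal)
    if mother in (0, 2):
        maternal = mother // 2
        return (1 - maternal, maternal)
    # All informative routes exhausted: e.g. all three heterozygous,
    # or one parent heterozygous and the other missing.
    return None


def phase_trio(fathers: List[Genotype], mothers: List[Genotype],
               children: List[Genotype]) -> List[Optional[Tuple[int, int]]]:
    """Apply transmitted_alleles site by site along a stretch of markers."""
    return [transmitted_alleles(f, m, c)
            for f, m, c in zip(fathers, mothers, children)]


# Four sites: informative, all-heterozygous, homozygous child, missing father.
result = phase_trio([1, 1, 1, None], [0, 1, 2, 0], [1, 1, 2, 1])
# result == [(1, 0), None, (1, 1), (1, 0)]
```

The second site, where father, mother, and child are all heterozygous, is the classic noninformative pattern: family data alone cannot resolve it, which is where information from population samples becomes useful.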

Figures

Figure 1
Pairwise accuracy of phase relation calls versus intervening length in families with and without children. One hundred simulations of the eight X-linked loci were performed, and the binary accuracies of all pairwise phase relations were tabulated along with their distances. A logistic regression was performed in which accuracy was regressed on dichotomous variables corresponding to the placement of lengths in bins of 5 kb. Predicted values with corresponding 95% confidence intervals are plotted for phase relations whose intervening lengths were <100 kb.
Figure 2
Pairwise accuracy of phase relation calls versus intervening length in families with both parents using simulated haplotypes. Results derived from 50 simulations of 40 haplotypes randomly paired twice to form 10 families (a, d); 640 haplotypes to form 160 families (b, e); and 1280 haplotypes to form 320 families (c, f). (a–c) All sites; (d–f) sites with minor allele frequency >0.1. The points for all graphs were calculated by the same procedure as in Figure 1 but without 95% confidence intervals.
Figure 3
Pairwise accuracy of phase relation calls versus intervening length in families in which fathers' genotypes are missing data. Results derived from 50 simulations of 40 haplotypes randomly paired twice to form 10 families (a, d); 640 haplotypes to form 160 families (b, e); and 1280 haplotypes to form 320 families (c, f). (a–c) All sites; (d–f) sites with minor allele frequency >0.1. The points for all graphs were calculated by the same procedure as in Figure 1 but without 95% confidence intervals.
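
The binning step described in the Figure 1 caption can be sketched as follows. When the only predictors in a logistic regression are indicator variables for the bins, the fitted probability in each bin equals the observed proportion of correct calls in that bin, so the plotted points can be approximated by simple per-bin proportions. The function name and the input format, `(distance_kb, correct)` pairs, are assumptions made for illustration; the confidence intervals reported for Figure 1 are not reproduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def binned_accuracy(pairs: Iterable[Tuple[float, int]],
                    bin_kb: float = 5.0,
                    max_kb: float = 100.0) -> Dict[float, float]:
    """Proportion of correct pairwise phase calls per distance bin.

    pairs: (distance_kb, correct) with correct equal to 1 or 0.
    Returns {bin_start_kb: proportion correct} for distances < max_kb.
    """
    totals = defaultdict(lambda: [0, 0])  # bin start -> [correct calls, all calls]
    for distance_kb, correct in pairs:
        if distance_kb >= max_kb:
            continue  # the caption restricts the plot to lengths below 100 kb
        start = int(distance_kb // bin_kb) * bin_kb
        totals[start][0] += int(correct)
        totals[start][1] += 1
    return {start: hits / n for start, (hits, n) in sorted(totals.items())}


# Toy input: five pairwise calls, one of them beyond the 100-kb cutoff.
accuracy = binned_accuracy([(1.0, 1), (4.9, 0), (5.0, 1), (12.0, 1), (150.0, 0)])
# accuracy == {0.0: 0.5, 5.0: 1.0, 10.0: 1.0}
```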
