An E-M algorithm and testing strategy for multiple-locus haplotypes
- PMID: 7887436
- PMCID: PMC1801177
Abstract
This paper gives an expectation-maximization (EM) algorithm to obtain allele frequencies, haplotype frequencies, and gametic disequilibrium coefficients for multiple-locus systems. It permits high polymorphism and null alleles at all loci. This approach effectively deals with the primary estimation problems associated with such systems: there is not a one-to-one correspondence between phenotypic and genotypic categories, and sample sizes tend to be much smaller than the number of phenotypic categories. The EM method provides maximum-likelihood estimates and therefore allows hypothesis tests using likelihood-ratio statistics that have χ² distributions when sample sizes are large. We also suggest a data resampling approach to estimate the sampling distributions of the test statistics. The resampling approach is more computer intensive, but it is applicable to all sample sizes. A strategy to test hypotheses about aggregate groups of gametic disequilibrium coefficients is recommended. This strategy minimizes the number of necessary hypothesis tests while at the same time describing the structure of disequilibrium. These methods are applied to three unlinked dinucleotide repeat loci in Navajo Indians and to three linked HLA loci in Gila River (Pima) Indians. The likelihood functions of both data sets are shown to be maximized by the EM estimates, and the testing strategy provides a useful description of the structure of gametic disequilibrium. Following these applications, a number of simulation experiments are performed to test how well the likelihood-ratio statistic distributions are approximated by χ² distributions. In most circumstances the χ² approximation grossly underestimated the probability of type I errors. However, at times it also overestimated the type I error probability. Accordingly, we recommend hypothesis tests that use the resampling method.
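To make the estimation idea concrete, here is a minimal sketch of an EM iteration for haplotype frequencies. It is not the paper's algorithm: it assumes only two biallelic, codominant loci with no null alleles, whereas the method described above handles multiple highly polymorphic loci and null alleles. The function names `em_haplotype_freqs` and `gametic_disequilibrium` and the genotype encoding are hypothetical, chosen for illustration.

```python
# Sketch under stated assumptions: two biallelic codominant loci, no null
# alleles. Genotypes are coded as (copies of allele 1 at locus A,
# copies of allele 1 at locus B), each in {0, 1, 2}.

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_haplotype_freqs(genotype_counts, max_iter=500, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotype counts by EM."""
    n = sum(genotype_counts.values())
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform starting values
    for _ in range(max_iter):
        expected = {h: 0.0 for h in HAPLOTYPES}
        for (ga, gb), count in genotype_counts.items():
            # E-step: ordered haplotype pairs compatible with this genotype.
            pairs = [
                (h1, h2)
                for h1 in HAPLOTYPES
                for h2 in HAPLOTYPES
                if h1[0] + h2[0] == ga and h1[1] + h2[1] == gb
            ]
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            if total == 0.0:
                continue  # genotype impossible under current estimates
            for (h1, h2), w in zip(pairs, weights):
                share = count * w / total
                expected[h1] += share
                expected[h2] += share
        # M-step: expected haplotype counts divided by 2n chromosomes.
        new_freqs = {h: expected[h] / (2 * n) for h in HAPLOTYPES}
        change = max(abs(new_freqs[h] - freqs[h]) for h in HAPLOTYPES)
        freqs = new_freqs
        if change < tol:
            break
    return freqs


def gametic_disequilibrium(freqs):
    """D = p(1,1) - p_A(1) * p_B(1) for the two-locus biallelic case."""
    p_a = freqs[(1, 0)] + freqs[(1, 1)]
    p_b = freqs[(0, 1)] + freqs[(1, 1)]
    return freqs[(1, 1)] - p_a * p_b


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    counts = {(0, 0): 30, (1, 1): 40, (2, 2): 20, (1, 0): 5, (0, 1): 5}
    estimates = em_haplotype_freqs(counts)
    print(estimates, gametic_disequilibrium(estimates))
```

Only the double heterozygote `(1, 1)` is ambiguous here; every other genotype determines its haplotype pair, which is why the E-step reduces to splitting double heterozygotes between the two possible phases. A resampling test in the spirit of the abstract could reuse these functions by recomputing the statistic on permuted data, but that step is not shown.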