Am J Hum Genet. 2008 May;82(5):1064-74.
doi: 10.1016/j.ajhg.2008.03.002. Epub 2008 Apr 24.

Estimating odds ratios in genome scans: an approximate conditional likelihood approach

Arpita Ghosh et al. Am J Hum Genet. 2008 May.

Erratum in

  • Am J Hum Genet. 2008 May;82(5):1224

Abstract

In modern whole-genome scans, the use of stringent thresholds to control the genome-wide testing error distorts the estimation process, producing estimated effect sizes that may be on average far greater in magnitude than the true effect sizes. We introduce a method, based on the estimate of genetic effect and its standard error as reported by standard statistical software, to correct for this bias in case-control association studies. Our approach is widely applicable, is far easier to implement than competing approaches, and may often be applied to published studies without access to the original data. We evaluate the performance of our approach via extensive simulations for a range of genetic models, minor allele frequencies, and genetic effect sizes. Compared to the naive estimation procedure, our approach reduces the bias and the mean squared error, especially for modest effect sizes. We also develop a principled method to construct confidence intervals for the genetic effect that acknowledges the conditioning on statistical significance. Our approach is described in the specific context of odds ratios and logistic modeling but is more widely applicable. Application to recently published data sets demonstrates the relevance of our approach to modern genome scans.
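The correction described above needs only a reported effect estimate and its standard error. The following is a minimal sketch of one way to compute a conditional-likelihood estimate, not the authors' implementation. It assumes that z = estimate / standard error is approximately normal with mean μ and unit variance, and that a result is reported only when |z| exceeds a threshold c. The function names (`conditional_mle`, `corrected_beta`) and the grid search are illustrative choices, and the paper's own corrected estimators may differ in detail.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, using erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def cond_loglik(mu, z, c):
    """Log density of z ~ N(mu, 1), conditional on |z| > c (assumed model)."""
    log_phi = -0.5 * (z - mu) ** 2 - 0.5 * math.log(2.0 * math.pi)
    p_sig = norm_cdf(mu - c) + norm_cdf(-mu - c)  # P(|z| > c) under mu
    return log_phi - math.log(p_sig)


def conditional_mle(z, c, n_grid=20000):
    """Grid-search maximizer of the conditional likelihood over mu.

    For z > 0 the maximizer lies in [0, z], so the search is restricted
    to that interval; negative z is handled by symmetry.
    """
    if abs(z) <= c:
        raise ValueError("z must exceed the significance threshold c")
    a = abs(z)
    best_mu, best_ll = 0.0, cond_loglik(0.0, a, c)
    for i in range(1, n_grid + 1):
        mu = a * i / n_grid
        ll = cond_loglik(mu, a, c)
        if ll > best_ll:
            best_mu, best_ll = mu, ll
    return math.copysign(best_mu, z)


def corrected_beta(beta_hat, se, c):
    """Map a reported estimate and standard error to a corrected estimate."""
    return conditional_mle(beta_hat / se, c) * se
```

With c = 5, a statistic just above the threshold (z = 5.2) is shrunk far below its observed value, whereas a statistic well above the threshold (z = 10) is left almost unchanged.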

Figures

Figure 1
Behavior of the Unconditional and Conditional Likelihoods for μ. Unconditional and conditional likelihoods of μ are presented for (A) z = 5.2, (B) z = 5.33, and (C) z = 6. The location of the observed z is indicated by a black dot on each plot. The conditional likelihood changes considerably for small changes in z near c. For larger z, the conditional likelihood approaches the unconditional likelihood. Likelihoods for μ < c are negligible and not shown. (D) shows conditional densities of z for μ = 0.66 and μ = 5.2, with the relative likelihoods highlighted for a fixed value z = 5.2.
Figure 2
Estimators and Confidence Intervals for μ with Significance Threshold c = 5. (A) The expectation of the naive estimator μ̂ shows substantial bias, and (B) its mean squared error (MSE) is very large for much of the range of μ, whereas the corrected estimators have lower bias and MSE. (C) shows upper and lower confidence bounds for μ as a function of the observed statistic z.
Figure 3
Expectations and Mean Squared Errors for the Three Genetic Models under MAF = 0.25. The corrected estimators show greatly improved performance for much of the range of β. The left panels correspond to the recessive model, the middle panels correspond to the additive model, and the right panels correspond to the dominant model. The top row shows expected values for the naive and conditional likelihood estimators versus β. The bottom row shows mean squared errors for the estimators. The y axes are rescaled to highlight details; the MSE is considerably larger for the recessive model because of scarcity of the risk homozygotes.
Figure 4
Mean Squared Errors of the Estimators versus β for MAF Values Ranging from 0.05 to 0.5. The additive model is assumed, with n = 1000. The MSEs drop for larger MAF, but the relative performance of the estimators is maintained.
Figure 5
Estimates of the CI Coverage Probability Plotted against β for the Three Genetic Models, MAF = 0.25. Black dots correspond to 95% CIs; gray dots correspond to 90% CIs. The dashed curves represent coverage of standard 95% CIs that do not acknowledge the significance selection. The top row shows n = 1000 (500 cases and 500 controls). The bottom row shows n = 2000 (1000 cases and 1000 controls). Coverage is close to nominal, except for regions of overcoverage in the recessive model because of small cell counts (note that the y axis range begins at 0.7). For all models, the coverage will approach the nominal value as the sample size increases further.
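
The bias and mean squared error of the naive estimator under significance selection, as discussed in the Figure 2 caption, can be illustrated in closed form. This sketch assumes that the naive estimator is the observed statistic z itself, that z is normal with mean μ and unit variance, and that selection is two-sided (|z| > c). The function name `naive_mean_and_mse` is hypothetical, and the formulas are standard truncated-normal moments rather than expressions taken from the paper.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal CDF, using erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def naive_mean_and_mse(mu, c):
    """Return (E[z | |z| > c], E[(z - mu)^2 | |z| > c]) for z ~ N(mu, 1)."""
    a = c - mu    # upper selection region: z - mu > a
    b = -c - mu   # lower selection region: z - mu < b
    p_sig = norm_cdf(-a) + norm_cdf(b)  # probability of reaching significance
    # First moment of (z - mu) restricted to the two tails.
    bias = (norm_pdf(a) - norm_pdf(b)) / p_sig
    # Second moment of (z - mu) restricted to the two tails.
    mse = (a * norm_pdf(a) + norm_cdf(-a)
           - b * norm_pdf(b) + norm_cdf(b)) / p_sig
    return mu + bias, mse


if __name__ == "__main__":
    for mu in (0.0, 2.0, 5.0, 10.0):
        mean, mse = naive_mean_and_mse(mu, c=5.0)
        print(f"mu={mu:5.1f}  E[naive]={mean:7.3f}  MSE={mse:7.3f}")
```

Under these assumptions with c = 5, a modest true effect such as μ = 2 yields a conditional expectation above the threshold, while for μ well above c the naive estimator is nearly unbiased with MSE close to 1.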
