Int J Epidemiol. 2016 Dec 1;45(6):1961-1974.
doi: 10.1093/ije/dyw220.

Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I² statistic

Jack Bowden et al. Int J Epidemiol. 2016.

Abstract

Background: MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses that combine summary data estimates of causal effect from multiple individual variants, and it is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression is a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach, which assumes all variants are valid instruments. Both methods use weights that treat the single nucleotide polymorphism (SNP)-exposure associations as known rather than estimated. We call this the 'NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants used violate the NOME assumption, a violation whose strength can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied.
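For orientation, the two estimators named above can be sketched from summary data as weighted regressions of the SNP-outcome estimates on the SNP-exposure estimates. This is a minimal sketch under assumptions the abstract does not spell out: the weights are taken to be the inverse variances of the SNP-outcome estimates (which treats the SNP-exposure estimates as known, in the spirit of NOME), variants are oriented so that the SNP-exposure estimates are positive before fitting MR-Egger, and the names `bx`, `by` and `se_by` are illustrative.

```python
import numpy as np

def ivw_estimate(bx, by, se_by):
    """IVW slope: weighted regression of by on bx through the origin.

    Assumed weights: 1 / se_by**2 (SNP-exposure estimates treated as known).
    """
    bx, by, se_by = (np.asarray(a, dtype=float) for a in (bx, by, se_by))
    w = 1.0 / se_by**2
    return np.sum(w * bx * by) / np.sum(w * bx**2)

def mr_egger(bx, by, se_by):
    """MR-Egger: the same weighted regression, but with an intercept.

    Returns (intercept, slope). The intercept reflects average directional
    pleiotropy; the slope is the pleiotropy-adjusted causal estimate.
    """
    bx, by, se_by = (np.asarray(a, dtype=float) for a in (bx, by, se_by))
    # Orient each variant so its SNP-exposure estimate is positive (assumption).
    sign = np.where(bx < 0, -1.0, 1.0)
    bx, by = bx * sign, by * sign
    w = 1.0 / se_by**2
    X = np.column_stack([np.ones_like(bx), bx])
    XtW = X.T * w  # weight each observation
    intercept, slope = np.linalg.solve(XtW @ X, XtW @ by)
    return intercept, slope
```

With noise-free inputs such as `by = 0.05 + 0.5 * bx`, `mr_egger` recovers an intercept of 0.05 and a slope of 0.5, whereas `ivw_estimate` forces the line through the origin and so absorbs the intercept into the slope.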

Methods: An adaptation of the I² statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it I²_GX. The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example.
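The abstract does not give the formula, so the following is only a sketch of what such a statistic might look like, assuming the standard meta-analytic form I² = (Q - (L - 1)) / Q, with Cochran's Q computed on the L SNP-exposure estimates and their standard errors. The function name `i2_gx`, its arguments, and the use of absolute values to mimic the MR-Egger orientation are assumptions made for illustration.

```python
import numpy as np

def i2_gx(bx, se_bx):
    """Sketch of an I²-type statistic for the SNP-exposure associations.

    Assumes the meta-analytic definition I² = (Q - (L - 1)) / Q, truncated
    at 0, where Q is Cochran's Q of the SNP-exposure estimates.
    """
    bx = np.abs(np.asarray(bx, dtype=float))  # assumed orientation step
    se_bx = np.asarray(se_bx, dtype=float)
    w = 1.0 / se_bx**2
    pooled = np.sum(w * bx) / np.sum(w)       # inverse-variance weighted mean
    q = np.sum(w * (bx - pooled) ** 2)        # Cochran's Q
    if q <= 0:
        return 0.0
    return max(0.0, (q - (len(bx) - 1)) / q)

# Five hypothetical SNP-exposure estimates, measured precisely vs. imprecisely.
bx = [0.1, 0.2, 0.3, 0.4, 0.5]
print(i2_gx(bx, [0.01] * 5))  # close to 1: little expected dilution
print(i2_gx(bx, [0.1] * 5))   # lower: stronger NOME violation
```

Under this assumed form, the statistic is high when the spread of the SNP-exposure estimates is large relative to their standard errors, and it falls towards 0 as measurement error dominates that spread.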

Results: In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of I²_GX), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects. We demonstrate our proposed approach for a two-sample summary data MR analysis to estimate the causal effect of low-density lipoprotein on heart disease risk. A high value of I²_GX close to 1 indicates that dilution does not materially affect the standard MR-Egger analyses for these data.
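To make the idea of simulation extrapolation concrete, here is a rough sketch of the generic SIMEX recipe applied to an MR-Egger slope. It is not the authors' implementation. It assumes that extra noise with variance λ times the squared standard error is added to each SNP-exposure estimate, that the slope is averaged over repeated simulations for each λ, and that a quadratic in λ is extrapolated back to λ = -1. The grid of λ values, the quadratic extrapolant, and all function and argument names are illustrative choices, and the SNP-exposure estimates are assumed to be oriented already.

```python
import numpy as np

def egger_slope(bx, by, se_by):
    """Slope from a weighted regression of by on bx with an intercept."""
    w = 1.0 / se_by**2
    X = np.column_stack([np.ones_like(bx), bx])
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ by)[1]

def simex_egger_slope(bx, by, se_bx, se_by,
                      lambdas=(0.0, 0.5, 1.0, 1.5, 2.0),
                      n_sim=200, seed=0):
    """SIMEX-style correction of the MR-Egger slope (illustrative sketch)."""
    bx, by, se_bx, se_by = (np.asarray(a, dtype=float)
                            for a in (bx, by, se_bx, se_by))
    rng = np.random.default_rng(seed)
    mean_slopes = []
    for lam in lambdas:
        if lam == 0:
            # No added noise: this is the uncorrected estimate.
            mean_slopes.append(egger_slope(bx, by, se_by))
            continue
        # Inflate the measurement error variance by a factor (1 + lam).
        slopes = [
            egger_slope(bx + rng.normal(0.0, np.sqrt(lam) * se_bx), by, se_by)
            for _ in range(n_sim)
        ]
        mean_slopes.append(np.mean(slopes))
    # Fit a quadratic trend in lambda and extrapolate to lambda = -1,
    # the value that notionally corresponds to no measurement error.
    coefs = np.polyfit(lambdas, mean_slopes, deg=2)
    return np.polyval(coefs, -1.0)
```

When the SNP-exposure estimates are measured almost without error, the added noise barely changes the fitted slopes and the extrapolated value stays close to the uncorrected slope; when measurement error is substantial, the slopes shrink as λ grows and the extrapolation pushes the estimate away from the null.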

Conclusions: Care must be taken to assess the NOME assumption via the I²_GX statistic before implementing standard MR-Egger regression in the two-sample summary data context. If I²_GX is sufficiently low (less than 90%), inferences from the method should be interpreted with caution and adjustment methods considered.

Keywords: I² statistic; MR-Egger regression; Mendelian randomization; measurement error; simulation extrapolation.

Figures

Figure 1. Illustrative diagram showing the SNP-exposure associations (estimates = hollow black dots, true values = solid black dots) plotted against the SNP-outcome association estimates for a fictional MR analysis.

Figure 2. Illustrative diagram showing the SNP-outcome association estimates plotted against both the SNP-exposure association estimates (hollow black dots) and their true values (solid black dots). Top left: positive causal effect, balanced pleiotropy. Top right: positive causal effect, negative directional pleiotropy. Bottom left: positive causal effect, positive directional pleiotropy. Bottom right: no causal effect, positive directional pleiotropy.

Figure 3. Left: distribution of I²_GX estimates under scenario 1 for F̄ = 20 and I²_GX = 0.60 when L = 25 (blue), 50 (red) and 100 (black). Right: distribution of I²_GX estimates under scenario 1 for F̄ = 125 and I²_GX = 0.95 when L = 25 (blue), 50 (red) and 100 (black).

Figure 4. Left: scatter plot of the summary data estimates, with IVW and MR-Egger slope estimates shown. Right: funnel plot of the causal effect estimates, with overall estimates under the IVW and MR-Egger approaches (with and without SIMEX correction).

Figure 5. Simulation extrapolation applied to the MR-Egger regression analysis of the lipids data. The adjusted estimate is that predicted by the model at the value λ = -1.

