Clin Epigenetics. 2021 Apr 29;13(1):98.
doi: 10.1186/s13148-021-01083-9.

Estimands in epigenome-wide association studies

Jochen Kruppa et al. Clin Epigenetics.

Abstract

Background: In DNA methylation analyses like epigenome-wide association studies, effects in differentially methylated CpG sites are assessed. Two kinds of outcomes can be used for statistical analysis: Beta-values and M-values. M-values follow a normal distribution and help to detect differentially methylated CpG sites. As biological effect measures, differences of M-values are more or less meaningless. Beta-values are of more interest since they can be interpreted directly as differences in percentage of DNA methylation at a given CpG site, but they have poor statistical properties. Different frameworks are proposed for reporting estimands in DNA methylation analysis, relying on Beta-values, M-values, or both.
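To make the contrast between the two scales concrete, here is a minimal sketch assuming the commonly used log2-logit definition, M = log2(Beta / (1 - Beta)). The function names `beta_to_m` and `m_to_beta` are illustrative and do not come from the paper. The example shows that the same 10 percentage-point difference in Beta-values corresponds to different M-value differences depending on where on the methylation scale it occurs.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a Beta-value (strictly between 0 and 1) to an M-value.

    Assumes the log2-logit definition M = log2(beta / (1 - beta)).
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Inverse transformation: map an M-value back to a Beta-value."""
    return 1.0 / (1.0 + 2.0 ** (-m))


# Two pairs of Beta-values, each 10 percentage points apart.
delta_m_mid = beta_to_m(0.55) - beta_to_m(0.45)   # around 50% methylation
delta_m_edge = beta_to_m(0.15) - beta_to_m(0.05)  # close to 0% methylation

print(f"Delta M near the middle of the scale: {delta_m_mid:.3f}")
print(f"Delta M near the edge of the scale:   {delta_m_edge:.3f}")
```

Under this definition the difference near the edge of the scale is roughly three times as large in M-units as the one near the middle, which is why a difference in M-values on its own does not translate into a single percentage difference in methylation.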

Results: We present and discuss four possible approaches to obtaining estimands in DNA methylation analysis. In addition, we present the usage of M-values or Beta-values in the context of bioinformatics pipelines, which often demand a predefined outcome. We show how differences in M-values relate to differences in Beta-values in two data simulations: one analysis with and one without confounder effects. In the absence of confounder effects, M-values can be used for the statistical analysis and Beta-value statistics for the reporting. If confounder effects exist, we demonstrate the resulting deviations and correct the effects with the intercept method. Finally, we demonstrate the theoretical problem on two large human genome-wide DNA methylation datasets to verify the results.

Conclusions: Using M-values in the analysis of DNA methylation data produces effect estimates that cannot be interpreted biologically. The parallel usage of Beta-value statistics ignores possible confounder effects and therefore cannot be recommended. Hence, if the differences in Beta-values are the focus of the study, the intercept method is recommended. Hyper- or hypomethylated CpG sites must then be carefully evaluated. If an exploratory analysis of possible CpG sites is the aim of the study, M-values can be used for inference.

Keywords: DNA methylation; Epigenome-wide association study (EWAS); Estimands; Multiple testing; Reproducible research.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Simulation of the effect estimation with no confounder effect and with two confounder effects. The y-axis shows the percentage deviation of the estimated Δ^Beta from the predefined ΔBeta; the x-axis shows the raw mean difference of the Beta-values between treatment groups. The first subplot shows the 0% confounder effect; the other two subplots show the confounder effects of 10% and 20%. Simulated data with two treatment levels. The deviation is not symmetrical because the confounder effects were always simulated in the same direction. 5000 simulations with n=1000 each
Fig. 2
Simulation of the effect estimation with the intercept method and the influence of two confounder effects. The y-axis shows the percentage deviation of the estimated Δ^Beta from the predefined ΔBeta; the x-axis shows the raw mean difference of the Beta-values between treatment groups when the confounder effects of 10% and 20% are ignored. Simulated data with two treatment levels (5000 simulations with n=1000 each)
Fig. 3
Histogram of the β-values of the study population of the ArrayExpress data set E-GEOD-68379. This study in particular shows a high number of methylation sites close to 0 and 1, which could be both of interest and a problem in modeling
Fig. 4
3D surface density plot of the distribution of differences in M-values against differences in Beta-values from E-GEOD-55763 (left) and E-GEOD-68379 (right). The difference in M-values (ΔM) is mapped to the corresponding differences in Beta-values (ΔBeta) observed in the data set by comparing two groups of five observations each with random group assignment in 5000 simulations. For ΔM larger than 7, we ran 10000 simulations. The small group size of five was chosen for demonstration purposes and is by no means a sufficient group size
Fig. 5
Mustache plot of the theoretical relation of differences in M-values to differences in Beta-values. On the left side, the difference in M-values (ΔM) is mapped to all possible corresponding differences in Beta-values (ΔBeta). A difference of ΔM=5, for example, can be mapped to a ΔBeta from 0.0009 to 0.6996
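
The one-to-many mapping described for Fig. 5 can be reproduced with a short sketch, again assuming the log2-logit definition M = log2(Beta / (1 - Beta)). The helper names and the scanned range of starting M-values (-20 to 15) are assumptions made for illustration, not taken from the paper. Because the Beta-value is a logistic function of the M-value, a fixed ΔM gives the largest ΔBeta when the two M-values are centred on 0, and a ΔBeta that shrinks towards 0 as the pair moves to either extreme.

```python
def m_to_beta(m: float) -> float:
    """Map an M-value to a Beta-value, assuming M = log2(beta / (1 - beta))."""
    return 1.0 / (1.0 + 2.0 ** (-m))


def delta_beta(m_low: float, delta_m: float) -> float:
    """Difference in Beta-values between M-values m_low and m_low + delta_m."""
    return m_to_beta(m_low + delta_m) - m_to_beta(m_low)


delta_m = 5.0

# Largest possible Delta Beta: the two M-values sit symmetrically around 0.
largest = delta_beta(-delta_m / 2.0, delta_m)

# Scan a grid of starting M-values (hypothetical range) to see the spread.
grid = [i / 100.0 for i in range(-2000, 1501)]
values = [delta_beta(m_low, delta_m) for m_low in grid]

print(f"largest Delta Beta for Delta M = 5: {largest:.4f}")
print(f"smallest Delta Beta on the scanned grid: {min(values):.6f}")
```

The upper bound agrees with the 0.6996 quoted in the caption. The lower bound depends on how extreme the starting M-value is allowed to be, so the value printed for the scanned grid is not expected to match the 0.0009 in the caption, whose underlying range is not stated here.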
