Bioinformatics. 2024 Jul 1;40(7):btae423.
doi: 10.1093/bioinformatics/btae423.

mLiftOver: harmonizing data across Infinium DNA methylation platforms

Brian H Chen et al. Bioinformatics.

Abstract

Motivation: Infinium DNA methylation BeadChips are widely used for genome-wide DNA methylation profiling at the population scale. Recent updates to probe content and naming conventions in the EPIC version 2 (EPICv2) arrays have complicated integrating new data with previous Infinium array platforms, such as the MethylationEPIC (EPIC) and the HumanMethylation450 (HM450) BeadChip.

Results: We present mLiftOver, a user-friendly tool that harmonizes probe ID, methylation level, and signal intensity data across different Infinium platforms. It manages probe replicates, missing data imputation, and platform-specific bias for accurate data conversion. We validated the tool by applying HM450-based cancer classifiers to EPICv2 cancer data, achieving high accuracy. Additionally, we successfully integrated EPICv2 healthy tissue data with legacy HM450 data for tissue identity analysis and produced consistent copy number profiles in cancer cells.
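The first three harmonization steps named above (probe ID translation, replicate handling, and filling in probes missing from the source platform) can be illustrated with a minimal Python sketch. This is not the SeSAMe implementation, which is written in R. The function names `strip_suffix` and `harmonize`, the `impute_default` argument, and the example probe IDs and values are all hypothetical. The sketch assumes that EPICv2 probe IDs carry an underscore-delimited suffix (for example `_TC21`) after the prefix shared with older arrays, that replicates are combined by a simple mean, and that missing probes are filled from a caller-supplied table of reference values. Platform-specific bias correction is not covered.

```python
import math
from collections import defaultdict


def strip_suffix(probe_id):
    """Return the probe ID prefix, dropping an EPICv2-style suffix such as '_TC21'."""
    return probe_id.split("_", 1)[0]


def harmonize(betas, target_probes, impute_default=None):
    """Map suffixed source probes onto a target platform's probe list.

    betas          -- dict of source probe ID -> methylation level (beta value)
    target_probes  -- list of probe IDs on the target platform (no suffix)
    impute_default -- optional dict of probe ID -> fallback value (an assumption
                      of this sketch, standing in for a real imputation model)
    """
    # Group replicate probes under their shared prefix, skipping missing values.
    grouped = defaultdict(list)
    for probe_id, beta in betas.items():
        if beta is not None and not math.isnan(beta):
            grouped[strip_suffix(probe_id)].append(beta)

    harmonized = {}
    for probe in target_probes:
        values = grouped.get(probe)
        if values:
            # Average replicates that share the same prefix.
            harmonized[probe] = sum(values) / len(values)
        elif impute_default is not None and probe in impute_default:
            # Probe absent from the source platform: use the fallback value.
            harmonized[probe] = impute_default[probe]
        else:
            # No measurement and no fallback: leave the value missing.
            harmonized[probe] = float("nan")
    return harmonized


# Hypothetical input: two replicates of one probe, plus one single probe.
source = {
    "cg00000029_TC21": 0.2,
    "cg00000029_TC22": 0.4,
    "cg00000103_BC21": 0.9,
}
target = ["cg00000029", "cg00000103", "cg00000155", "cg00000158"]
result = harmonize(source, target, impute_default={"cg00000155": 0.5})
```

In this example the two replicates of `cg00000029` are averaged, `cg00000103` is carried over unchanged, `cg00000155` takes the fallback value, and `cg00000158` stays missing because no fallback was supplied.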

Availability and implementation: mLiftOver is implemented in R and available in the Bioconductor package SeSAMe (version 1.21.13+): https://bioconductor.org/packages/release/bioc/html/sesame.html. Analysis of EPIC and EPICv2 platform-specific bias and high-confidence mapping is available at https://github.com/zhou-lab/InfiniumAnnotationV1/raw/main/Anno/EPICv2/EPICv2ToEPIC_conversion.tsv.gz. The source code is available at https://github.com/zwdzwd/sesame/blob/devel/R/mLiftOver.R under the MIT license.

Conflict of interest statement

W.Z. received BeadChips from Illumina Inc. for research.

Figures

Figure 1.
mLiftOver harmonizes Infinium DNA methylation BeadChip data across array platforms. (A) Schematic illustration of the core features and workflow of mLiftOver from data input to harmonized output. (B) Depiction of the probe naming convention employed in the EPICv2 and MSA arrays. (C) The accuracy of mLiftOver was evaluated using the GM12878 cell line data, contrasting measurements from EPICv1 and EPICv2. The panel is divided into three sub-panels, demonstrating (i) direct probe ID translation, (ii) signal averaging across replicates, and (iii) imputation of missing probe readings (excluding those with methylation level standard deviation >0.08). Spearman’s correlation coefficients are displayed atop each subpanel, with all correlations being significant (P-value <1E-6). (D) Removal of platform-specific biases (tested on a pair of HCT116 cell line data that did not participate in the platform-specific bias analysis), P-value <1E-6. (E) Illustrates the integration process of mLiftOver for primary healthy tissue data and TCGA tumor-adjacent normal tissue data, showcasing its utility in harmonizing diverse datasets for tissue classification. (F) Demonstrates the application of cancer classification models, initially trained on HM450 data using a random forest framework, to primary tumor datasets harmonized from EPICv2 data through mLiftOver. (G) Plot relating the number of missing probes and prediction error of Horvath’s pan-tissue clock, stratified by sex. (H) Compares copy number variation profiles obtained from native EPIC data and profiles harmonized from EPICv2 data, showing the consistency of mLiftOver in signal data conversion.
