Nucleic Acids Res. 2024 Sep 23;52(17):10144-10160.
doi: 10.1093/nar/gkae697.

Network medicine-based epistasis detection in complex diseases: ready for quantum computing

Markus Hoffmann et al. Nucleic Acids Res.

Abstract

Most heritable diseases are polygenic. To comprehend the underlying genetic architecture, it is crucial to discover the clinically relevant epistatic interactions (EIs) between genomic single nucleotide polymorphisms (SNPs) (1-3). Existing statistical computational methods for EI detection are mostly limited to pairs of SNPs due to the combinatorial explosion of higher-order EIs. With NeEDL (network-based epistasis detection via local search), we leverage network medicine to inform the selection of EIs that are an order of magnitude more statistically significant than those found by existing tools and that consist, on average, of five SNPs. We further show that this computationally demanding task can be substantially accelerated once quantum computing hardware becomes available. We apply NeEDL to eight different diseases and discover genes (affected by EIs of SNPs) that are partly known to affect the disease. Additionally, these results are reproducible across independent cohorts. EIs for these eight diseases can be interactively explored in the Epistasis Disease Atlas (https://epistasis-disease-atlas.com). In summary, NeEDL demonstrates the potential of seamlessly integrated quantum computing techniques to accelerate biomedical research. Our network medicine approach detects higher-order EIs with unprecedented statistical and biological evidence, yielding unique insights into polygenic diseases and providing a basis for the development of improved risk scores and combination therapies.

Figures

Graphical Abstract
Figure 1.
Overview of the methodology. (A) SNPs are mapped to genes via SNP-gene links obtained from dbSNP (7) and are then connected in an SNP-SNP interaction (SSI) network if they are mapped to the same protein or to adjacent proteins in a protein-protein interaction (PPI) network (i.e., directly when the proteins interact with each other, or indirectly when the SNPs are linked through other proteins). (B) Next, NeEDL picks either random seeds or seeds selected by a quantum-computing optimization algorithm. It then uses local search (i.e., either adding new SNPs from the direct neighborhood in the SSI network or removing SNPs that worsen the score) to find SNP sets of a user-specified maximum size that induce connected subgraphs in the SSI network and are locally optimal with respect to the statistical association of the higher-order genotype with the investigated phenotypic trait. To quantify association strength, NeEDL implements various statistical epistasis models suggested in the literature (8). Since plain local search can get stuck because it requires an improvement in every step, we use simulated annealing, which accepts a less significant intermediate solution with a probability that decreases over time.
Figure 2.
Quantitative evaluation of the SNP sets computed by NeEDL for Late-Onset Alzheimer’s Disease, Bipolar Disorder, Diabetes Type 2, and Rheumatoid Arthritis. Results for Coronary Artery Disease, Diabetes Type 1, Hypertension, and Inflammatory Bowel Disease can be found in Supplementary Figure S7. (A) Visualization of the maximum likelihood model score and the K2 score of the top 100 candidate SNP sets ranked by NeEDL. (B) A benchmark study between NeEDL, LinDen, MACOED, a second-order baseline (i.e., 1000 randomly sampled pairs of SNPs), and a higher-order baseline (i.e., 1000 randomly sampled sets consisting of multiple SNPs) shows that NeEDL outperforms existing epistasis detection tools in statistical significance with respect to four different evaluation metrics. (C) Analyzing the number of SNPs included in NeEDL’s output SNP sets reveals that the most promising SNP sets typically contain between three and seven SNPs. (D) Comparing maximum likelihood model scores of NeEDL results against those obtained using randomized networks demonstrates that the use of the SSI network indeed leads to the discovery of more promising SNP sets.
Figure 3.
Evaluation of the SNP sets computed by NeEDL and LinDen with shuffled phenotypes for Bipolar Disorder, Diabetes Type 2, and Rheumatoid Arthritis. (A) Maximum likelihood model and K2 scores for the top 100 SNP sets reported by NeEDL (ranked by the maximum likelihood model score). (B) Comparison of the scores of the top 50 results of NeEDL and LinDen and of the second- and higher-order baselines. As a reference, the scores of the original NeEDL runs without phenotype shuffling are also shown (yellow box plots). (C) SNP set size distribution of NeEDL’s results. (D) Maximum likelihood model scores obtained by NeEDL with and without network rewiring and phenotype shuffling.
Figure 4.
Replication studies in two independent data sets from UK Biobank with British and mixed ancestry. (A) P-values across discovery and the two replication studies. (B) Correlation of the MLM score between the discovery study and the two replication studies. (C) Replication of the benchmarking between NeEDL, LinDen, MACOED, a second-order baseline and a higher-order baseline in two replication data sets.
Figure 5.
Experiments on the quantum computers: Expected scaling of the quantum hardware. The different quantum and classical devices used to perform the optimization process show different scaling, with the quantum devices (Quantum Annealing, QAOA) having a higher slope than classical devices (thermal annealing, parallel tempering, and Gurobi), suggesting a possible scaling advantage.
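
The two steps outlined in the Figure 1 caption (building the SSI network, then running a local search with simulated annealing over connected SNP sets) can be illustrated with a minimal Python sketch. This is not the NeEDL implementation: the function names, the toy SNP and gene identifiers, the cooling schedule, and the toy score function are all assumptions made for illustration. The sketch also links SNPs only through the same gene or directly interacting genes, and it treats higher scores as better, whereas NeEDL uses statistical epistasis models.

```python
import itertools
import math
import random


def build_ssi_network(snp_to_genes, ppi_edges):
    """Link two SNPs if they map to the same gene or to directly interacting genes."""
    ppi = {frozenset(edge) for edge in ppi_edges}
    neighbors = {snp: set() for snp in snp_to_genes}
    for a, b in itertools.combinations(snp_to_genes, 2):
        linked = any(
            g == h or frozenset((g, h)) in ppi
            for g in snp_to_genes[a]
            for h in snp_to_genes[b]
        )
        if linked:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def is_connected(snps, neighbors):
    """Return True if the SNP set induces a connected subgraph of the network."""
    snps = set(snps)
    if not snps:
        return False
    start = next(iter(snps))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbors[stack.pop()] & snps:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == snps


def local_search(seed, neighbors, score, max_size=5, steps=200,
                 t0=1.0, cooling=0.98, rng=None):
    """Simulated-annealing local search over connected SNP sets (higher score is better)."""
    rng = rng or random.Random(0)
    current = frozenset([seed])
    current_score = score(current)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(steps):
        moves = []
        if len(current) < max_size:
            # Grow: add a SNP from the direct neighborhood of the current set.
            frontier = set().union(*(neighbors[s] for s in current)) - current
            moves += [current | {s} for s in frontier]
        if len(current) > 1:
            # Shrink: drop a SNP, but only if the remaining set stays connected.
            moves += [current - {s} for s in current
                      if is_connected(current - {s}, neighbors)]
        if not moves:
            break
        candidate = rng.choice(sorted(moves, key=sorted))
        delta = score(candidate) - current_score
        # Always accept improvements; accept worse sets with decaying probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling
    return best, best_score


# Toy data (hypothetical identifiers, not taken from the paper).
snp_to_genes = {"rs1": {"G1"}, "rs2": {"G1"}, "rs3": {"G2"}, "rs4": {"G3"}}
ppi_edges = [("G1", "G2")]
net = build_ssi_network(snp_to_genes, ppi_edges)

target = {"rs1", "rs3"}  # pretend these two SNPs carry the signal


def toy_score(snps):
    return len(snps & target) - 0.5 * len(snps - target)


best, best_score = local_search("rs2", net, toy_score, max_size=3)
print(sorted(best), best_score)
```

Because every move either adds a direct neighbor or removes a SNP whose removal keeps the set connected, each candidate stays a connected subgraph of the network, which mirrors the constraint described in the caption. The geometric cooling factor is one common choice and is not taken from the paper.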

Update of

  • Network medicine-based epistasis detection in complex diseases: ready for quantum computing.
    Hoffmann M, Poschenrieder JM, Incudini M, Baier S, Fitz A, Maier A, Hartung M, Hoffmann C, Trummer N, Adamowicz K, Picciani M, Scheibling E, Harl MV, Lesch I, Frey H, Kayser S, Wissenberg P, Schwartz L, Hafner L, Acharya A, Hackl L, Grabert G, Lee SG, Cho G, Cloward M, Jankowski J, Lee HK, Tsoy O, Wenke N, Pedersen AG, Bønnelykke K, Mandarino A, Melograna F, Schulz L, Climente-González H, Wilhelm M, Iapichino L, Wienbrandt L, Ellinghaus D, Van Steen K, Grossi M, Furth PA, Hennighausen L, Di Pierro A, Baumbach J, Kacprowski T, List M, Blumenthal DB. medRxiv [Preprint]. 2023 Nov 9:2023.11.07.23298205. doi: 10.1101/2023.11.07.23298205. Update in: Nucleic Acids Res. 2024 Sep 23;52(17):10144-10160. doi: 10.1093/nar/gkae697. PMID: 38076997.

References

    1. Heap G.A., Trynka G., Jansen R.C., Bruinenberg M., Swertz M.A., Dinesen L.C., Hunt K.A., Wijmenga C., Vanheel D.A., Franke L. Complex nature of SNP genotype effects on gene expression in primary human leucocytes. BMC Med. Genom. 2009; 2:1.
    2. Bush W.S., Moore J.H. Chapter 11: Genome-wide association studies. PLoS Comput. Biol. 2012; 8:e1002822.
    3. MacArthur J., Bowler E., Cerezo M., Gil L., Hall P., Hastings E., Junkins H., McMahon A., Milano A., Morales J. et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res. 2017; 45:D896–D901.
    4. Gibson G. Rare and common variants: twenty arguments. Nat. Rev. Genet. 2012; 13:135–145.
    5. Lippert C., Listgarten J., Davidson R.I., Baxter S., Poon H., Kadie C.M., Heckerman D. An exhaustive epistatic SNP association analysis on expanded Wellcome Trust data. Sci. Rep. 2013; 3:1099.