Commun Biol. 2025 Jun 5;8(1):874.
doi: 10.1038/s42003-025-08157-x.

Detecting genetic interactions with visible neural networks

Arno van Hilten et al. Commun Biol.

Abstract

Non-linear interactions among single nucleotide polymorphisms (SNPs), genes, and pathways play an important role in human diseases, but identifying these interactions is a challenging task. Neural networks are state-of-the-art predictors in many domains due to their ability to analyze big data and model complex patterns, including non-linear interactions. In genetics, visible neural networks are popular because they provide insight into the SNPs, genes, and pathways that matter most for prediction. Visible neural networks use prior knowledge (e.g., gene and pathway annotations) to define node connections in the network, making them sparse and interpretable. Currently, most of these networks provide measures of the importance of SNPs, genes, and pathways but do not provide information about interactions. In this paper, we explore different methods to detect non-linear interactions with visible neural networks. We adapt and speed up existing methods, create a comprehensive benchmark with simulated data from GAMETES and EpiGEN, and demonstrate that these methods can extract multiple types of interactions from trained neural networks. Finally, we apply these methods to a genome-wide case-control study of inflammatory bowel disease and find high consistency of the candidate epistasis pairs between interpretation methods. The follow-up association test on these candidates identifies seven significant epistasis pairs.
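
As an illustration of how prior knowledge can define node connections, the following is a minimal sketch of a single annotation-masked layer. It is not the authors' implementation: the SNP and gene names, the `build_mask` and `masked_layer` helpers, the `tanh` activation, and the random weights are all assumptions made for the example.

```python
import numpy as np

# Hypothetical annotation: each SNP is assigned to exactly one gene.
snp_to_gene = {"snp0": "geneA", "snp1": "geneA",
               "snp2": "geneB", "snp3": "geneB", "snp4": "geneB"}
snps = sorted(snp_to_gene)
genes = sorted(set(snp_to_gene.values()))


def build_mask(snps, genes, snp_to_gene):
    """Binary (n_snps, n_genes) matrix: 1 where the annotation allows an edge."""
    mask = np.zeros((len(snps), len(genes)))
    for i, snp in enumerate(snps):
        mask[i, genes.index(snp_to_gene[snp])] = 1.0
    return mask


def masked_layer(x, weights, mask, bias):
    """Dense layer whose weights are zeroed wherever the mask is 0."""
    return np.tanh(x @ (weights * mask) + bias)


rng = np.random.default_rng(0)
mask = build_mask(snps, genes, snp_to_gene)
weights = rng.normal(size=mask.shape)   # untrained, illustrative weights
bias = np.zeros(len(genes))

# Toy genotypes coded as 0, 1, or 2 copies of the minor allele.
genotypes = rng.integers(0, 3, size=(4, len(snps))).astype(float)
gene_activations = masked_layer(genotypes, weights, mask, bias)
```

Because each gene node only receives input from its annotated SNPs, the layer has far fewer effective weights than a dense layer, and each node's activation can be read as a gene-level signal.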

Conflict of interest statement

Competing interests: Wiro Niessen is co-founder and shareholder of Quantib BV. Other authors declare no competing interests.

Figures

Fig. 1. An overview of the post-hoc interpretation methods applied in this study to detect interactions in visible neural networks.
(a) Comparing the relative weights of the one-hot encoded input for each SNP reveals the model that the neural network is using for that particular SNP (e.g., linearly spaced weights indicate an additive model). (b) PathExplain applies Integrated Gradients to itself to find the Expected Hessians, which can be used to find interactions between inputs. (c) RLIPP detects whether a node behaves non-linearly: the activations into and out of this neuron are regressed to the output with linear regression to provide an estimate of the non-linear gain of that node. (d) NID assumes that edges with strong weights are more likely to interact with each other than edges with low absolute weights. (e) DFIM compares DeepLIFT's attribution scores for all features before and after a feature of interest is perturbed, revealing all features that interact with the feature of interest.
Fig. 2. GAMETES.
(a) The prAUC, with confidence interval, of the various epistasis interpretation methods. (b) The average prAUC of the methods for different thresholds of prediction AUC in the test set; there is a clear trend of better prAUC with better prediction AUC. (c) Correlation plot showing the correlation between the prAUC of the various methods and the prediction AUC of the NN and LGBM (AUC NN; AUC NN OneHot; AUC LGBM).
Fig. 3. EPIGEN.
(a) The mean prAUC of the various methods is compared, with the confidence interval displayed. (b) The mean prAUC of each method is displayed per type of interaction. (c) Each dot is the average prAUC of the methods that have a prediction AUC equal to or greater than the number on the x-axis.
Fig. 4. UpSet plot showing the intersections of our eight interpretation approaches (seven epistasis methods: NID; DFIM; Pathfinder with/without the one-hot module, and LGBM's feature interaction measure; plus LGBM feature importance) with the known variants from DisGeNet for IBD and Crohn's disease.
Each vertical bar shows the number of overlapping pairs between the highlighted method(s). (a) For each approach, the top-100 SNPs with the highest importance score were evaluated. The horizontal bars represent the number of SNPs included in each analysis, whereas the vertical bars show the overlap between analyses. (b) The top-100 SNPs were mapped to genes positionally (as explained in the Methods section), and the intersection is shown. (c) The genes shared between at least one approach and one DisGeNet list are highlighted.
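
The weight-based assumption described for NID in Fig. 1 can be sketched as follows. This is a simplified pairwise variant written for illustration, not the published NID algorithm or the authors' code: the function name `pairwise_weight_scores`, the single-hidden-layer setup, and the toy weights are assumptions. Each hidden unit contributes the smaller of the two absolute input weights, scaled by the absolute weight from that unit to the output.

```python
from itertools import combinations

import numpy as np


def pairwise_weight_scores(w1, w_out):
    """Score every input pair from the weights of a one-hidden-layer network.

    w1:    (n_hidden, n_inputs) first-layer weights.
    w_out: (n_hidden,) weights from the hidden units to the output.
    Returns a dict mapping (i, j) to a non-negative interaction score.
    """
    abs_w1 = np.abs(w1)
    influence = np.abs(w_out)
    scores = {}
    for i, j in combinations(range(abs_w1.shape[1]), 2):
        # A hidden unit can only combine two inputs if both edges are strong,
        # so the weaker edge bounds the pair's strength at that unit.
        pair_strength = np.minimum(abs_w1[:, i], abs_w1[:, j])
        scores[(i, j)] = float(pair_strength @ influence)
    return scores


# Toy weights: hidden unit 0 has strong edges from inputs 0 and 1.
w1 = np.array([[2.0, 1.5, 0.1],
               [0.1, 0.2, 1.0]])
w_out = np.array([1.0, 0.5])

scores = pairwise_weight_scores(w1, w_out)
top_pair = max(scores, key=scores.get)
```

In this toy setting the pair of inputs 0 and 1 ranks highest, because both have strong edges into the same influential hidden unit. Ranked pairs of this kind are candidates only and would still need a statistical follow-up test.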
