IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):352-359.
doi: 10.1109/TCBB.2022.3146795. Epub 2023 Feb 6.

Expectile Neural Networks for Genetic Data Analysis of Complex Diseases

Jinghang Lin et al. IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb.

Abstract

The genetic etiologies of common diseases are highly complex and heterogeneous. Classic methods, such as linear regression, have successfully identified numerous variants associated with complex diseases. Nonetheless, for most diseases, the identified variants account for only a small proportion of heritability, and discovering additional variants that contribute to complex diseases remains a challenge. Expectile regression is a generalization of linear regression and provides complete information on the conditional distribution of a phenotype of interest. Although expectile regression has many attractive properties, it has rarely been used in genetic research. In this paper, we develop an expectile neural network (ENN) method for genetic data analyses of complex diseases. Similar to expectile regression, ENN provides a comprehensive view of relationships between genetic variants and disease phenotypes, which can be used to discover variants predisposing to sub-populations. We further integrate the idea of neural networks into ENN, making it capable of capturing non-linear and non-additive genetic effects (e.g., gene-gene interactions). Through simulations, we show that the proposed method outperforms an existing expectile regression method when genotype-phenotype relationships are complex. We also apply the proposed method to data from the Study of Addiction: Genetics and Environment (SAGE), investigating the relationships of candidate genes with smoking quantity.
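As a rough illustration of the expectile idea the abstract relies on, the sketch below uses the standard asymmetric least squares loss of Newey and Powell (1987): squared residuals are weighted by tau when positive and by 1 - tau otherwise. It is not the authors' implementation. The function names `expectile_loss` and `sample_expectile` and the toy values are hypothetical, and the sketch fits only a constant to a sample, whereas an ENN would presumably apply the same kind of loss to the residuals of a neural network's output.

```python
def expectile_loss(residuals, tau):
    """Mean asymmetric squared loss: weight tau if residual >= 0, else 1 - tau."""
    total = 0.0
    for r in residuals:
        weight = tau if r >= 0 else 1.0 - tau
        total += weight * r * r
    return total / len(residuals)


def sample_expectile(values, tau, tol=1e-10, max_iter=1000):
    """Tau-expectile of a sample, found by iteratively reweighted means.

    The minimizer e of the loss satisfies sum(w_i * (v_i - e)) = 0, where
    w_i is tau for values above e and 1 - tau otherwise, so each step
    recomputes e as the weighted mean under the current weights.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    estimate = sum(values) / len(values)  # tau = 0.5 gives the ordinary mean
    for _ in range(max_iter):
        weights = [tau if v > estimate else 1.0 - tau for v in values]
        updated = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(updated - estimate) < tol:
            return updated
        estimate = updated
    return estimate


# Hypothetical toy phenotype values, for illustration only.
toy_values = [0.0, 1.0, 2.0, 3.0, 10.0]
for level in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(level, sample_expectile(toy_values, level))
```

At tau = 0.5 the loss reduces to ordinary least squares, so the expectile equals the sample mean; moving tau toward 0 or 1 shifts the fit toward the lower or upper part of the distribution, which is how a set of expectile levels gives a fuller view of a phenotype than the mean alone.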

Figures

Fig. 1. A graphical representation of expectile neural network with one hidden layer.
Fig. 2. Performance comparison between ENN and ER under various non-linear relationships between genotypes and phenotypes and different expectiles (i.e., 0.1, 0.25, 0.5, 0.75, and 0.9). ENN: expectile neural network; ER: expectile regression; TR: training; TS: testing.
Fig. 3. Performance comparison between ENN and ER for different types of interactions and different expectiles (i.e., 0.1, 0.25, 0.5, 0.75, and 0.9). ENN: expectile neural network; ER: expectile regression; TR: training; TS: testing.
Fig. 4. An alternative architecture, a non-fully connected architecture, for gene-gene interaction analysis.
Fig. 5. Performance comparison between ENN with a fully connected architecture and ENN with a non-fully connected architecture for gene-gene interaction analysis. FUL: ENN with a fully connected architecture; NONFUL: ENN with a non-fully connected architecture; TR: training; TS: testing.
Fig. 6. Performance comparison between ENN and QRNN under asymmetric, normal and heteroscedastic settings. ENN: expectile neural network; TR: training; TS: testing.
Fig. 7. The conditional distribution of smoking quantity for five expectile levels (i.e., 0.1, 0.25, 0.5, 0.75, and 0.9).
Fig. 8. The conditional distribution of CPD considering the interaction between CHRNA5 and CHRNA3.
Fig. 9. The conditional distribution of CPD considering the interaction between CHRNA5 and CHRNB4.
Fig. 10. The conditional distribution of CPD considering the interaction between CHRNB4 and CHRNA3.
