Front Genet. 2021 Sep 16:12:652907.
doi: 10.3389/fgene.2021.652907. eCollection 2021.

Graph Representation Forecasting of Patient's Medical Conditions: Toward a Digital Twin

Pietro Barbiero et al. Front Genet. 2021.

Abstract

Objective: Modern medicine needs to shift from a wait-and-react, curative discipline to a preventative, interdisciplinary science that aims to provide patients with personalized, systemic, and precise treatment plans. To this end, we propose a "digital twin" of patients that models the human body as a whole and provides a panoramic view of an individual's condition.

Methods: We propose a general framework that combines advanced artificial intelligence (AI) approaches with mathematical modeling in order to provide a panoramic view of current and future pathophysiological conditions. Our modular architecture is based on a graph neural network (GNN) that forecasts clinically relevant endpoints (such as blood pressure) and a generative adversarial network (GAN) that provides a proof of concept of transcriptomic integrability.

Results: We tested our digital twin model on two simulated clinical case studies combining information at the organ, tissue, and cellular levels. Using the GNN model, we provided a panoramic overview of the patient's current and future conditions by monitoring and forecasting clinically relevant endpoints that represent the evolution of the patient's vital parameters. We showed how to use the GAN to generate multi-tissue expression data for blood and lung in order to find associations between cytokines conditioned on the expression of genes in the renin-angiotensin pathway. Our approach was to detect inflammatory cytokines, which are known to affect blood pressure and have previously been associated with SARS-CoV-2 infection (e.g., CXCR6, XCL1, and others).

Significance: The graph representation of a computational patient has the potential to solve important technological challenges in integrating multiscale computational modeling with AI. We believe that this work represents a step forward toward next-generation devices for precision and predictive medicine.

Keywords: digital twin; generative adversarial networks; graph representation learning; monitoring; precision medicine.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Architecture of the digital twin model. The generator receives a noise vector z, together with categorical (e.g., tissue type; q) and numerical (e.g., age; r) covariates, and outputs a vector of synthetic data (x̂). The critic receives data from two input streams (real, blue; and synthetic, red), a mask m indicating which components of the input vector are missing, and the numerical r and categorical q covariates. The critic produces an unbounded scalar ȳ that quantifies the degree of realism of the input samples from the two input streams. The handcrafted ODE system proposed in Barbiero and Lió (2020) is used to determine a graph representation of the patient's physiology. The message passing neural network updates latent node features to estimate global attributes describing the evolution of the underlying physiological system.
Figure 2
The digital twin model. Ordinary differential equations, graph neural networks, and generative adversarial networks are used synergistically to model the patient's conditions.
Figure 3
Generative Adversarial Network framework. The generator G(z) receives a vector z sampled from a noise prior distribution p_z, and generates a synthetic sample x_fake. The discriminator D(x) tries to distinguish real samples from fake samples, producing the probability of x coming from the real data distribution. The competition between the two players drives the game and makes both players increasingly better.
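The two-player game in this caption is commonly written as the minimax value function of the standard GAN formulation. The expression below is a sketch of that textbook objective, not necessarily the loss used in the paper: p_data (the real data distribution) is notation introduced here, and the critic in Figure 1 outputs an unbounded score, which suggests a Wasserstein-style objective rather than this one.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The first term rewards D for assigning high probability to real samples, and the second rewards D for assigning low probability to generated samples G(z), while G is trained to reduce that second term.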
Figure 4
Example of how a biological system can be modeled in a graph neural network through differential equations. First, an ordinary differential equation (ODE) system is derived from a biochemical reaction network. Then, the ODE system is solved for different initial conditions generating a set of trajectories for each variable. Finally, a graph neural network aggregates the information coming from neighbor nodes to update the current state of the variable.
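The three steps in this caption can be sketched on a toy system. Everything specific below is an assumption made for illustration: the reaction network A <-> B -> C, the rate constants, the forward Euler integrator, and the fixed averaging weights (a trained message passing network would learn its update function instead).

```python
import numpy as np

# Hypothetical toy reaction network A <-> B -> C with made-up rate constants.
K1, K2, K3 = 0.5, 0.2, 0.1

def rhs(state):
    """Step 1: right-hand side of the ODE system derived from the reactions."""
    a, b, _ = state
    return np.array([
        -K1 * a + K2 * b,         # dA/dt
        K1 * a - (K2 + K3) * b,   # dB/dt
        K3 * b,                   # dC/dt
    ])

def solve(initial_state, dt=0.01, steps=1000):
    """Integrate with forward Euler; returns an array of shape (steps + 1, 3)."""
    trajectory = [np.asarray(initial_state, dtype=float)]
    for _ in range(steps):
        trajectory.append(trajectory[-1] + dt * rhs(trajectory[-1]))
    return np.stack(trajectory)

# Step 2: one trajectory per initial condition.
initial_conditions = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 1.0, 0.0)]
trajectories = np.stack([solve(x0) for x0 in initial_conditions])

# Step 3: graph with an edge j -> i whenever variable j appears in dx_i/dt.
adjacency = np.array([
    [0.0, 1.0, 0.0],   # A receives messages from B
    [1.0, 0.0, 0.0],   # B receives messages from A
    [0.0, 1.0, 0.0],   # C receives messages from B
])

def message_passing_step(node_features, weight_self=0.5, weight_neigh=0.5):
    """Mix each node's features with the mean of its neighbors' features."""
    degree = adjacency.sum(axis=1, keepdims=True)
    neighbor_mean = adjacency @ node_features / degree
    return weight_self * node_features + weight_neigh * neighbor_mean

# Node features: the first 10 time points of each variable, shape (3, 10).
node_features = trajectories[0, :10].T
updated = message_passing_step(node_features)
```

Deriving the adjacency matrix from which variables appear in each equation is one simple way to turn an ODE system into a graph; the paper may construct its graph differently.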
Figure 5
Two clinical case studies represented in a projected heart phase-space. The first case study (left) shows the effect of a therapeutic intervention comprising increased physical exercise, reduced calorie intake, and the prescription of a daily dose of Benazepril (5 mg). The second simulation (right) shows the long-term impact on blood pressure of an untreated SARS-CoV-2 infection (red density) and the effects of a therapy including both Benazepril (5 mg/day) and intravenous injection of heparin (5000 μ/ml) (orange density). (Top) Bundles of predicted trajectories can be visualized and monitored in real time in order to investigate patterns in the time domain. The simulation shows blood pressure in heart chambers starting from healthy state conditions. Error bands represent 95% CI (Bottom).
Figure 6
Bootstrapped R² scores for genes involved in the renin-angiotensin system for lung, heart (left ventricle), kidney (cortex), and pancreas. The input variables are the expressions of genes in whole blood belonging to the chemokine, TNF, and TGF-β pathways.
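A bootstrapped R² score can be obtained by refitting or rescoring a predictor on samples drawn with replacement. The sketch below assumes an ordinary least-squares regressor scored in-sample on each resample; the caption does not say which regressor or resampling scheme was used, and the data here are random placeholders rather than expression values.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def bootstrapped_r2(features, target, n_boot=200, seed=0):
    """Fit least squares once, then score it on bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n_samples = len(target)
    design = np.column_stack([features, np.ones(n_samples)])  # add intercept
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    predictions = design @ coef
    scores = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # with replacement
        scores[b] = r2_score(target[idx], predictions[idx])
    return scores

# Placeholder data: 3 hypothetical blood genes predicting 1 tissue gene.
rng = np.random.default_rng(1)
blood_expression = rng.normal(size=(100, 3))
tissue_gene = (blood_expression @ np.array([1.0, -2.0, 0.5])
               + 0.1 * rng.normal(size=100))

scores = bootstrapped_r2(blood_expression, tissue_gene)
low, high = np.percentile(scores, [2.5, 97.5])  # 95% percentile interval
```

The spread of `scores` is what a box or violin per gene would summarize; scoring on held-out samples instead of in-sample would give a less optimistic estimate.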
Figure 7
Bootstrapped R² scores for several cytokines and receptors for lung, heart (left ventricle), kidney (cortex), and pancreas. For each tissue type, we show the top 20 predicted cytokines. The input variables are the expressions of genes in whole blood belonging to the chemokine, TNF, and TGF-β pathways.
Figure 8
Distribution of missing tissues per GTEx patient. This plot only considers 4 tissue types (whole blood, lung, kidney (cortex), and pancreas).
Figure 9
Pairwise Pearson correlations between genes in the renin-angiotensin system pathway in lung for real (left) and synthetic (right) data. The correlations in the lower and upper matrices are computed from samples with low (61 samples) and high (60 samples) ACE2 expression, respectively. We use dots to label statistically significant correlations (two-sided p-value < 0.05).
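The layout described in this caption, with one group in the lower triangle and the other in the upper triangle, can be sketched as follows. The median split on the stratifying gene is an assumption (the caption gives only the group sizes, 61 and 60), the data are random placeholders, and the significance dots are omitted; they would need per-pair p-values, for example from `scipy.stats.pearsonr`.

```python
import numpy as np

def split_correlation_matrix(expression, split_gene):
    """Combine low-group (lower triangle) and high-group (upper triangle)
    Pearson correlations into one matrix.

    expression: array of shape (samples, genes).
    split_gene: column index of the gene used to stratify the samples.
    """
    order = np.argsort(expression[:, split_gene])
    n_low = (len(order) + 1) // 2            # assumed median split
    low = expression[order[:n_low]]
    high = expression[order[n_low:]]
    corr_low = np.corrcoef(low, rowvar=False)
    corr_high = np.corrcoef(high, rowvar=False)
    combined = np.tril(corr_low, k=-1) + np.triu(corr_high, k=1)
    np.fill_diagonal(combined, 1.0)
    return combined, len(low), len(high)

# Placeholder data: 121 samples and 5 hypothetical genes; column 0 plays
# the role of the stratifying gene (ACE2 in the caption).
rng = np.random.default_rng(0)
toy_expression = rng.normal(size=(121, 5))
combined, n_low, n_high = split_correlation_matrix(toy_expression, split_gene=0)
```

With 121 samples, this split yields 61 low-expression and 60 high-expression samples, matching the group sizes stated in the caption.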
Figure 10
Bootstrapped R² scores for chemokines in blood. The input variables are the expressions of 21 genes belonging to the renin-angiotensin system pathway in lung. This plot shows the top 20 predicted chemokines (out of 170). The transcriptomics data was generated by our GAN. Importantly, some of the top predicted chemokines (e.g., CXCR6) have been previously associated with SARS-CoV-2 (Liao et al., 2020).
Figure 11
Pairwise correlations between inflammatory cytokines in the 4 modeled tissue types. We use dots to label statistically significant correlations (two-sided p-value < 0.05).
Figure 12
Principal component analysis of the multi-tissue expression of 100 synthetic patients for different levels of ACE2 expression. Each line corresponds to a unique patient. For each patient, we fix all the latent covariates and modify the levels of ACE2 in lung. Overexpressing ACE2 leads to changes in the expression of other genes and these changes follow a well-defined trajectory.
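The experiment in this caption (fix a patient's latent covariates, sweep one gene, project with PCA) can be sketched with a stand-in generator. The linear `toy_generator`, its random weights, and all sizes except the 100 patients are hypothetical; the paper's trained GAN is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_levels, n_latent, n_genes = 100, 10, 8, 50

# Hypothetical stand-in for a trained conditional generator: a fixed linear
# map from latent covariates plus an additive effect of the swept gene.
mixing = rng.normal(size=(n_latent, n_genes))
ace2_effect = rng.normal(size=n_genes)

def toy_generator(latent, ace2_level):
    """Return a synthetic expression vector for one patient and one level."""
    return latent @ mixing + ace2_level * ace2_effect

latents = rng.normal(size=(n_patients, n_latent))   # fixed per patient
ace2_levels = np.linspace(-2.0, 2.0, n_levels)      # swept for each patient

# expression has shape (patients, levels, genes)
expression = np.stack([
    np.stack([toy_generator(z, level) for level in ace2_levels])
    for z in latents
])

def pca_project(data, n_components=2):
    """Project rows of data onto the top principal components via SVD."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

flat = expression.reshape(-1, n_genes)
lines = pca_project(flat).reshape(n_patients, n_levels, 2)  # one line per patient
```

Because this stand-in is linear, every patient's line is a parallel straight segment; a trained nonlinear generator could produce curved trajectories like those the caption describes.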

References

    1. Abadi M., Barham P., Chen J., Chen Z., Davis A., Dean J., et al. (2016). TensorFlow: a system for large-scale machine learning, in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (Savannah, GA), 265–283.
    2. Aguet F., Barbeira A. N., Bonazzola R., Brown A., Castel S. E., Jo B., et al. (2019). The GTEx consortium atlas of genetic regulatory effects across human tissues. bioRxiv [Preprint]. doi: 10.1101/787903
    3. Arjovsky M., Chintala S., Bottou L. (2017). Wasserstein GAN. arXiv [Preprint]. arXiv:1701.07875.
    4. Bangalore S., Maron D. J., O'Brien S. M., Fleg J. L., Kretov E. I., Briguori C., et al. (2020). Management of coronary disease in patients with advanced kidney disease. N. Engl. J. Med. 382, 1608–1618. doi: 10.1056/nejmoa1915925
    5. Barbiero P., Lió P. (2020). The computational patient has diabetes and a covid. arXiv [Preprint]. arXiv:2006.06435.