[Preprint]. 2020 Dec 22:2020.12.20.20248572.
doi: 10.1101/2020.12.20.20248572.

An atlas connecting shared genetic architecture of human diseases and molecular phenotypes provides insight into COVID-19 susceptibility

Liuyang Wang et al. medRxiv.

Abstract

While genome-wide association studies (GWAS) have successfully elucidated the genetic architecture of complex human traits and diseases, understanding the mechanisms that lead from genetic variation to pathophysiology remains an important challenge. Methods are needed to systematically bridge this crucial gap to facilitate experimental testing of hypotheses and translation to clinical utility. Here, we leveraged cross-phenotype associations to identify traits with shared genetic architecture, using linkage disequilibrium (LD) information to accurately capture shared SNPs by proxy, and calculated the significance of enrichment. This shared genetic architecture was examined across differing biological scales by incorporating data from catalogs of clinical, cellular, and molecular GWAS. We have created an interactive web database (interactive Cross-Phenotype Analysis of GWAS database (iCPAGdb); http://cpag.oit.duke.edu) to facilitate exploration and allow rapid analysis of user-uploaded GWAS summary statistics. This database revealed well-known relationships among phenotypes and generated novel hypotheses to explain the pathophysiology of common diseases. Application of iCPAGdb to a recent GWAS of severe COVID-19 demonstrated unexpected overlap of GWAS signals between COVID-19 and human diseases, including with idiopathic pulmonary fibrosis driven by the DPP9 locus. Transcriptomics from peripheral blood of COVID-19 patients demonstrated that DPP9 was induced in patients with SARS-CoV-2 infection compared to healthy controls or those with bacterial infection. Further investigation of cross-phenotype SNPs with severe COVID-19 demonstrated colocalization of the GWAS signal of the ABO locus with plasma protein levels of a reported receptor of SARS-CoV-2, CD209 (DC-SIGN), pointing to a possible mechanism whereby glycosylation of CD209 by ABO may regulate COVID-19 disease severity. Thus, connecting genetically related traits across phenotypic scales links human diseases to molecular and cellular measurements that can reveal mechanisms and lead to novel biomarkers and therapeutic approaches.
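As a rough illustration of "capturing shared SNPs by proxy", the sketch below counts lead SNPs of one trait that match a second trait either directly or through an LD proxy. It is not the iCPAGdb implementation: the function name `shared_signals`, the `proxies` lookup table, and the rsIDs are all hypothetical, and the proxy table is assumed to have been precomputed from a reference panel at some LD threshold.

```python
from typing import Dict, Set


def shared_signals(lead_a: Set[str], lead_b: Set[str],
                   proxies: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split trait A's lead SNPs that hit trait B into direct and proxy matches.

    `proxies` maps a SNP to the set of SNPs in LD with it (assumed precomputed).
    """
    # Direct overlap: the same SNP is a lead SNP for both traits.
    direct = lead_a & lead_b
    indirect = set()
    for snp in lead_a - direct:
        # Proxy overlap: a SNP in LD with this lead SNP is a lead SNP of trait B.
        if proxies.get(snp, set()) & lead_b:
            indirect.add(snp)
    return {"direct": direct, "indirect": indirect}


# Toy inputs with made-up rsIDs, for illustration only.
trait_a = {"rs1", "rs2", "rs3"}
trait_b = {"rs1", "rs7", "rs9"}
ld_proxies = {"rs2": {"rs9", "rs10"}, "rs3": {"rs11"}}

result = shared_signals(trait_a, trait_b, ld_proxies)
print(result)
```

Keeping direct and proxy matches separate makes it easy to compare how many shared signals each route contributes.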

Keywords: Hi-HOST; LD-score; PheWAS; colocalization; cross-phenotype association; gout; idiopathic pulmonary fibrosis; macular telangiectasia; pleiotropy; rs12610495; rs2869462; rs505922.

Conflict of interest statement

Competing interests: The author(s) declare no competing interests.

Figures

Figure 1. An improved method for finding shared genetic architecture of human traits.
(A) The overall framework of the iCPAGdb pipeline. GWAS summary statistics (from published GWAS datasets or from user-uploaded GWAS) undergo LD clumping to obtain a lead variant for each signal below a specified p-value threshold. These SNPs are queried against an LD proxy database generated from 1000 Genomes African, Asian, or European populations to identify cross-phenotype associations through direct overlap or LD proxy at R² > 0.4. Significance of overlap for each trait pair is calculated using Fisher’s exact test. Outputs can be visualized or downloaded from the iCPAGdb web browser. (B) Comparison of the number of shared SNPs for each NHGRI-EBI GWAS catalog trait pair identified through direct overlap vs. both direct and indirect (LD-proxy) overlap. (C) iCPAGdb detected more significant cross-phenotype associations than CPAG1 at FDR < 0.1. Expansion of the NHGRI-EBI GWAS catalog and improvements in capturing by LD proxy in iCPAGdb fueled a large increase in detected cross-phenotype associations across human traits. Comparisons between CPAG1 and iCPAGdb on the same 2013 dataset are in Fig. S3. (D) Circle plot of cross-phenotype associations detected by iCPAGdb in the NHGRI-EBI GWAS catalog. After excluding compound phenotypes (phenotypes described by the NHGRI-EBI GWAS catalog as > 1 comma-separated phenotype in their ontology), 1709 traits involved in 53314 cross-phenotype associations remained. These were categorized into 17 EFO Parental groups. Inner ribbons link phenotypes connected by cross-phenotype associations, with the width of each ribbon corresponding to the number of cross-phenotype associations. The axis outside the circle represents the cumulative number of associations for each group vs. all other groups. (E) Comparison of genetic correlation from LD score regression (LDSC) and the Chao-Sorensen similarity index implemented in iCPAG demonstrates significant correlation. The genetic correlations (rg) of 24 diseases/traits were obtained from (Bulik-Sullivan et al., 2015a). Since Chao-Sorensen values are bounded from 0 to 1 and rg ranges from −1 to 1, we used the absolute value of rg here. Colored * indicates a significant trait pair for LDSC, iCPAGdb, or both at a false discovery rate of 0.1. (F) A model demonstrating how SNPs regulate uric acid levels to impact the development of kidney stones and gout. (G) Riverplot of gout cross-phenotype associations generated from iCPAGdb output shows causal connections, comorbid outcomes, and regulators of disease. Mapped genes for SNPs associated with gout are shown on the left and connected to other NHGRI-EBI GWAS phenotypes grouped by EFO on the right.
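Panel A states that overlap for each trait pair is scored with Fisher's exact test. A minimal sketch of such a test follows. The caption does not say how the 2×2 table is built or which sidedness is used, so the `universe` of candidate independent signals and the one-sided (enrichment) alternative are assumptions here. The function name and the numbers are hypothetical.

```python
from math import comb


def fisher_exact_greater(shared: int, n_a: int, n_b: int, universe: int) -> float:
    """One-sided Fisher's exact p-value for enrichment of overlap.

    Returns P(X >= shared) where X is hypergeometric: drawing n_b signals
    from `universe` signals, of which n_a belong to trait A.
    `universe` is an assumed background size, not a value from the paper.
    """
    upper = min(n_a, n_b)
    total = comb(universe, n_b)
    # Sum the hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(comb(n_a, k) * comb(universe - n_a, n_b - k)
               for k in range(shared, upper + 1))
    return tail / total


# Toy example: 3 signals for trait A, 4 for trait B, 2 shared, 10 signals overall.
p_value = fisher_exact_greater(shared=2, n_a=3, n_b=4, universe=10)
print(round(p_value, 4))
```

With many trait pairs tested, the resulting p-values would still need multiple-testing correction, such as the FDR thresholds mentioned in panels C and E.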
Figure 2. iCPAGdb integrates GWAS of different scales to reveal biological insight.
(A) Multi-dataset network of cross-phenotype associations detected by iCPAGdb. Phenotypes that demonstrated significant overlap (FDR ≤ 0.1) are color-coded in the indicated colors. (B) Riverplot of macular telangiectasia type 2 (MacTel type 2) cross-phenotype associations generated from iCPAGdb shows causal connections, comorbid outcomes, and regulators of disease. (C) Cross-phenotype associations connecting MacTel type 2 and serine. One locus demonstrated direct SNP overlap (rs715). A second locus demonstrated indirect overlap based on 4 SNPs in LD, as visualized in the heatmap color-coded by LD. (D) A model for how SNPs regulate serine levels to impact pathogenesis of MacTel type 2, based on iCPAGdb and prior work described in the text. (E) Regional Miami colocalization plot demonstrates a genetic locus that impacts both CXCL10 levels in lymphoblastoid cell lines following Chlamydia trachomatis infection and CXCL9 (MIG) levels in whole blood. (F) Comparison of −log10(p value) for GWAS of CXCL10 following Chlamydia trachomatis infection and levels of CXCL9 (MIG) in whole blood. The lead SNP in the region for each phenotype is marked. (G) Scatter plot demonstrates a strong positive correlation between the effect coefficients of SNPs associated with cellular CXCL10 after Chlamydia trachomatis infection and those associated with blood CXCL9 levels. Each dot represents a SNP with p value < 0.01 for both phenotypes. A total of 413 SNPs from a 4-Mb window surrounding the lead SNP rs2869462 were selected. The blue vertical and red horizontal bars show the standard error of the beta value for each SNP.
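The effect-size comparison in panel G can be sketched as a filter followed by a correlation. The example below keeps SNPs with p < 0.01 in both phenotypes, as the caption describes, and then computes a Pearson correlation of their betas. The caption does not name the correlation statistic, so Pearson is an assumption, and the SNP records are invented toy values.

```python
from math import sqrt

P_CUTOFF = 0.01  # threshold stated in the caption


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy records: (snp_id, beta_1, p_1, beta_2, p_2). Values are made up.
toy_snps = [
    ("snp_a", 0.10, 1e-4, 0.20, 1e-3),
    ("snp_b", 0.20, 1e-5, 0.40, 5e-3),
    ("snp_c", 0.30, 1e-6, 0.60, 1e-4),
    ("snp_d", 0.90, 0.20, -0.50, 1e-4),  # dropped: p_1 is above the cutoff
]

# Keep only SNPs that pass the cutoff in both phenotypes.
kept = [(b1, b2) for _, b1, p1, b2, p2 in toy_snps
        if p1 < P_CUTOFF and p2 < P_CUTOFF]
r = pearson_r([b1 for b1, _ in kept], [b2 for _, b2 in kept])
print(len(kept), round(r, 3))
```

Nearby SNPs are correlated through LD, so a coefficient computed this way describes concordance of direction and magnitude rather than independent evidence per SNP.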
Figure 3. Cross-phenotype association of ABO reveals a possible role for CD209 in severe COVID-19.
(A) A network of genetic associations involving severe COVID-19. Each node represents either a disease/trait (filled circles) or a gene (dark blue diamond). The ABO locus was associated with multiple other diseases and levels of specific proteins, while DPP9 connects COVID-19 only with IPF and interstitial lung disease (idiopathic interstitial pneumonia). (B) Regional Miami colocalization plot demonstrates the ABO locus impacts both CD209 protein levels and risk of severe COVID-19. (C) A significant positive correlation for effect size of SNPs in the ABO locus on CD209 protein levels and risk of severe COVID-19. (D) Model of how ABO may affect CD209 and severe COVID-19.
Figure 4. Cross-phenotype analysis and COVID-19 patient transcriptomics reveal a role for DPP9 in severe COVID-19.
(A) Lung eQTL data from GTEx show that the rs12610495 “G” allele is associated with reduced expression of DPP9. (B) Regional Miami colocalization plot demonstrates that the DPP9 locus impacts both idiopathic pulmonary fibrosis and risk of severe COVID-19. (C) A significant positive correlation for effect size of SNPs in the DPP9 locus on idiopathic pulmonary fibrosis and risk of severe COVID-19. (D) Model of how DPP9 may affect idiopathic pulmonary fibrosis and risk of severe COVID-19. (E) DPP9 expression in peripheral blood is significantly higher in COVID-19 patients compared to healthy controls and patients with bacterial infection. The p values were calculated using the Wilcoxon rank-sum test. (F) COVID-19 patients demonstrate significantly higher DPP9 expression compared to healthy controls during early (days 1–10), middle (days 11–20), and late (21+ days) stages of SARS-CoV-2 infection. The p values were calculated using the Wilcoxon rank-sum test. (G) DPP9 demonstrates increased expression during recovery from COVID-19. A total of 11 patients were measured sequentially at enrollment (day 0), day 7, and day 14. The colored dashed lines connect measurements from the same patient across time points. The p value was calculated using the Friedman test. (H) Decreased symptom severity scores of COVID-19 patients over time. The eleven subjects in G were assessed for symptom severity at days 0, 7, and 14. The colored dashed lines connect measurements from the same patient across time points. The p value was calculated using the Friedman test.
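Panels G and H use the Friedman test on repeated measurements at three time points. A minimal sketch of that statistic is given below for exactly three time points, where the chi-square approximation with 2 degrees of freedom has the closed-form tail exp(−Q/2). The function name and the toy values are hypothetical. The sketch assumes no ties within a patient and applies no tie correction, and the chi-square approximation can be rough for small cohorts.

```python
from math import exp


def friedman_three_timepoints(rows):
    """Friedman statistic and approximate p-value for 3 repeated measures.

    rows: one (day0, day7, day14) tuple per patient, with no ties inside a row.
    """
    n, k = len(rows), 3
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the three measurements within each patient (1 = smallest).
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    # Chi-square survival function with k - 1 = 2 degrees of freedom.
    p = exp(-q / 2.0)
    return q, p


# Toy expression values for 4 hypothetical patients, rising at every visit.
toy_rows = [(1.0, 2.0, 3.0), (0.5, 1.5, 2.5), (2.0, 2.2, 3.1), (1.1, 1.9, 2.4)]
q_stat, p_approx = friedman_three_timepoints(toy_rows)
print(q_stat, round(p_approx, 4))
```

Because ranking happens within each patient, the test is insensitive to baseline differences between patients, which suits the paired design described in the caption.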

References

    Ahola-Olli A.V., Wurtz P., Havulinna A.S., Aalto K., Pitkanen N., Lehtimaki T., Kahonen M., Lyytikainen L.P., Raitoharju E., Seppala I., et al. (2017). Genome-wide Association Study Identifies 27 Loci Influencing Concentrations of Circulating Cytokines and Growth Factors. Am J Hum Genet 100, 40–50.
    Albanez S., Ogiwara K., Michels A., Hopman W., Grabell J., James P., and Lillicrap D. (2016). Aging and ABO blood type influence von Willebrand factor and factor VIII levels through interrelated mechanisms. J Thromb Haemost 14, 953–963.
    Allen R.J., Guillen-Guio B., Oldham J.M., Ma S.F., Dressen A., Paynton M.L., Kraven L.M., Obeidat M., Li X., Ng M., et al. (2020). Genome-Wide Association Study of Susceptibility to Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 201, 564–574.
    Amraie R., Napoleon M.A., Yin W., Berrigan J., Suder E., Zhao G., Olejnik J., Gummuluru S., Muhlberger E., Chitalia V., et al. (2020). CD209L/L-SIGN and CD209/DC-SIGN act as receptors for SARS-CoV-2 and are differentially expressed in lung and kidney epithelial and endothelial cells. bioRxiv.
    Amundadottir L., Kraft P., Stolzenberg-Solomon R.Z., Fuchs C.S., Petersen G.M., Arslan A.A., Bueno-de-Mesquita H.B., Gross M., Helzlsouer K., Jacobs E.J., et al. (2009). Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat Genet 41, 986–990.