eLife. 2021 Aug 17;10:e69719.
doi: 10.7554/eLife.69719.

A proteome-wide genetic investigation identifies several SARS-CoV-2-exploited host targets of clinical relevance

Mohd Anisul et al. eLife.

Abstract

Background: The virus SARS-CoV-2 can exploit biological vulnerabilities (e.g. host proteins) in susceptible hosts that predispose to the development of severe COVID-19.

Methods: To identify host proteins that may contribute to the risk of severe COVID-19, we undertook proteome-wide genetic colocalisation tests, and polygenic (pan) and cis-Mendelian randomisation analyses leveraging publicly available protein and COVID-19 datasets.

Results: Our analytic approach identified several known targets (e.g. ABO, OAS1), but also nominated new proteins such as soluble Fas (colocalisation probability > 0.9, p = 1 × 10⁻⁴), implicating Fas-mediated apoptosis as a potential target for COVID-19 risk. The polygenic (pan) and cis-Mendelian randomisation analyses showed consistent associations of genetically predicted ABO protein with several COVID-19 phenotypes. The ABO signal is highly pleiotropic, and a look-up of proteins associated with the ABO signal revealed that the strongest association was with soluble CD209. We demonstrated experimentally that CD209 directly interacts with the spike protein of SARS-CoV-2, suggesting a mechanism that could explain the ABO association with COVID-19.

Conclusions: Our work provides a prioritised list of host targets potentially exploited by SARS-CoV-2 and is a precursor for further research on CD209 and FAS as therapeutically tractable targets for COVID-19.

Funding: MAK, JSc, JH, AB, DO, MC, EMM, MG, ID were funded by Open Targets. JZ and TRG were funded by the UK Medical Research Council Integrative Epidemiology Unit (MC_UU_00011/4). JSh and GJW were funded by the Wellcome Trust Grant 206194. This research was funded in part by the Wellcome Trust [Grant 206194]. For the purpose of open access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Keywords: COVID-19; apoptosis; epidemiology; genetic colocalization; genetics; genomics; global health; human; mendelian randomization; proteins.

Plain language summary

Individuals who become infected with the virus that causes COVID-19 can experience a wide variety of symptoms. These can range from no symptoms or minor symptoms to severe illness and death. Key demographic factors, such as age, gender and race, are known to affect how susceptible an individual is to infection. However, molecular factors, such as unique gene mutations and gene expression levels, can also have a major impact on patient responses by affecting the levels of proteins in the body. Proteins that are too abundant or too scarce may mean the difference between dying from or surviving COVID-19.
Identifying the molecular factors in a host that affect how viruses can infect individuals, evade immune defences or trigger severe illness could provide new ways to treat patients with COVID-19. Such factors are likely to remain constant, even when the virus mutates into new strains. Hence, insights would likely apply across all virus strains, including current strains, such as alpha and delta, and any new strains that may emerge in the future.
Using a ‘natural experiment’ approach, Karim et al. compared the genetic profiles of over 30,000 COVID-19 patients and a million healthy individuals. Nine proteins were found to have an impact on COVID-19 infection and disease severity. Four proteins were ranked as top priorities for potential treatment targets. One protein, called CD209 (also known as DC-SIGN), is involved in how the virus enters the host cells, and had one of the strongest associations with COVID-19. Two proteins, called IL-6R and FAS, were involved in the immune response and could be responsible for the immune over-activation often seen in severe COVID-19. Finally, one protein, called OAS1, formed part of the body’s innate antiviral defence system and appeared to reduce susceptibility to COVID-19.
Knowing more about the proteins that influence the severity of COVID-19 opens up new ways to predict, protect and treat patients who may have severe or fatal reactions to infection. Indeed, one of the identified proteins (IL-6R) had already been targeted in recent clinical trials with some encouraging results. Considering CD209 as a potential receptor for the virus could provide another avenue for therapeutics, similar to previously successful approaches to block the virus’ known interaction with a receptor protein. Ultimately, this research could supply an entirely new set of treatment options to help combat the COVID-19 pandemic.

Conflict of interest statement

MA, JS, JH, AB, JZ, DO, MC, VE, VG, EM, GW, MG: Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners.
JS: none.
ES: Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners. ES is also a full-time employee of Bristol-Myers Squibb.
MH: Dr Holmes has consulted for Boehringer Ingelheim, and in adherence to the University of Oxford’s Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU) staff policy, did not accept personal honoraria or other payments from pharmaceutical companies.
JM: JM is a full-time employee of Bristol-Myers Squibb and retains stock or stock options in Bristol-Myers Squibb. The author has no other competing interests to declare.
TG: TG received grants from Biogen and GlaxoSmithKline. The author has no other competing interests to declare.
ID: Open Targets is a pre-competitive partnership currently involving the Wellcome Sanger Institute, EMBL-EBI, BMS, GSK, and Sanofi. Research is funded by financial and in-kind contributions from each of the partners. ID also received travel costs within the last 36 months from Takeda for speaking at their Reverse Translation Symposium. The author has no other competing interests to declare.

Figures

Figure 1. Flowcharts illustrating the process of (A) pan-Mendelian randomisation (MR) and (B) cis-MR and genetic colocalisation.
Both pan- and cis-MR methods used Sun et al. (2018) as the source of genetic instruments and the UK Biobank downsampled 10k (UKBd10k) individual genotype data as the reference panel. We selected near-independent genetic instruments and performed two-sample MR analysis using generalised summary data-based Mendelian randomisation, which adjusted for residual correlation between instruments. Genetic colocalisation analysis was used to estimate posterior probabilities of a shared causal genetic signal between protein and outcomes. A posterior probability of a shared causal genetic signal of more than 0.6 (i.e. a PP.H4 or posterior probability for hypothesis 4 > 0.6) was used as evidence of genetic colocalisation. The dashed line separates analysis (above the line) from target curation (below the line). *Only three proteins with pan-MR evidence of association with COVID-19 also had cis-MR support at a nominal cis-MR p-value < 0.05.
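To make the two-sample MR step more concrete, the following is a minimal sketch of a fixed-effect inverse-variance weighted (IVW) estimate. This is a simpler stand-in, not the generalised summary data-based method named in the caption, which additionally adjusts for residual correlation between instruments. The function name and all numbers are hypothetical and chosen only for illustration.

```python
import math

def ivw_mr(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    beta_exposure: SNP effects on protein level (SD units)
    beta_outcome:  SNP effects on the outcome (log odds)
    se_outcome:    standard errors of beta_outcome
    Assumes independent instruments (no residual LD correlation).
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    total_weight = sum(bx ** 2 / se ** 2
                       for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / total_weight          # log odds per SD of protein
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error

# Made-up summary statistics for three hypothetical instruments.
bx = [0.30, 0.25, 0.40]
by = [0.030, 0.020, 0.045]
se = [0.010, 0.012, 0.015]
estimate, std_error = ivw_mr(bx, by, se)
print(f"log OR per SD = {estimate:.3f} (SE {std_error:.3f})")
```

With a single instrument the estimate reduces to the ratio of the outcome effect to the exposure effect, which is a quick sanity check on the weighting.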
Figure 2. Forest plot illustrating associations of genetically predicted plasma protein concentrations with selected COVID-19 phenotypes.
The black point estimates represent odds ratios (ORs) of COVID-19 outcome per standard deviation (SD) increase of genetically predicted protein abundance using genetic instruments from across the genome (pan-Mendelian randomisation [pan-MR]). The blue point estimates represent OR of COVID outcome per SD increase of genetically predicted protein abundance using genetic instruments near or in the gene encoding the protein (cis-MR). Error bars represent 95% confidence intervals (95% CI). The areas of the squares are proportional to the inverse of the variance of the log ORs. For each COVID phenotype, pan-MR associations at FDR 5% were retained. Each row under a COVID phenotype represents a pQTL and includes the number of cases in the COVID phenotype (nCases), the number of SNPs used as genetic instruments for the protein (nSNPs), the posterior probability that protein and COVID traits colocalise (PP.H4), the posterior probability evidence for vs. against shared causal variants (log2(H4/H3)), and the candidate colocalising signal (coloc_SNP). * denotes proteins that have coloc_SNP that are either missense variants or in linkage disequilibrium with missense variants, rendering their effect estimates potentially biased.
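The quantities in this caption can be sketched in a few lines: converting a log odds ratio and its standard error into an OR with a 95% confidence interval, and retaining associations at an FDR of 5%. The caption does not name the FDR procedure, so Benjamini-Hochberg is an assumption here, and the function names and p-values are hypothetical.

```python
import math

def odds_ratio_ci(log_or, se, z=1.96):
    """Return (OR, lower, upper) for a log odds ratio and its standard error."""
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))

def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the test is retained at the given FDR.

    Assumes the Benjamini-Hochberg step-up procedure; the source only says
    'FDR 5%'.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            max_rank = rank                      # largest rank passing its threshold
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            keep[idx] = True
    return keep

# Made-up values for illustration only.
print(odds_ratio_ci(0.10, 0.03))
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.20]))
```

The step-up rule retains every test ranked at or below the largest rank whose p-value passes its threshold, so a test can be retained even if its own p-value misses its own rank-specific cut-off.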
Figure 3. Proteome-wide association of the ABO signal (rs8176719-insC) in (A) Sun et al. and (B) Emilsson et al. datasets.
The x-axis represents the chromosome for the gene encoding the protein. The y-axis represents the p-value of the per-allele association of rs8176719-insC (or an SNP in high linkage disequilibrium at r² > 0.8 with rs8176719-insC) with the proteins in Sun et al. and Emilsson et al. datasets. The red triangles point downwards and denote the inverse association of the ABO signal with the protein. The blue triangles point upwards and denote the positive association of the ABO signal with the protein. Only proteins that were considered significant at the study-specific Bonferroni-corrected p-value thresholds are displayed in this plot and tabulated in Supplementary file 6. (Supplementary file 6 also reports associations from an additional protein dataset – Suhre et al.).
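The selection rule behind this plot can be sketched as a Bonferroni filter followed by a direction label. The base significance level of 0.05, the function name, and the protein labels below are assumptions for illustration; the caption only states that study-specific Bonferroni-corrected thresholds were used.

```python
def bonferroni_hits(associations, alpha=0.05):
    """Filter (protein, beta, p_value) tuples at a Bonferroni-corrected threshold.

    The threshold is alpha divided by the number of proteins tested, so it
    differs between datasets of different size. alpha=0.05 is an assumption.
    """
    threshold = alpha / len(associations)
    hits = []
    for protein, beta, p_value in associations:
        if p_value < threshold:
            direction = "positive" if beta > 0 else "inverse"
            hits.append((protein, direction))
    return threshold, hits

# Hypothetical per-allele associations with placeholder protein names.
toy = [("protein_A", 0.40, 0.001),
       ("protein_B", 0.10, 0.03),
       ("protein_C", -0.20, 1e-8)]
threshold, hits = bonferroni_hits(toy)
print(threshold, hits)
```

In the plot's terms, a "positive" hit would be drawn as an upward blue triangle and an "inverse" hit as a downward red triangle.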
Figure 4. In vitro binding experiments with purified SARS-CoV-2 spike protein confirm human CD209 as a functional binding target.
(A) Human cell lines overexpressing cell-surface CD209 protein gain the ability to specifically bind SARS-CoV-2 spike. The density plots represent flow cytometry measurements of HEK293 cells stained with fluorescently conjugated tetramers of SARS-CoV-2 spike protein or a tag-only protein control. Blue distributions are cells with surface CD209, while red are control-transfected cells. Light shades indicate a negative control tetramer that was used for staining, while dark shades are stained with spike protein. (B) Purified recombinant CD209 ectodomains interact with the spike protein of SARS-CoV-2 in an in vitro binding assay. A dilution series of purified spike protein was applied over immobilised CD209, ACE2 (positive control), or a negative control protein. A plot of quantified absorbance is displayed alongside a representative assay plate. Error bars are standard deviations of two replicates.
Figure 5. Forest plot illustrating associations of genetically predicted plasma protein concentrations that colocalised with the selected COVID-19 phenotypes (PP.H4 > 0.6).
The black point estimates represent odds ratios (ORs) of COVID-19 outcome per standard deviation (SD) increase of genetically predicted protein abundance using single-SNP colocalising signals (coloc_SNP). Error bars represent the 95% confidence interval around the estimates. The areas of the squares are proportional to the inverse of the variance of the log ORs. * denotes proteins that have coloc_SNP that are either missense variants or in linkage disequilibrium with missense variants, rendering their effect estimates potentially biased.
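A single-SNP estimate of this kind is commonly computed as a Wald ratio, which is assumed in the sketch below since the caption does not name the estimator. The standard error uses the first-order approximation that ignores uncertainty in the SNP-protein effect. The function names and all numbers are hypothetical; only the PP.H4 > 0.6 cut-off comes from the caption.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP MR estimate (log odds per SD of protein) and its approximate SE.

    First-order approximation: treats the SNP-protein effect as known exactly.
    """
    estimate = beta_outcome / beta_exposure
    std_error = se_outcome / abs(beta_exposure)
    return estimate, std_error

def colocalised(pp_h4, cutoff=0.6):
    """Apply the PP.H4 > 0.6 colocalisation rule stated in the caption."""
    return pp_h4 > cutoff

# Made-up numbers for one hypothetical colocalising SNP.
pp_h4 = 0.92
if colocalised(pp_h4):
    est, se = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.05)
    odds_ratio = math.exp(est)
    lower, upper = math.exp(est - 1.96 * se), math.exp(est + 1.96 * se)
    print(f"OR per SD = {odds_ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the estimate is a ratio, a weak SNP-protein effect inflates both the estimate's magnitude and its standard error, which is one reason strong instruments are preferred.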
Figure 6. Regional association plots arranged to mirror the genetic associations of the colocalising proteins (FAS, ABO, and OAS1) with their respective COVID-19 phenotypes.
The top panels represent genetic associations of the selected COVID-19 phenotypes, and the bottom panels represent genetic associations of the protein from the Sun et al. dataset. The x-axis in each panel represents the genomic locations in or around the genes encoding FAS, ABO, and OAS1. The y-axis in each panel represents the p-value of the genetic associations.
