PLoS One. 2018 May 24;13(5):e0197595.
doi: 10.1371/journal.pone.0197595. eCollection 2018.

Beyond degree and betweenness centrality: Alternative topological measures to predict viral targets

Prajwal Devkota et al. PLoS One.

Abstract

The availability of large-scale screens of host-virus interaction interfaces has enabled the topological analysis of viral protein targets in the host. In particular, host proteins that bind viral proteins are generally hubs and proteins with high betweenness centrality. Recently, other topological measures have been introduced that capture properties a virus may exploit to infect a host cell. We utilized experimentally determined sets of human protein targets of Herpes, Hepatitis, HIV and Influenza and pooled molecular interactions between proteins from different pathway databases. Apart from a protein's degree and betweenness centrality, we considered its pathway participation, its ability to topologically control a network and its protein PageRank index. We found that proteins with increasing values of such measures tend to accumulate viral targets and that these measures distinguish viral targets from non-targets. Furthermore, all such topological measures strongly correlate with the occurrence of a given protein in different pathways. Building a random forest classifier based on such topological measures, we found that the protein PageRank index had the highest impact on the classification of viral targets and non-targets, while a protein's ability to topologically control an interaction network played the least important role.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Enrichment of human targets of Hepatitis, Herpes, HIV, Influenza and other viruses as a function of pathway participation.
In (A), we quantified the overlap between sets of viral targets of Hepatitis, Herpes, HIV, Influenza and other viruses using the corresponding Jaccard indices. We observed that targets of Herpes, Influenza and other viruses overlapped strongly. In (B), we determined the occurrence of proteins in different pathways. The cumulative frequency distribution thus obtained featured a heavy tail, indicating that a small minority of proteins appeared in a large number of pathways while most proteins appeared in only a few. In (C), we determined the enrichment of viral targets as a function of the targeted protein's occurrence in different pathways by randomly sampling viral target sets 10,000 times. We found that targets were increasingly enriched among proteins that appeared in a growing number of pathways.
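The overlap measure in panel (A) can be illustrated with a minimal sketch. The Jaccard index of two sets is the size of their intersection divided by the size of their union. The protein identifiers and set contents below are hypothetical placeholders, not data from the study, and `pairwise_jaccard` is an illustrative helper name.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B| of two sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(sets_by_virus):
    """Jaccard index for every unordered pair of named sets."""
    names = sorted(sets_by_virus)
    return {
        (u, v): jaccard(sets_by_virus[u], sets_by_virus[v])
        for u, v in combinations(names, 2)
    }


# Hypothetical toy target sets (placeholder protein names, not real data).
targets = {
    "Herpes": {"P1", "P2", "P3", "P4"},
    "Influenza": {"P2", "P3", "P4", "P5"},
    "HIV": {"P1", "P6"},
}

overlaps = pairwise_jaccard(targets)
for pair, value in overlaps.items():
    print(pair, round(value, 2))
```

In this toy input the Herpes and Influenza sets share three of five distinct proteins, so their index is the largest of the three pairs.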
Fig 2. Enrichment of viral targets in sets of hubs and bottleneck nodes.
(A) Defining the top 20% of most connected proteins as hubs, we determined the enrichment of targets of Hepatitis, Herpes, HIV, Influenza and other viruses in such sets. Randomly sampling sets of targeted proteins 10,000 times, we observed that targets were significantly enriched in the set of hubs and vice versa (P < 10⁻⁴). In (B), we defined the top 20% of proteins with the highest betweenness as bottleneck nodes. Randomly sampling sets of targeted proteins 10,000 times, we found that bottleneck proteins were preferentially targeted by viruses, while the opposite held for non-bottleneck proteins (P < 10⁻⁴).
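A random-sampling enrichment test of this kind might be sketched as follows. The caption does not spell out the exact statistic, so this sketch assumes one common choice: the observed overlap between targets and hubs divided by the mean overlap of equally sized random protein sets, with an empirical one-sided P-value. The degree values, target set and the function names `top_fraction` and `enrichment` are hypothetical.

```python
import random


def top_fraction(scores, fraction=0.2):
    """Names of the top `fraction` of proteins when ranked by score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, round(fraction * len(ranked)))
    return set(ranked[:k])


def enrichment(targets, group, universe, n_samples=10_000, seed=0):
    """Observed/expected overlap of `targets` with `group`, plus an
    empirical P-value from random target sets of the same size."""
    rng = random.Random(seed)
    universe = sorted(universe)
    observed = len(targets & group)
    null_counts = [
        len(set(rng.sample(universe, len(targets))) & group)
        for _ in range(n_samples)
    ]
    expected = sum(null_counts) / n_samples
    # Add-one correction so the empirical P-value is never exactly zero.
    p_value = (1 + sum(c >= observed for c in null_counts)) / (n_samples + 1)
    ratio = observed / expected if expected > 0 else float("inf")
    return ratio, p_value


# Hypothetical toy network: 100 proteins whose degree grows with the index.
degree = {f"p{i}": i + 1 for i in range(100)}
hubs = top_fraction(degree, fraction=0.2)

# Hypothetical target set: eight hubs and two low-degree proteins.
targets = {f"p{i}" for i in range(92, 100)} | {"p3", "p10"}

ratio, p_value = enrichment(targets, hubs, degree, n_samples=2000)
print(f"enrichment = {ratio:.2f}, empirical P = {p_value:.4f}")
```

The same function applies to bottleneck nodes by passing betweenness values to `top_fraction` instead of degrees.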
Fig 3. Network controllers and proteins with high protein PageRank are enriched with viral targets.
In (A), we determined indispensable, neutral and dispensable proteins in the underlying protein interaction network. Randomizing such sets 10,000 times, we observed that proteins that are indispensable for the control of the underlying network preferentially occurred in an increasing number of pathways. In turn, we found the opposite for dispensable proteins. (B) Randomizing sets of proteins that are targeted by Hepatitis, Herpes, HIV, Influenza and other viruses 10,000 times, we observed that indispensable proteins were preferentially targeted by viruses (P < 10⁻⁴), while the opposite held for dispensable nodes. (C) Randomizing the set of top PageRank proteins, we determined their enrichment in sets of indispensable, neutral and dispensable proteins. We observed that indispensable and neutral nodes significantly accumulated top PageRank proteins. In the inset, we observed that proteins in an increasing number of pathways were enriched with top PageRank proteins. (D) Randomizing sets of targets of Hepatitis, Herpes, HIV, Influenza and other viruses 10,000 times, we observed that proteins with the highest protein PageRank were significantly targeted (P < 10⁻⁴).
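For readers unfamiliar with PageRank, the following is a minimal sketch of the standard algorithm computed by power iteration on an undirected graph. It is not necessarily the exact protein PageRank index used in the study. The damping factor of 0.85, the toy adjacency and the function name `pagerank` are assumptions for illustration.

```python
def pagerank(adjacency, damping=0.85, tol=1e-10, max_iter=200):
    """Standard PageRank by power iteration.

    `adjacency` maps each node to the set of its neighbours; an undirected
    edge must be listed in both directions.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without neighbours is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not adjacency[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if adjacency[v]:
                share = damping * rank[v] / len(adjacency[v])
                for w in adjacency[v]:
                    new[w] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy interaction network: A is the best-connected protein.
adjacency = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

rank = pagerank(adjacency)
top = max(rank, key=rank.get)
print(top, {v: round(r, 3) for v, r in rank.items()})
```

Ranking proteins by this score and taking a top fraction would give a "top PageRank" set that can be tested for enrichment in the same way as hubs.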
Fig 4. Prediction of viral targets.
(A) The heatmap indicated Pearson correlation values between the distributions of degree, betweenness centrality, number of pathways a protein is involved in, protein PageRank index and indispensability of a protein. Notably, degree, protein PageRank, betweenness centrality and appearance in pathways were most strongly correlated, while indispensability of proteins showed the lowest levels of correlation with other topological measures. (B) Considering target sets of Hepatitis, Herpes, HIV, Influenza and other viruses, we randomly sampled sets of non-targeted proteins of equal size. Determining the area under the ROC curves (AUC), we observed that protein PageRank index and pathway participation of a protein allowed the most accurate classification of targets and non-targets. (C) As a corollary, we utilized all five topological measures to predict viral targets using a random forest. We found that protein PageRank had the highest impact on the classification process, a result that was independent of the underlying virus. In (D), we randomly sampled sets of non-targeted proteins 1,000 times that were equal in size to the set of HIV targets and determined the area under the ROC curve (AUC) of the classification process with a random forest. In particular, we predicted whether a protein was targeted as a function of the three most important (protein PageRank index, degree and pathway appearance) and the three least important topological features (betweenness centrality, pathway appearance, control). Notably, the distributions of AUC values thus obtained differed significantly (Student's t-test, P < 10⁻²⁰), suggesting that the most important features allowed a significantly better classification result.
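The balanced sampling in panels (B) and (D) can be sketched for a single topological feature. This sketch does not reproduce the random forest. It only shows how an AUC can be computed from one score by drawing non-target sets equal in size to the target set. The AUC here is the probability that a randomly chosen target scores higher than a randomly chosen non-target, with ties counted as one half. All scores, protein names and function names are hypothetical.

```python
import random


def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def balanced_auc(scores, targets, n_rounds=100, seed=0):
    """Mean AUC over repeated draws of non-targets equal in number to targets."""
    rng = random.Random(seed)
    non_targets = sorted(set(scores) - set(targets))
    pos = [scores[t] for t in sorted(targets)]
    values = []
    for _ in range(n_rounds):
        sampled = rng.sample(non_targets, len(pos))
        values.append(auc(pos, [scores[s] for s in sampled]))
    return sum(values) / len(values)


# Hypothetical feature values (e.g. a PageRank-like score) for 50 proteins.
scores = {f"p{i}": float(i) for i in range(50)}
# Hypothetical targets: the ten highest-scoring proteins.
targets = {f"p{i}" for i in range(40, 50)}

mean_auc = balanced_auc(scores, targets)
print(f"mean AUC over balanced samples = {mean_auc:.2f}")
```

Collecting the per-round values for two different feature sets, rather than only their mean, would give the two AUC distributions that a t-test could compare.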
