2022 May 31:13:868015.
doi: 10.3389/fgene.2022.868015. eCollection 2022.

StarGazer: A Hybrid Intelligence Platform for Drug Target Prioritization and Digital Drug Repositioning Using Streamlit

Chiyun Lee et al. Front Genet.

Abstract

Target prioritization is essential for drug discovery and repositioning. Applying computational methods to analyze and process multi-omics data to find new drug targets is a practical approach for achieving this. Despite an increasing number of methods for generating datasets such as genomics, phenomics, and proteomics, attempts to integrate and mine such datasets remain limited in scope. Developing hybrid intelligence solutions that combine human intelligence in the scientific domain and disease biology with the ability to mine multiple databases simultaneously may help augment drug target discovery and identify novel drug-indication associations. We believe that integrating different data sources using a singular numerical scoring system in a hybrid intelligent framework could help to bridge these different omics layers and facilitate rapid drug target prioritization for studies in drug discovery, development or repositioning. Herein, we describe our prototype of the StarGazer pipeline which combines multi-source, multi-omics data with a novel target prioritization scoring system in an interactive Python-based Streamlit dashboard. StarGazer displays target prioritization scores for genes associated with 1844 phenotypic traits, and is available via https://github.com/AstraZeneca/StarGazer.

Keywords: data integration; drug discovery; hybrid intelligence; multi-omics; repositioning; stargazer; streamlit; target prioritization.

Conflict of interest statement

Authors CL, AP, VG, RNH, EP, AF, SP, WY, MH, A-SS, KT, BS, TC, JM, FMK, and KS are or were employed by AstraZeneca. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
The StarGazer drug target prioritization framework considers the following five features for each of the 1844 diseases in StarGazer’s disease list: (1) the odds ratios of association between targets and phenotypic variants of interest from GWAS and PheWAS data; (2) the target-disease association scores from Open Targets; (3) the druggability data of genes of interest from Pharos; (4) the degree of nodes in protein-protein interaction networks of genes of interest from STRING; and (5) the presence of the gene variant of interest in both PheWAS and GWAS datasets. All data, except the PheWAS and GWAS data, are loaded in real time by API calls and therefore present the latest evidence for drug repositioning strategies. These five features are then integrated into a single numerical StarGazer score that quantifies the drug repositioning potential of a gene. StarGazer is built on the Python-based Streamlit platform, which is widely used for building web applications for machine learning and data science.
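The integration of five features into one score can be illustrated with a minimal sketch. The caption does not give the formula, so the approach below (min-max scaling each feature across genes, then taking an unweighted mean) is purely an assumption, as are the feature names, the ordinal druggability encoding, and the example values.

```python
from statistics import mean

# Hypothetical feature names; the source lists the five features but not their encoding.
FEATURES = ("odds_ratio", "open_targets", "druggability", "ppi_degree", "in_both")


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_scores(genes):
    """Return gene -> mean of its min-max scaled features (an assumed formula)."""
    names = list(genes)
    scaled = {name: [] for name in names}
    for feature in FEATURES:
        column = min_max([genes[name][feature] for name in names])
        for name, value in zip(names, column):
            scaled[name].append(value)
    return {name: mean(values) for name, values in scaled.items()}


# Made-up example data; druggability is encoded as an ordinal level (assumption).
example = {
    "GENE_A": {"odds_ratio": 1.8, "open_targets": 0.9, "druggability": 3, "ppi_degree": 12, "in_both": 1},
    "GENE_B": {"odds_ratio": 1.1, "open_targets": 0.2, "druggability": 0, "ppi_degree": 2, "in_both": 0},
    "GENE_C": {"odds_ratio": 1.4, "open_targets": 0.5, "druggability": 2, "ppi_degree": 7, "in_both": 0},
}

scores = composite_scores(example)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Scaling before averaging keeps a feature with a large numeric range, such as node degree, from dominating the result; a weighted mean would be an equally plausible choice.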
FIGURE 2
The StarGazer interface after searching “HLA-G” in Gene mode. At p = 0.05, the first allele returned is rs11206510. The color-coded bar chart shows the odds ratio of association of the allele with each phenotype. The table on the right shows the same data in tabular form and can be downloaded as a CSV file. The StarGazer Variant mode is similar in appearance.
FIGURE 3
The StarGazer interface after searching “Multiple sclerosis” in PheWAS mode. At p = 0.05, 7.37% of genes associated with multiple sclerosis were categorized as Tclin, i.e., already targets of FDA-approved drugs. The distribution of genes across druggability levels is shown by a pie chart and a scatter plot, the latter of which also shows the odds ratios of each allele of each gene. Some gene names are not shown. These data are also re-analyzed to show only risk alleles or only protective alleles. Tabulated data can be visualized and downloaded. The StarGazer modes GWAS and GWAS-PheWAS Union are similar in appearance.
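The percentage of genes per druggability level, as plotted in the pie chart, can be computed along the following lines. This is a sketch under stated assumptions: the function name, the input shape (gene name mapped to level), and the example genes are hypothetical, and only the five level labels come from the captions.

```python
from collections import Counter

# Druggability levels named in the captions, from most to least druggable.
LEVELS = ("Tclin", "Tchem", "Tbio", "Tdark", "None")


def druggability_share(gene_levels):
    """Return level -> percentage of genes at that level.

    Genes whose level is not in LEVELS still count toward the total
    but are not reported (a simplification for this sketch).
    """
    if not gene_levels:
        return {level: 0.0 for level in LEVELS}
    counts = Counter(gene_levels.values())
    total = len(gene_levels)
    return {level: 100.0 * counts[level] / total for level in LEVELS}


# Made-up example: four genes with assigned levels.
example_levels = {
    "GENE_A": "Tclin",
    "GENE_B": "Tchem",
    "GENE_C": "Tbio",
    "GENE_D": "Tbio",
}
shares = druggability_share(example_levels)
```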
FIGURE 4
The StarGazer interface after searching “Type 2 diabetes” in GWAS-PheWAS Intersection mode. At p = 0.05, 23 SNPs were identified as having associations in both PheWASs and GWASs. Top left: pie chart displaying the proportion of SNPs that were identified in either PheWAS or GWAS datasets, or in both datasets. Top right: pie chart displaying druggability information for the genes of these SNPs. Tclin (red) indicates genes that are already targeted by drugs available on the market, whilst Tchem, Tbio, Tdark, and None indicate progressively decreasing levels of druggability. Bottom left: scatter plot highlighting individually reported odds ratios of associations of SNPs from various GWASs. Bottom right: a protein-protein interaction network constructed from the genes of alleles detected in both GWASs and the PheWAS catalog. The gene ontology enrichment analysis feature is not shown in the figure.
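The split of SNPs into "GWAS only", "PheWAS only", and "both" can be sketched with plain set operations. The function names, the input shape (SNP identifier mapped to p-value), the strict `<` comparison against the threshold, and the example identifiers are all assumptions made for illustration.

```python
def significant(associations, p_threshold=0.05):
    """Return the SNP ids whose p-value is below the threshold (strictness assumed)."""
    return {snp for snp, p_value in associations.items() if p_value < p_threshold}


def split_by_source(gwas, phewas, p_threshold=0.05):
    """Partition significant SNPs by the dataset(s) in which they appear."""
    gwas_hits = significant(gwas, p_threshold)
    phewas_hits = significant(phewas, p_threshold)
    return {
        "both": gwas_hits & phewas_hits,         # intersection mode
        "gwas_only": gwas_hits - phewas_hits,
        "phewas_only": phewas_hits - gwas_hits,
    }


# Made-up SNP ids and p-values.
gwas_example = {"rs0001": 0.001, "rs0002": 0.04, "rs0003": 0.20}
phewas_example = {"rs0002": 0.01, "rs0003": 0.03, "rs0004": 0.002}

groups = split_by_source(gwas_example, phewas_example)
union = groups["both"] | groups["gwas_only"] | groups["phewas_only"]
```

The `union` set corresponds to what a GWAS-PheWAS Union view would contain, and `groups["both"]` to the Intersection view.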
FIGURE 5
The StarGazer interface after searching “ASCVD” in Protein-protein interaction mode. Protein-protein interaction networks are shown for all alleles, risk alleles, and protective alleles. The node degree of the genes of these alleles is computed, and the results of gene ontology enrichment analysis are shown on the right.
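Node degree is the number of distinct interaction partners a gene has in the network. A minimal sketch, assuming the network is available as a list of undirected gene pairs (the edge list and gene names below are made up, and nothing here reflects how StarGazer retrieves data from STRING):

```python
from collections import defaultdict


def node_degrees(edges):
    """Return gene -> number of distinct neighbours in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


# Made-up interactions; the third pair duplicates the first in reverse order.
example_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_B", "GENE_A"),
    ("GENE_C", "GENE_D"),
]
degrees = node_degrees(example_edges)
```

Storing neighbours in sets means duplicate or reversed pairs are counted once.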
FIGURE 6
The StarGazer interface after searching “Breast cancer” in Disease Target Prioritization mode. At p = 0.05, 140 genes are returned as associated with breast cancer. Genes are ranked by StarGazer score, which describes how suitable a gene is for drug repositioning. The subsequent five columns are the individual scores of the five features extracted from all of the data that contribute to the StarGazer score. Data are separated into all alleles, risk alleles, then protective alleles, and can be downloaded as CSV files.
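The ranking, the separation into risk and protective alleles, and the CSV export can be sketched as follows. The sketch assumes the conventional reading that an odds ratio above 1 marks a risk allele and below 1 a protective allele; the row layout (one allele per gene), column names, and values are hypothetical.

```python
import csv
import io

COLUMNS = ["gene", "odds_ratio", "score"]  # hypothetical column names


def split_by_effect(rows):
    """Group rows into all, risk (OR > 1), and protective (OR < 1) alleles."""
    return {
        "all": list(rows),
        "risk": [r for r in rows if r["odds_ratio"] > 1.0],
        "protective": [r for r in rows if r["odds_ratio"] < 1.0],
    }


def rank_by_score(rows):
    """Sort rows by score, highest first."""
    return sorted(rows, key=lambda r: r["score"], reverse=True)


def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up rows, one allele per gene for simplicity.
example_rows = [
    {"gene": "GENE_A", "odds_ratio": 1.6, "score": 0.4},
    {"gene": "GENE_B", "odds_ratio": 0.7, "score": 0.9},
    {"gene": "GENE_C", "odds_ratio": 1.2, "score": 0.6},
]

tables = {name: rank_by_score(group) for name, group in split_by_effect(example_rows).items()}
protective_csv = to_csv(tables["protective"])
```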
