Curr Protoc Bioinformatics. 2020 Mar;69(1):e92.
doi: 10.1002/cpbi.92.

How to Illuminate the Druggable Genome Using Pharos

Timothy Sheils et al. Curr Protoc Bioinformatics. 2020 Mar.

Abstract

Pharos is an integrated web-based informatics platform for the analysis of data aggregated by the Illuminating the Druggable Genome (IDG) Knowledge Management Center, an NIH Common Fund initiative. The current version of Pharos (as of October 2019) spans 20,244 proteins in the human proteome, 19,880 disease and phenotype associations, and 226,829 ChEMBL compounds. This resource not only collates and analyzes data from over 60 high-quality resources to generate these types, but also uses text indexing to find less apparent connections between targets, and has recently begun to collaborate with institutions that generate data and resources. Proteins are ranked according to a knowledge-based classification system, which can help researchers to identify less studied "dark" targets that could be potentially further illuminated. This is an important process for both drug discovery and target validation, as more knowledge can accelerate target identification, and previously understudied proteins can serve as novel targets in drug discovery. Two basic protocols illustrate the levels of detail available for targets and several methods of finding targets of interest. An Alternate Protocol illustrates the difference in available knowledge between less and more studied targets. © 2020 by John Wiley & Sons, Inc.

Basic Protocol 1: Search for a target and view details
Alternate Protocol: Search for dark target and view details
Basic Protocol 2: Filter a target list to get refined results

Keywords: bioinformatics; dark genome; disease; drug discovery; drug targets; phenotype; proteins; target validation.

Figures

Figure 1: Main search page for Pharos, with autocomplete functionality visible.
Figure 2: Primary Search results/browse page layout. Main results are in a pageable table (A). The left-hand column (B) contains multiple fields to filter on, similar to an e-commerce site. The donut chart on the top half of the screen (C) also shows a proportional breakdown of the filterable properties and is also interactive.
Figure 3: Brief metadata is available for each target, which includes several identifiers such as: Target development level, TDL (Oprea, Bologa, et al., 2018), target family, computed target novelty (TIN-X) score (Cannon et al., 2017), fractional publication count (Pletscher-Frankild, Palleja, Tsafou, Binder, & Jensen, 2015), available antibodies (from antibodypedia.com), listed protein-protein interactions (Fabregat et al., 2016; Huttlin et al., 2017; Szklarczyk et al., 2019) and knowledge availability (based on Harmonizome (Rouillard et al., 2016)).
Figure 4: Relevant diseases and ligands are displayed in separate pageable lists.
Figure 5: Target details view. The density of sections is dependent on the data available. The left side column (A) acts as section navigation and allows the user to quickly jump to areas of interest.
Figure 6: Target Summary overview with protein and gene identifiers, illumination graph and knowledge table.
Figure 7: Expanded view of the illumination graph.
Figure 8: Development Level Summary shows previous development milestones reached, as well as progress towards incomplete milestones. CDK13 is a Tchem target, which means that multiple active ligands have been discovered, but no approved drugs as of yet. It has also been fairly well published about, both in text-mined PubMed literature reviews, and GeneRIF annotations (Jimeno-Yepes, Sticco, Mork, & Aronson, 2013). Its molecular function (from GO Gene Ontology (Ashburner et al., 2000)) is also fairly well known.
Figure 9: Resources available from research funded by the IDG program
Figure 10: Active Ligands section.
Figure 11: Collapsed disease associations view
Figure 12: Shown is one of several available line charts that show the frequency of publication for a target.
Figure 13: Common target properties are shown, and a link to a list of common targets.
Figure 14: List of cyclin-dependent protein serine-threonine kinase activity targets as annotated by their GO Function.
Figure 15: Shows the same list of 30 targets from Figure 14, this time sorted by knowledge availability (A).
Figure 16: The same list as Figure 14, this time filtered by “Tdark”, leaving 1 target. When more targets are available, it is possible to combine filter values to refine large lists.
Figure 17: Protein Summary panel of CDKL4, an understudied target.
Figure 18: IDG Development Level Summary of CDKL4, a dark target.
Figure 19: Sparse publication information of a dark target
Figure 20: Navigation bar header as seen on the Pharos home page. Subsequent pages within Pharos will lack the background image.
Figure 21: Main target browse page
Figure 22: Expanded filter category panel.
Figure 23: Refined category filter list.
Figure 24: Target list reduced from 20244 to 492 targets.
Figure 25: Select “GPCR” from target family to further reduce the list.
Figure 26: The donut chart above the target list can also be used to filter results.
Figure 27: Final list of 5 GPCR targets with “breast cancer” as a GWAS trait that are expressed in female tissues.
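
The filtering and sorting steps described in the captions (filter a target list by development level, combine filter values, sort by knowledge availability) can be illustrated with a small in-memory sketch. This is not the Pharos API or its implementation: the `Target` record, the `filter_targets` and `facet_counts` helpers, the `EXAMPLE1` entry, and all knowledge scores are hypothetical and exist only for illustration. Only the development levels of CDK13 (Tchem) and CDKL4 (Tdark) come from the captions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical target record; field names are assumptions, not Pharos fields."""
    symbol: str
    tdl: str          # target development level, e.g. "Tchem" or "Tdark"
    family: str       # target family, e.g. "Kinase" or "GPCR"
    knowledge: float  # made-up knowledge-availability score


def filter_targets(targets, **facets):
    """Keep targets whose attributes equal every requested facet value."""
    return [
        t for t in targets
        if all(getattr(t, name) == value for name, value in facets.items())
    ]


def facet_counts(targets, facet):
    """Count targets per facet value, the kind of breakdown a donut chart shows."""
    counts = {}
    for t in targets:
        key = getattr(t, facet)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Illustrative data: scores and the EXAMPLE1 entry are invented.
TARGETS = [
    Target("CDK13", "Tchem", "Kinase", 0.8),
    Target("CDKL4", "Tdark", "Kinase", 0.1),
    Target("EXAMPLE1", "Tbio", "GPCR", 0.5),
]

# Combine facet values to refine the list, then sort by knowledge availability.
dark_kinases = filter_targets(TARGETS, tdl="Tdark", family="Kinase")
ranked = sorted(TARGETS, key=lambda t: t.knowledge, reverse=True)
```

Each keyword argument to `filter_targets` narrows the list further, which mirrors how selecting additional filter values in the left-hand column reduces the result table.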

References

    1. Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, … Yeh LS (2004). UniProt: the Universal Protein knowledgebase. Nucleic Acids Res, 32(Database issue), D115–119. doi:10.1093/nar/gkh131
    2. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, … Sherlock G (2000). Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 25(1), 25–29. doi:10.1038/75556
    3. Cannon DC, Yang JJ, Mathias SL, Ursu O, Mani S, Waller A, … Oprea TI (2017). TIN-X: target importance and novelty explorer. Bioinformatics, 33(16), 2601–2603. doi:10.1093/bioinformatics/btx200
    4. Edwards AM, Isserlin R, Bader GD, Frye SV, Willson TM, & Yu FH (2011). Too many roads not taken. Nature, 470(7333), 163–165. doi:10.1038/470163a
    5. Fabregat A, Sidiropoulos K, Garapati P, Gillespie M, Hausmann K, Haw R, … D’Eustachio P (2016). The Reactome pathway Knowledgebase. Nucleic Acids Res, 44(D1), D481–487. doi:10.1093/nar/gkv1351
