PeerJ. 2024 Jun 25:12:e17470.
doi: 10.7717/peerj.17470. eCollection 2024.

TIN-X version 3: update with expanded dataset and modernized architecture for enhanced illumination of understudied targets

Vincent T Metzger et al. PeerJ. 2024.

Abstract

TIN-X (Target Importance and Novelty eXplorer) is an interactive visualization tool for illuminating associations between diseases and potential drug targets and is publicly available at newdrugtargets.org. TIN-X uses natural language processing to identify disease and protein mentions within PubMed content using previously published tools for named entity recognition (NER) of gene/protein and disease names. Target data is obtained from the Target Central Resource Database (TCRD). Two important metrics, novelty and importance, are computed from this data and, when plotted as log(importance) vs. log(novelty), aid the user in visually exploring the novelty of drug targets and their associated importance to diseases. TIN-X Version 3.0 has been significantly improved with an expanded dataset, modernized architecture including a REST API, and an improved user interface (UI). The dataset has been expanded to include not only PubMed publication titles and abstracts, but also full-text articles when available. This results in approximately 9-fold more target/disease associations compared to previous versions of TIN-X. Additionally, the TIN-X database containing this expanded dataset is now hosted in the cloud via Amazon RDS. Recent enhancements to the UI focus on making it more intuitive for users to find diseases or drug targets of interest while providing a new, sortable table-view mode to accompany the existing plot-view mode. UI improvements also help the user browse the associated PubMed publications to explore and understand the basis of TIN-X's predicted association between a specific disease and a target of interest. While implementing these upgrades, computational resources are balanced between the webserver and the user's web browser to achieve adequate performance while accommodating the expanded dataset.
Together, these advances aim to extend the duration that users can benefit from TIN-X while providing both an expanded dataset and new features that researchers can use to better illuminate understudied proteins.
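
The abstract names the two metrics but does not give their formulas. As a rough illustration only, the sketch below assumes a fractional-counting scheme: each publication splits one unit of credit evenly among the targets it mentions (and, for importance, among its target/disease pairs), and novelty is the reciprocal of a target's accumulated credit. The function names, the input shape, and the choice of base-10 logarithms are all assumptions made for this example, not details confirmed by the text.

```python
import math
from collections import defaultdict


def novelty_and_importance(papers):
    """Compute assumed fractional-count metrics.

    papers: iterable of (targets, diseases) pairs, one per publication,
            where each element is a set of entity names found by NER.
    """
    target_weight = defaultdict(float)  # fractional paper count per target
    importance = defaultdict(float)     # fractional co-mentions per (target, disease)
    for targets, diseases in papers:
        for t in targets:
            # Assumption: a paper's credit is shared equally among its targets.
            target_weight[t] += 1.0 / len(targets)
            for d in diseases:
                # Assumption: co-mention credit is shared among all pairs.
                importance[(t, d)] += 1.0 / (len(targets) * len(diseases))
    # Assumption: heavily published targets are less novel.
    novelty = {t: 1.0 / w for t, w in target_weight.items()}
    return novelty, dict(importance)


def plot_point(novelty, importance, target, disease):
    """Return (log novelty, log importance) for one target/disease pair."""
    return (math.log10(novelty[target]),
            math.log10(importance[(target, disease)]))


papers = [({"A", "B"}, {"d1"}), ({"A"}, {"d1", "d2"})]
novelty, importance = novelty_and_importance(papers)
```

Under these assumptions, a target mentioned alone in many papers accumulates a large weight and therefore a low novelty, while a target that appears only alongside many others stays comparatively novel.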

Keywords: Disease ontology; Diseases; Drug targets; Illuminating the druggable genome (IDG); Named entity recognition; Proteins; T_dark; Target development level (TDL); Text mining; Understudied protein.

Conflict of interest statement

Daniel C. Cannon is employed by Elevato Digital.

Figures

Figure 1. TIN-X informatics workflow.
Schematic showing data sources and the flow of information within the TIN-X informatics workflow. Users access the TIN-X user interface (UI) via a web browser. Unlike earlier versions of TIN-X, the UI and the REST API are separate components. User activity on the TIN-X public web application results in API requests which in turn access the TIN-X database. Users can query the data directly via the REST API, which is supported by Swagger documentation. The TIN-X Database has been upgraded from an instance of MySQL to Amazon RDS. Together, the TIN-X UI, API, and database are all hosted in the cloud using Amazon Web Services. Importantly, the TIN-X Database relies on TCRD for target data and the JensenLab DISEASES resource for text-mined PubMed content. Blue arrows depict the flow of data from these two major sources to the TIN-X Database. Several specific web technologies are highlighted adjacent to each major component of the TIN-X application.
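
Because the caption notes that the data can be queried directly through the REST API, a client might be structured roughly as follows. This is a sketch under stated assumptions: the host `api.example.org`, the `diseases/` resource, the `search` and `limit` parameters, and the `results`/`next` response fields are all hypothetical placeholders, not the documented TIN-X API. The Swagger documentation mentioned in the caption is the place to look up the real endpoints. The code only builds URLs and parses a response body, so it makes no network calls.

```python
import json
from urllib.parse import urlencode, urljoin

# Placeholder host: NOT the real TIN-X API address.
BASE_URL = "https://api.example.org/"


def build_request_url(resource, **params):
    """Build a GET URL for a (hypothetical) resource with query parameters."""
    query = urlencode(sorted(params.items()))  # sorted for a stable URL
    url = urljoin(BASE_URL, resource)
    return f"{url}?{query}" if query else url


def parse_page(body):
    """Split an assumed paginated JSON payload into (rows, next_page_url)."""
    payload = json.loads(body)
    return payload["results"], payload.get("next")


url = build_request_url("diseases/", search="sepsis", limit=10)
rows, next_url = parse_page('{"results": [{"id": 1}], "next": null}')
```

Keeping URL construction and response parsing in separate functions mirrors the separation of UI and API described in the caption: any client, including the web UI, would go through the same request/response cycle.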
Figure 2. Using TIN-X 3.0: browsing by disease or by target.
(A) Users can Browse Diseases by exploring the nested drop-down menus on the left or by typing the name of the disease of interest into the search/filter field in the top left. In the main plot area, each point is a Target associated with the disease selected in this example (bacterial infectious disease). These points are colored based on Target Development Level (TDL), and the shape of each point indicates the IDG Family, as indicated on the legend at the bottom. The next screenshot shows a user searching for a specific Target among the many that exist. The autocomplete suggests various Toll-like receptor targets, which, if selected, highlight the corresponding point on the plot. In the upper-right, Filters can be applied to add or remove points from the plot based on TDL category and/or IDG Family. (B) In Browse Targets mode, users can search for a target of interest by using the autocomplete search or by browsing through the menus. The main plot area depicts each associated disease and is colored based on the category of disease. The next screenshots on the right show the use of the autocomplete search for diseases among the results shown in the plot. Hovering over a point reveals details about the disease of interest, in this example pulmonary fibrosis. The slider on the far right of this panel appears in the upper right portion of the scatterplot view. Manipulating the slider increases or decreases the number of associated diseases that are shown. By default, only the 300 most-interesting articles (according to NDS rankings) are displayed; however, the user is free to adjust the number of results plotted using this slider.
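
The filter and slider behavior described in this caption can be sketched as a small selection function. This is an illustrative assumption about how such logic might look, not the application's actual code: the `TargetPoint` record, its field names, the convention that a lower `nds_rank` means more interesting, and the choice to filter before applying the limit are all hypothetical. Only the default limit of 300 comes from the caption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetPoint:
    """Hypothetical record for one point on the scatterplot."""
    name: str
    tdl: str         # Target Development Level category, e.g. "Tdark"
    idg_family: str  # IDG family label
    nds_rank: int    # assumed: lower value = more interesting


def visible_points(points, tdl_levels=None, idg_families=None, limit=300):
    """Apply TDL / IDG-family filters, then keep the top `limit` by rank."""
    kept = [
        p for p in points
        if (tdl_levels is None or p.tdl in tdl_levels)
        and (idg_families is None or p.idg_family in idg_families)
    ]
    kept.sort(key=lambda p: p.nds_rank)
    return kept[:limit]  # the slider would change `limit`


points = [
    TargetPoint("T1", "Tclin", "Kinase", 3),
    TargetPoint("T2", "Tdark", "GPCR", 1),
    TargetPoint("T3", "Tdark", "Kinase", 2),
]
dark_only = visible_points(points, tdl_levels={"Tdark"})
top_one = visible_points(points, limit=1)
```

Filtering before truncating means the slider always yields up to `limit` points from the filtered set; applying the limit first would be an equally plausible design and would show fewer points once filters are active.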
Figure 3. The TIN-X UI now has a new table-view to browse targets or diseases.
(A) In addition to the plot-view, the new TIN-X UI now features a table view that contains all the same information as the corresponding plot view. Users can toggle back and forth between the table view and plot view. Since this is Browse Diseases mode, the table shows the associated targets. All of the fields in the table are sortable and when a user clicks on a row, the detailed view appears (lower left). The screenshot in the lower right shows how a user can explore TIN-X’s predicted association between the TLR-4 receptor and bacterial sepsis. Clicking on a publication row in the list shows the abstract and presents the user with an external link to access the article on PubMed. (B) This panel shows the new table-view feature within the Browse Targets mode, where associated diseases can be sorted ascending or descending by any of the fields. In both Browse Targets mode (shown) and Browse Diseases mode, the user can filter the table contents by beginning to type search/filter strings into the Search field, and all non-matching rows disappear. This feature is useful for finding one desired target or disease among the many results in the table. Like the Browse Diseases mode in Panel A, clicking on a row in Browse Targets mode reveals details about the target along with associated PubMed publications.
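
The table-view behavior in this caption (sorting by any field, and hiding rows that do not match the typed search string) can be sketched as two small functions. The row shape, field names, and the case-insensitive substring match are assumptions for illustration; the caption does not say how matching is implemented.

```python
def filter_rows(rows, query):
    """Keep rows where any field contains the query (case-insensitive)."""
    q = query.strip().lower()
    if not q:
        return list(rows)  # an empty search shows everything
    return [r for r in rows if any(q in str(v).lower() for v in r.values())]


def sort_rows(rows, field, descending=False):
    """Return rows ordered by one column, ascending or descending."""
    return sorted(rows, key=lambda r: r[field], reverse=descending)


# Hypothetical rows; the values are made up for the example.
rows = [
    {"disease": "pulmonary fibrosis", "importance": 2.5, "novelty": 0.01},
    {"disease": "bacterial sepsis", "importance": 7.0, "novelty": 0.03},
    {"disease": "hereditary ataxia", "importance": 1.2, "novelty": 0.20},
]
matches = filter_rows(rows, "SEPSIS")
by_importance = sort_rows(rows, "importance", descending=True)
```

Both functions return new lists rather than mutating the input, so the full result set remains available when the user clears the search field or changes the sort column.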
Figure 4. TIN-X integration with Pharos.
This screenshot shows TIN-X integration within Pharos. On the left, text-mined target-disease associations from TIN-X are shown in an interactive scatterplot next to the circular treemap. This example depicts the relationship between CACNA1A and hereditary ataxias. The circular treemap groups the associations based on the hierarchy defined by Disease Ontology. Selecting a circle (group of diseases) or a point (individual disease) in the right panel highlights corresponding points in the scatterplot on the left.
Figure 5. TIN-X Use-case: Parkinson’s disease and ATP10B.
(A) Targets associated with Parkinson’s disease (PD) are depicted in Table-View mode and are initially ordered by the NDS ranking of importance and novelty scores. The highlighted row corresponds to the understudied T_dark target ATP10B, which is the subject of further exploration in TIN-X. (B) Targets associated with PD are shown in Plot-view mode on a log(Importance) vs. log(Novelty) scatterplot, with colors indicating Target Development Level (TDL) and shapes corresponding to IDG families. Hovering over a target of interest, ATP10B, reveals further details about this target. (C) Clicking on the ATP10B datapoint in the TIN-X scatterplot or, alternatively, clicking the ATP10B row in the table brings up the detailed view of ATP10B and PD. The articles responsible for TIN-X’s predicted association between ATP10B and PD are displayed, with the article of interest highlighted.
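
The caption refers to an NDS ranking of importance and novelty scores without defining it. One plausible reading, assumed here and not confirmed by the text, is a non-dominated (Pareto-front) ranking in which both scores are to be maximized: points that no other point beats on both scores form front 1, the next layer forms front 2, and so on. The function names and input shape below are hypothetical.

```python
def dominates(a, b):
    """True if score pair `a` is at least as good as `b` on both axes
    and strictly better on at least one (both axes maximized)."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def nds_rank(scores):
    """Assign each name a Pareto-front number (1 = best).

    scores: dict mapping name -> (importance, novelty).
    """
    remaining = dict(scores)
    ranks = {}
    front = 1
    while remaining:
        # Points not dominated by any point still in play form this front.
        current = [
            name for name, s in remaining.items()
            if not any(dominates(other, s) for other in remaining.values())
        ]
        for name in current:
            ranks[name] = front
            del remaining[name]
        front += 1
    return ranks


# Made-up scores for illustration only.
scores = {"A": (5.0, 1.0), "B": (1.0, 5.0), "C": (1.0, 1.0), "D": (4.0, 0.5)}
ranks = nds_rank(scores)
```

A ranking of this kind avoids choosing a fixed weighting between importance and novelty, which would suit a tool meant to surface targets that are both relevant to a disease and understudied. The simple front-peeling loop is quadratic per front, which is adequate for a sketch but may need a faster algorithm for large result sets.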
