Sci Rep. 2022 May 12;12(1):7885.
doi: 10.1038/s41598-022-12093-9.

Hierarchical network analysis of co-occurring bioentities in literature

Heejung Yang et al. Sci Rep. 2022.

Abstract

Biomedical databases grow by more than a thousand new publications every day. This unprecedented publication rate makes it difficult to discover knowledge relevant to keywords of interest, gather new insights, and form hypotheses. PubTator, a text-mining tool, automatically annotates bioentities such as species, chemicals, genes, and diseases in PubMed abstracts and full-text articles. However, manually reorganizing and analyzing these bioentities is non-trivial and highly time-consuming. ChexMix was designed to extract the unique identifiers of bioentities from query results. Herein, ChexMix was used to construct a taxonomic tree of allied species among Korean native plants and to extract the Medical Subject Headings (MeSH) unique identifiers of the bioentities that co-occurred with the keywords in the same literature. ChexMix discovered allied species related to a keyword of interest, and its usefulness for multi-species analysis was demonstrated experimentally.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Network and hierarchical tree of biomedical entities using ChexMix.
Figure 2
(A) The process for recommending Korean native plants related to the query keyword using ChexMix. (B) Network obtained by entering ‘amentoflavone’ as the input keyword in ChexMix. The unique identifiers (TaxID, pale green nodes) of species co-occurring with the input keyword in the literature are linked to their higher taxonomic rank (genus, sky blue nodes). Orange nodes represent species that appear only in the list of Korean medicinal plants of the KPEB; each is linked to the node of the genus to which it belongs. (C) Detailed subnetwork under the genus Viburnum. Each node is labeled ‘ID: name’, giving the TaxID and the genus or species name. The networks were drawn with Gephi (ver. 0.9.2, https://gephi.org/).
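The genus-level linking described in panel B can be sketched roughly as follows. This is a minimal illustration, not ChexMix's actual code: the function name `recommend_allied_species` and the toy inputs are hypothetical, and species are keyed by name rather than by TaxID for readability.

```python
from collections import defaultdict


def recommend_allied_species(literature_species, native_plants):
    """Suggest native species that share a genus with literature species.

    literature_species: dict mapping species name -> genus, for species that
        co-occur with the keyword in the literature (hypothetical input).
    native_plants: dict mapping species name -> genus, for a candidate list
        such as a catalogue of native plants (hypothetical input).
    Returns a dict mapping genus -> sorted list of candidate species that are
    in the native list but not already in the literature set.
    """
    genera_in_literature = set(literature_species.values())
    recommended = defaultdict(list)
    for species, genus in native_plants.items():
        if genus in genera_in_literature and species not in literature_species:
            recommended[genus].append(species)
    return {genus: sorted(names) for genus, names in recommended.items()}


# Toy data for illustration only; not taken from the paper's results.
literature = {"Viburnum dilatatum": "Viburnum"}
native = {
    "Viburnum dilatatum": "Viburnum",
    "Viburnum erosum": "Viburnum",
    "Taxus cuspidata": "Taxus",
}
result = recommend_allied_species(literature, native)
```

With the toy data, only the candidate that shares a genus with a literature species and is not itself in the literature set is returned.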
Figure 3
(A) Chemical structure of amentoflavone. (B) Chromatograms of the five samples with the highest amentoflavone content determined as described in the “Methods” section. AMEN, amentoflavone; VCL, leaves of Viburnum carlesii; VDF, fruits of V. furcatum; VDSt, leaves of V. dilatatum; VEL, leaves of V. erosum; VESt, stems of V. erosum.
Figure 4
(A) Network obtained using ‘taxus cuspidata’ and ‘Podophyllum peltatum’ as input keywords in ChexMix. MeSH terms co-occurring in the literature with the input keywords were reorganized according to the hierarchy rules of the MeSH Tree Structures in the MeSH browser (https://meshb-prev.nlm.nih.gov/treeView). Nodes for bioentities that co-occurred with both keywords are colored orange. (B) Details of the subnetwork of bioentities that co-occurred with both keywords. Each node is labeled ‘Tree Number: MeSH Heading’, giving the MeSH identifier and the MeSH term. The networks were drawn with Gephi (ver. 0.9.2, https://gephi.org/).
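One way to picture the reorganization in panel A: MeSH tree numbers are dot-separated, so each term's parent can be found by dropping the last segment. The sketch below assumes that convention and nothing more; the helper names `tree_edges` and `shared_nodes` are hypothetical, and the tree numbers are placeholders rather than real MeSH assignments for either keyword.

```python
def tree_edges(tree_numbers):
    """Return (parent, child) edges implied by dot-separated tree numbers."""
    edges = set()
    for number in tree_numbers:
        parts = number.split(".")
        # Link every prefix to the next longer prefix, up to the full number.
        for i in range(1, len(parts)):
            parent = ".".join(parts[:i])
            child = ".".join(parts[: i + 1])
            edges.add((parent, child))
    return edges


def shared_nodes(first, second):
    """Return tree numbers present in both keyword result sets."""
    return set(first) & set(second)


# Placeholder tree numbers for two keywords; illustrative only.
keyword_a = ["A01.200.300", "A01.200.400"]
keyword_b = ["A01.200.300", "B02.500"]

edges_a = tree_edges(keyword_a)
common = shared_nodes(keyword_a, keyword_b)
```

The edge set could then be exported to a graph tool such as Gephi, with the nodes in `common` highlighted.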
