Review
BMC Bioinformatics. 2020 Oct 26;21(Suppl 5):250.
doi: 10.1186/s12859-020-3396-y

Literature mining for context-specific molecular relations using multimodal representations (COMMODAR)

Jaehyun Lee et al. BMC Bioinformatics.

Abstract

Biological contextual information helps explain phenomena that arise in biological systems composed of complex molecular relations. Constructing context-specific relational resources relies heavily on laborious manual extraction from unstructured literature. In this paper, we propose COMMODAR, a machine learning-based literature mining framework for context-specific molecular relations using multimodal representations. The main idea of COMMODAR is feature augmentation through the cooperation of multimodal representations for relation extraction. We leveraged biomedical domain knowledge as well as canonical linguistic information to represent textual sources more comprehensively. Models based on multiple modalities outperformed those based solely on the linguistic modality. We applied COMMODAR to 14 million PubMed abstracts and extracted 9214 context-specific molecular relations. All corpora, extracted data, evaluation results, and the implementation code are downloadable at https://github.com/jae-hyun-lee/commodar. CCS CONCEPTS: • Computing methodologies~Information extraction • Computing methodologies~Neural networks • Applied computing~Biological networks.

Keywords: Biological context; Literature mining; Natural language processing; Representation learning.
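
The feature augmentation described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (that is in the repository linked above). It only assumes the general pattern suggested by the abstract and by the MGNC-CNN caption of Fig. 2: each modality has its own embedding table and its own convolution filters, and the max-pooled features of all modalities are concatenated into one sentence vector. The function names `conv_max_pool` and `multimodal_features`, the two modality names, and every embedding and filter value below are hypothetical toy choices.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def conv_max_pool(embedded: Sequence[Vector], filt: Sequence[Vector]) -> float:
    """Slide one filter over the token sequence, apply ReLU, max-pool over time."""
    width = len(filt)
    if len(embedded) < width:
        raise ValueError("sentence is shorter than the filter width")
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        activation = sum(
            w * x
            for filt_row, token_vec in zip(filt, window)
            for w, x in zip(filt_row, token_vec)
        )
        best = max(best, activation)
    return best


def multimodal_features(
    tokens: Sequence[str],
    embeddings: Dict[str, Dict[str, Vector]],
    filters: Dict[str, List[List[Vector]]],
) -> List[float]:
    """Concatenate pooled convolution features from every modality."""
    features: List[float] = []
    for name in sorted(embeddings):  # fixed order keeps the feature layout stable
        table = embeddings[name]
        dim = len(next(iter(table.values())))
        # Unknown tokens fall back to a zero vector (an assumption of this sketch).
        embedded = [table.get(tok, [0.0] * dim) for tok in tokens]
        for filt in filters[name]:
            features.append(conv_max_pool(embedded, filt))
    return features


# Toy sentence with two hypothetical modalities.
tokens = ["GENE_A", "activates", "GENE_B"]
embeddings = {
    # "linguistic": stand-in for word embeddings learned from text
    "linguistic": {"GENE_A": [1.0, 0.0], "activates": [0.0, 1.0], "GENE_B": [1.0, 0.0]},
    # "domain": stand-in for a knowledge-based signal (1.0 marks a gene mention)
    "domain": {"GENE_A": [1.0], "activates": [0.0], "GENE_B": [1.0]},
}
filters = {
    "linguistic": [[[1.0, 0.0], [0.0, 1.0]]],  # one filter spanning two tokens
    "domain": [[[1.0]]],                        # one filter spanning one token
}

linguistic_only = multimodal_features(
    tokens,
    {"linguistic": embeddings["linguistic"]},
    {"linguistic": filters["linguistic"]},
)
combined = multimodal_features(tokens, embeddings, filters)

print(linguistic_only)  # [2.0]
print(combined)         # [1.0, 2.0]
```

In this toy setup the combined vector carries one extra feature contributed by the domain modality, which is the sense in which the representation is augmented. A downstream classifier would consume the concatenated vector, and the real filter weights would be learned rather than fixed by hand.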


Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1. The overall process of COMMODAR
Fig. 2. The architecture of MGNC-CNN
Fig. 3. Performance comparison across various embedding combinations and filter sizes

