Adaptive semantic tag mining from heterogeneous clinical research texts
- PMID: 25327613
- PMCID: PMC4441820
- DOI: 10.3414/ME13-01-0130
Abstract
Objectives: To develop an adaptive approach to mine frequent semantic tags (FSTs) from heterogeneous clinical research texts.
Methods: We developed a "plug-n-play" framework that integrates replaceable unsupervised kernel algorithms with formatting, functional, and utility wrappers for FST mining. Temporal information identification and semantic equivalence detection served as two example functional wrappers. We first compared this approach's recall and efficiency for mining FSTs from ClinicalTrials.gov with those of a recently published tag-mining algorithm. We then assessed its adaptability to two other types of clinical research texts, clinical data requests and clinical trial protocols, by comparing the prevalence trends of FSTs across the three text types.
Results: Our approach increased average recall and speed by 12.8% and 47.02%, respectively, over the baseline when mining FSTs from ClinicalTrials.gov, and maintained an overlap in relevant FSTs with the baseline ranging from 76.9% to 100% across FST frequency thresholds. The FSTs saturated when the data size reached 200 documents. Consistent trends in FST prevalence were observed across the three text types as the data size or frequency threshold changed.
Conclusions: This paper contributes an adaptive tag-mining framework that is scalable and adaptable without sacrificing recall. Its component-based architectural design is potentially generalizable and could improve the adaptability of other clinical text mining methods.
Keywords: Medical informatics; clinical trials; component-based architecture; semantic tags; text mining.
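The component-based design described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: every name below (normalize_whitespace, tag_temporal, bigram_kernel, unigram_kernel, mine_frequent_tags) is hypothetical, plain n-gram counting stands in for the unsupervised kernel, and a single regular expression stands in for temporal information identification. The point is only that the kernel can be swapped while the wrappers and the frequency threshold stay unchanged.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, List, Sequence

# Hypothetical component signatures; none of these names come from the paper.
Wrapper = Callable[[str], str]            # formatting or functional wrapper
Kernel = Callable[[str], Iterable[str]]   # replaceable kernel: text -> candidate tags


def normalize_whitespace(text: str) -> str:
    """Formatting wrapper: collapse whitespace and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def tag_temporal(text: str) -> str:
    """Functional wrapper (assumed, simplified): replace simple durations
    such as '6 months' with a placeholder token."""
    return re.sub(r"\b\d+\s+(day|week|month|year)s?\b", "<TEMPORAL>", text)


def _tokens(text: str) -> List[str]:
    # Keep placeholder tokens such as <TEMPORAL> intact.
    return re.findall(r"<\w+>|\w+", text)


def unigram_kernel(text: str) -> Iterable[str]:
    """Stand-in kernel: every token is a candidate tag."""
    return _tokens(text)


def bigram_kernel(text: str) -> Iterable[str]:
    """Stand-in kernel: every pair of adjacent tokens is a candidate tag."""
    toks = _tokens(text)
    return [" ".join(pair) for pair in zip(toks, toks[1:])]


def mine_frequent_tags(
    docs: Sequence[str],
    kernel: Kernel,
    wrappers: Sequence[Wrapper],
    min_doc_freq: float,
) -> Dict[str, float]:
    """Apply the wrappers in order, run the kernel on each document, and keep
    tags whose document frequency (fraction of documents) meets the threshold."""
    doc_counts: Counter = Counter()
    for doc in docs:
        for wrap in wrappers:
            doc = wrap(doc)
        doc_counts.update(set(kernel(doc)))  # count each tag once per document
    n = len(docs)
    return {tag: c / n for tag, c in doc_counts.items() if c / n >= min_doc_freq}


if __name__ == "__main__":
    docs = [
        "History of myocardial infarction within 6 months",
        "Myocardial infarction   within 2 years",
        "No history of diabetes",
    ]
    wrappers = [normalize_whitespace, tag_temporal]
    # Swapping the kernel requires no change to the wrappers or the threshold.
    for kernel in (bigram_kernel, unigram_kernel):
        print(kernel.__name__, sorted(mine_frequent_tags(docs, kernel, wrappers, 0.6)))
```

In this toy setup, "6 months" and "2 years" collapse to the same placeholder before counting, so a tag such as "within <TEMPORAL>" can reach the frequency threshold even though the raw strings differ. This is the kind of effect a functional wrapper is meant to provide.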