Review
J Pers Med. 2022 Aug 24;12(9):1359.
doi: 10.3390/jpm12091359.

Artificial Intelligence-Based Medical Data Mining

Amjad Zia et al. J Pers Med.

Abstract

Understanding published unstructured textual data with traditional text-mining approaches and tools is becoming increasingly difficult as the number of open-source electronic publications grows rapidly. The application of data-mining techniques in the medical sciences is an emerging trend; however, traditional text-mining approaches cannot cope with the current surge in the volume of published data. Artificial intelligence-based text-mining tools are therefore being developed and used to process large volumes of data and to uncover hidden features and correlations in the data. This review explains how artificial intelligence-based data-mining technology is being used to analyze medical data. We also describe a standard data-mining process based on CRISP-DM (Cross-Industry Standard Process for Data Mining) and the most common tools and libraries available for each step of medical data mining.

Keywords: artificial intelligence; healthcare information; machine learning; medical data; text mining.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Cross-Industry Standard Process for Data Mining (CRISP-DM), adapted from the webpage of the Data Science Process Alliance [17] (www.datascience-pm.com/crisp-dm-2/, accessed on 16 April 2022). The circular nature of the data mining process is symbolized by the outer circle, while the arrows that connect the phases show the most essential and common dependencies.
Figure 2. Comparison between web crawling and web scraping.
Figure 3. Layout for access restrictions.
Figure 4. Steps for data cleaning.
Figure 5. Predictive and descriptive data mining tasks.

References

    1. Sumathy K.L., Chidambaram M. Text Mining: Concepts, Applications, Tools and Issues—An Overview. Int. J. Comput. Appl. 2013;80:29–32. doi: 10.5120/13851-1685.
    2. Cios K.J., Moore G.W. Uniqueness of medical data mining. Artif. Intell. Med. 2002;26:1–24. doi: 10.1016/S0933-3657(02)00049-0.
    3. Yang Y., Li R., Xiang Y., Lin D., Yan A., Chen W., Li Z., Lai W., Wu X., Wan C., et al. Standardization of Collection, Storage, Annotation, and Management of Data Related to Medical Artificial Intelligence. Intell. Med. 2021 doi: 10.1016/j.imed.2021.11.002.
    4. Thorpe J.H., Gray E.A. Big data and public health: Navigating privacy laws to maximize potential. Public Health Rep. 2015;130:171–175. doi: 10.1177/003335491513000211.
    5. McGuire A.L., Beskow L.M. Informed consent in genomics and genetic research. Annu. Rev. Genom. Hum. Genet. 2010;11:361–381. doi: 10.1146/annurev-genom-082509-141711.
