Review
Acad Pathol. 2019 Sep 3:6:2374289519873088.
doi: 10.1177/2374289519873088. eCollection 2019 Jan-Dec.

Artificial Intelligence and Machine Learning in Pathology: The Present Landscape of Supervised Methods

Hooman H Rashidi et al. Acad Pathol. 2019.

Abstract

Increased interest in the opportunities provided by artificial intelligence and machine learning has spawned a new field of health-care research. The new tools under development target many aspects of medical practice, including the practice of pathology and laboratory medicine. Optimal design of these powerful tools requires cross-disciplinary literacy, including basic knowledge and understanding of critical concepts that have traditionally been unfamiliar to pathologists and laboratorians. This review provides definitions and basic knowledge of machine learning categories (supervised, unsupervised, and reinforcement learning), introduces the underlying concept of the bias-variance trade-off as an important foundation in supervised machine learning, and discusses approaches to supervised machine learning study design, along with an overview and description of common supervised machine learning algorithms (linear regression, logistic regression, Naïve Bayes, k-nearest neighbor, support vector machine, random forest, and convolutional neural networks).

Keywords: algorithms; artificial intelligence; convolutional neural network; deep learning; k-nearest neighbor; machine learning; random forest; supervised learning; supervised methods; support vector machine; unsupervised learning.

Conflict of interest statement

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

Figure 1.
Overview diagram of machine learning algorithms. Machine learning is a subset of artificial intelligence. This figure illustrates the hierarchy of machine learning algorithms, including supervised, unsupervised, and reinforcement learning techniques. The 2 major categories of supervised learning are classification and regression, which lead to discrete/qualitative and continuous/quantitative targets, respectively.
Figure 2.
Bias-variance trade-off in machine learning. This figure illustrates the trade-off between bias and variance. Training data (green line) often do not completely represent results from the testing phase. Underfitting data are less variable but exhibit a high error rate and high bias (blue box). In contrast, overfitting data result in low bias and high variance (yellow box). The ideal zone lies between over- versus underfitting of data and may not be optimal until several attempts at testing have been made (red line).
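
The trade-off in this caption can be illustrated with a minimal sketch. It assumes a toy one-dimensional data set with 15% label noise and a hand-written k-nearest neighbor classifier; the data, the helper names (make_data, knn_predict, error_rate), and the chosen values of k are hypothetical and do not come from the article. A small k follows the training data closely (low bias, high variance), while a k equal to the training set size ignores the input entirely (high bias, low variance).

```python
import random


def make_data(n, rng, noise=0.15):
    """Toy 1-D data (assumed): label is 1 when x > 0.5, then flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y  # simulate label noise
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def error_rate(train, data, k):
    """Fraction of points in `data` that the k-NN model misclassifies."""
    wrong = sum(1 for x, y in data if knn_predict(train, x, k) != y)
    return wrong / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train = make_data(201, rng)
test = make_data(201, rng)

# k=1 tends to overfit, k=201 (every training point votes) underfits,
# and k=15 is an arbitrary middle value chosen for illustration.
results = {}
for k in (1, 15, 201):
    results[k] = (error_rate(train, train, k), error_rate(train, test, k))
    print(f"k={k:3d}  train error={results[k][0]:.3f}  test error={results[k][1]:.3f}")
```

With k=1 each training point is its own nearest neighbor, so the training error is zero even though the labels are noisy; the gap between that figure and the test error is the overfitting the caption describes. With k=201 every prediction is the majority training label, which is the underfitting extreme.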
Figure 3.
Comparison of popular supervised learning methodologies. This figure illustrates a variety of popular supervised machine learning (ML) methodologies. In the top row, linear regression, logistic regression, and the Naïve Bayes classifier (via TensorFlow) are shown. In the second row, k-nearest neighbor (k-NN), the ensemble decision tree algorithm random forest (RF), and support vector machine (SVM) are compared. Finally, the bottom row illustrates a convolutional neural network evaluating an image. Each image pixel is evaluated (input layer). The network contains several “hidden layers” (yellow circles), whose output is then processed and sent to the output layer (green circles).
Figure 4.
Supervised (labeled) machine learning model study design overview. Steps for the deployment of a supervised machine learning model. From left to right, the figure shows the initial team of multidisciplinary experts defining a study design to address a need. Data are then collected and processed, and the model is trained, tested, validated, and ultimately deployed.
Figure 5.
Stepwise considerations for development and validation of the machine learning (ML) model. The figure describes a very general stepwise approach for development and validation of an ML model. Common metrics used in each step are shown on the right. Step 1 involves assessing the quality and accessibility of the data, followed by step 2 that requires method validation to identify optimal ML model(s). Once optimal ML models have been identified, step 3 involves determining their ability to work with other data sets to assess generalizability. Finally, step 4 involves evaluating the data in more “real-world” conditions to further assess performance and generalizability along with further refinement (go back to step 2) to improve the performance and desirable outcomes.
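
The caption refers to common metrics at each step without naming them in this text. As a hedged illustration only, the sketch below assumes that sensitivity, specificity, and accuracy are among the metrics one might compute when comparing a binary classifier's predictions with reference labels. The function names and the example labels are hypothetical and are not taken from the article.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy (assumed metrics, see lead-in)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        # Guard against division by zero when a class is absent from y_true.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(y_true),
    }


# Hypothetical reference labels and model predictions for eight cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = summarize(y_true, y_pred)
print(metrics)
```

The same function could be applied first to held-out data from the original data set and then to an independent data set, which mirrors the move from method validation to a generalizability check described in steps 2 and 3.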
