Artificial Intelligence and Machine Learning in Pathology: The Present Landscape of Supervised Methods

Hooman H Rashidi¹, Nam K Tran¹, Elham Vali Betts¹, Lydia P Howell¹, Ralph Green¹

PMID: 31523704
PMCID: PMC6727099
DOI: 10.1177/2374289519873088

Review

Hooman H Rashidi et al. Acad Pathol. 2019.

Acad Pathol. 2019 Sep 3:6:2374289519873088.

doi: 10.1177/2374289519873088. eCollection 2019 Jan-Dec.

Affiliation

¹ Department of Pathology and Laboratory Medicine, University of California Davis, School of Medicine, Davis, CA, USA.

Abstract

Increased interest in the opportunities provided by artificial intelligence and machine learning has spawned a new field of health-care research. The new tools under development are targeting many aspects of medical practice, including changes to the practice of pathology and laboratory medicine. Optimal design of these powerful tools requires cross-disciplinary literacy, including basic knowledge and understanding of critical concepts that have traditionally been unfamiliar to pathologists and laboratorians. This review provides definitions and basic knowledge of machine learning categories (supervised, unsupervised, and reinforcement learning), introduces the underlying concept of the bias-variance trade-off as an important foundation in supervised machine learning, and discusses approaches to supervised machine learning study design. It also gives an overview and description of common supervised machine learning algorithms (linear regression, logistic regression, Naive Bayes, k-nearest neighbor, support vector machine, random forest, convolutional neural networks).

Keywords: algorithms; artificial intelligence; convolutional neural network; deep learning; k-nearest neighbor; machine learning; random forest; supervised learning; supervised methods; support vector machine; unsupervised learning.

Conflict of interest statement

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figures

**Figure 1.**
Overview diagram of machine learning algorithms. Machine learning is a subset of artificial intelligence. This figure illustrates the hierarchy of different machine learning algorithms including supervised versus unsupervised versus reinforcement learning techniques. The 2 major categories of supervised learning are classification and regression which lead to discrete/qualitative and continuous/quantitative targets, respectively.

**Figure 2.**
Bias-variance trade-off in machine learning. This figure illustrates the trade-off between bias and variance. Performance on the training data (green line) often does not fully represent results from the testing phase. Underfit models are less variable but exhibit a high error rate and high bias (blue box). In contrast, overfit models have low bias and high variance (yellow box). The ideal zone lies between overfitting and underfitting and may not be reached until several attempts at testing have been made (red line).
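
The trade-off described in Figure 2 can be illustrated with a small numerical sketch. The example below is not taken from the article: it assumes a synthetic noisy sine curve and uses k-nearest neighbor regression, where a small k gives a flexible, high-variance fit and a large k gives a rigid, high-bias fit. The names `noisy_sine`, `knn_predict`, `mse`, and `errors`, and all data values, are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def noisy_sine(x):
    # Assumed ground truth: a sine curve plus Gaussian noise.
    return math.sin(2 * math.pi * x) + rng.gauss(0, 0.3)

# Training points on a regular grid; test points halfway between them.
x_train = [i / 20 for i in range(20)]
y_train = [noisy_sine(x) for x in x_train]
x_test = [(i + 0.5) / 20 for i in range(20)]
y_test = [noisy_sine(x) for x in x_test]

def knn_predict(x, k):
    # Average the targets of the k training points closest to x.
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x))[:k]
    return sum(y_train[i] for i in nearest) / k

def mse(xs, ys, k):
    # Mean squared error of the k-NN predictions against the targets.
    return sum((knn_predict(x, k) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Map each k to (training error, test error).
errors = {k: (mse(x_train, y_train, k), mse(x_test, y_test, k)) for k in (1, 5, 20)}
for k, (train_err, test_err) in errors.items():
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With k = 1 the model reproduces every training point exactly (training error of 0), which corresponds to the overfitting end of the figure. With k = 20 every prediction is the overall training mean, which corresponds to the underfitting end. Only the test error can show which intermediate k generalizes best.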

**Figure 3.**
Comparison of popular supervised learning methodologies. This figure illustrates a variety of popular supervised machine learning (ML) methodologies. In the top row, linear regression, logistic regression, and Naïve Bayes classifier (via TensorFlow) are shown. In the second row, k-nearest neighbor (k-NN), the ensemble decision tree algorithm random forest (RF), and support vector machine (SVM) are compared. Finally, the bottom row illustrates a convolutional neural network evaluating an image. Each image pixel is evaluated in the input layer. The network contains several “hidden layers” (yellow circles), whose processed results are sent to the output layer (green circles).
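
As a rough illustration of the operation that gives the convolutional neural network in Figure 3 its name, the sketch below slides a small kernel over a toy image and applies a ReLU activation. It is not the authors' implementation: the 4 × 4 image, the hand-picked edge kernel, and the function names `conv2d_valid` and `relu` are assumptions for illustration. In a trained network the kernel weights would be learned from labeled data rather than chosen by hand.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding and stride 1.

    Like most deep learning libraries, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    # Replace negative responses with zero, keep positive ones.
    return [[max(0, v) for v in row] for row in feature_map]

# Toy 4x4 "image": dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical kernel that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d_valid(image, edge_kernel))
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is nonzero only in the middle column, where the kernel straddles the boundary between the dark and bright halves of the image.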

**Figure 4.**
Supervised (labeled) machine learning model study design overview. Steps for the deployment of a supervised machine learning model. From left to right, the figure shows the initial team of multidisciplinary experts defining a study design to address a need. Data are then collected and processed, and the model is trained, tested, validated, and ultimately deployed.
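
The training, testing, and validation stages in Figure 4 depend on keeping separate subsets of the labeled data. A minimal sketch of one way to partition a data set is shown below. The 60/20/20 proportions, the function name `split_dataset`, and the integer case identifiers are assumptions for illustration and do not come from the article.

```python
import random

def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the samples and split them into train, validation, and test lists.

    Whatever remains after the train and validation portions becomes the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical identifiers for 10 labeled cases.
case_ids = list(range(10))
train, val, test = split_dataset(case_ids)
print(len(train), len(val), len(test))  # 6 2 2
```

Each case lands in exactly one subset, so performance measured on the held-out subsets is not inflated by cases the model has already seen during training.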

**Figure 5.**
Stepwise considerations for development and validation of the machine learning (ML) model. The figure describes a very general stepwise approach for development and validation of an ML model. Common metrics used in each step are shown on the right. Step 1 involves assessing the quality and accessibility of the data, followed by step 2, which requires method validation to identify optimal ML model(s). Once optimal ML models have been identified, step 3 involves determining their ability to work with other data sets to assess generalizability. Finally, step 4 involves evaluating the data in more “real-world” conditions to further assess performance and generalizability, along with further refinement (returning to step 2) to improve performance and outcomes.
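
The caption for Figure 5 does not list the metrics shown in the figure. As an assumption, the sketch below computes three metrics commonly reported for a binary classifier (accuracy, sensitivity, and specificity) from true and predicted labels. The function names and the label vectors are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def summarize(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of true positives detected
        "specificity": tn / (tn + fp),  # fraction of true negatives detected
    }

# Hypothetical reference labels and model predictions for 8 cases.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

metrics = summarize(y_true, y_pred)
print(metrics)  # {'accuracy': 0.75, 'sensitivity': 0.75, 'specificity': 0.75}
```

Computing the same metrics on an independent data set (step 3) and again under more realistic conditions (step 4) gives a direct comparison of how well the model generalizes.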
