Integrating Artificial Intelligence for Advancing Multiple-Cancer Early Detection via Serum Biomarkers: A Narrative Review

Hsin-Yao Wang¹,²,³, Wan-Ying Lin¹, Chenfei Zhou³, Zih-Ang Yang¹, Sriram Kalpana¹, Michael S Lebowitz³

Affiliations

¹ Department of Laboratory Medicine, Linkou Chang Gung Memorial Hospital, Taoyuan 33343, Taiwan.
² School of Medicine, National Tsing Hua University, Hsinchu 300044, Taiwan.
³ 20/20 GeneSystems, Gaithersburg, MD 20877, USA.

PMID: 38473224
PMCID: PMC10931531
DOI: 10.3390/cancers16050862

Review

Hsin-Yao Wang et al. Cancers (Basel). 2024 Feb 21;16(5):862. doi: 10.3390/cancers16050862.

Abstract

The concept of multi-cancer early detection (MCED) and the policies surrounding it have gained significant attention from governments worldwide in recent years. With artificial intelligence (AI) technology advancing rapidly, integrating MCED with AI has become a prevailing trend and has given rise to many MCED AI products. However, because both the detection targets and the AI technologies are heterogeneous, MCED AI products remain highly diverse. Detection targets include protein biomarkers, cell-free DNA, or combinations of the two. AI models are trained on different kinds of data, including case-control study datasets and real-world cancer screening datasets, and are validated with various techniques, such as cross-validation, location-wise validation, and time-wise validation. All of these factors significantly affect the predictive efficacy of MCED AIs. Once an AI model has been developed, deploying it in clinical practice presents numerous challenges, including presenting the predictive reports, identifying the potential locations and types of tumors, and addressing cancer-related information, such as clinical follow-up and treatment. This review examines several mature MCED AI products currently on the market, breaking them down by serum biomarker detection, MCED AI training/validation, and clinical application. It highlights the challenges that existing MCED AI products encounter at each of these stages and offers insights into the continued development of, and obstacles within, the field of MCED AI.

Keywords: AI; multi-cancer early detection (MCED); serum biomarkers.

Conflict of interest statement

H.-Y. Wang has a financial interest in the licensed technology related to this work, as it has been licensed to 20/20 GeneSystems. Additionally, C. Zhou and M. Lebowitz are employed by 20/20 GeneSystems.

Figures

**Figure 1**
Illustrative scheme of MCED, serum biomarkers, and MCED AI.

**Figure 2**
Difference in risk score distributions between cancer cases and normal (healthy) cases in (A) the case-control study dataset and (B) the real-world cancer screening dataset. In a case-control study dataset, cancer cases and noncancer cases are well defined at the time of enrollment, so their risk score distributions are clearly separated. In this case, any cutoff value between Cutoff.1 and Cutoff.2 yields perfect predictive performance. By contrast, the risk score distributions of cancer and noncancer cases in the real-world cancer screening dataset overlap more, and the range of acceptable diagnostic cutoffs is much narrower than in the case-control study. The illustrative plots show why MCED AI models trained on case-control study data tend to have suboptimal predictive performance in real-world cancer screening.
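The contrast described in the Figure 2 caption can be sketched with a few lines of Python. This is a minimal illustration, not the authors' method: the scores below are hypothetical toy values invented for the example, and the function names `perfect_cutoff_window` and `sensitivity_specificity` are assumptions introduced here. A case is called positive when its score is at or above the cutoff.

```python
from typing import Optional, Sequence, Tuple


def perfect_cutoff_window(
    cancer: Sequence[float], noncancer: Sequence[float]
) -> Optional[Tuple[float, float]]:
    """Return (low, high) such that any cutoff with low < cutoff <= high
    separates the two groups perfectly, or None if the scores overlap."""
    low = max(noncancer)   # highest noncancer score
    high = min(cancer)     # lowest cancer score
    return (low, high) if low < high else None


def sensitivity_specificity(
    cancer: Sequence[float], noncancer: Sequence[float], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when score >= cutoff is called positive."""
    sens = sum(s >= cutoff for s in cancer) / len(cancer)
    spec = sum(s < cutoff for s in noncancer) / len(noncancer)
    return sens, spec


# Hypothetical toy scores (not taken from the review).
case_control_cancer = [0.80, 0.85, 0.90, 0.95]
case_control_noncancer = [0.05, 0.10, 0.15, 0.20]
screening_cancer = [0.35, 0.55, 0.70, 0.90]
screening_noncancer = [0.05, 0.20, 0.40, 0.60]

cc_window = perfect_cutoff_window(case_control_cancer, case_control_noncancer)
rw_window = perfect_cutoff_window(screening_cancer, screening_noncancer)
cc_metrics = sensitivity_specificity(case_control_cancer, case_control_noncancer, 0.5)
rw_metrics = sensitivity_specificity(screening_cancer, screening_noncancer, 0.5)
```

In the toy case-control data a wide window of perfect cutoffs exists, whereas the overlapping screening scores admit no perfect cutoff, so every choice trades sensitivity against specificity.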

**Figure 3**
Special considerations in developing MCED AI models. (A) The paucity of cancer cases in the real-world cancer screening scenario. In a real-world cancer screening dataset, the ratio of cancer cases to noncancer cases is typically around 1:100, which is extremely unbalanced for training an AI model. Oversampling cancer cases or undersampling noncancer cases are commonly used data processing methods for creating a balanced training dataset. (B) Data processing for training, validating, and independently testing MCED AI models. In the undersampling strategy, stratified sampling can be used to create a balanced training dataset. By contrast, for both validation and independent testing, the ratio of cancer to noncancer cases should be kept the same as in the original dataset so that the diagnostic metrics are estimated accurately. M_c: cancer cases; m_c: cancer cases in 5-fold split datasets; M_b: noncancer cases; m_b: noncancer cases in 5-fold split datasets that are sampled from M_b by using stratified random sampling; N_c and N_b: cancer cases and noncancer cases in an independent dataset. (C) Different approaches for the independent testing of MCED AI models. Location-wise independent testing can be used to test the generalizability of an AI model across different locations. Time-wise independent testing can be used to repeatedly test the robustness of an AI model over different periods of time.
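The data handling described in panels (B) and (C) might look roughly like the following standard-library Python sketch. It rests on several assumptions that are not in the review: labels are coded 1 for cancer and 0 for noncancer, plain random undersampling stands in for the stratified sampling mentioned in the caption (the strata are not specified there), and the function names `kfold_with_undersampled_training` and `time_wise_split` are hypothetical.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def kfold_with_undersampled_training(
    labels: Sequence[int], k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs for k folds.

    Training indices are balanced by undersampling noncancer cases;
    validation indices keep the original cancer:noncancer ratio.
    """
    rng = random.Random(seed)
    cancer = [i for i, y in enumerate(labels) if y == 1]
    noncancer = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(cancer)
    rng.shuffle(noncancer)
    # Deal each class round-robin so every fold keeps the class ratio.
    folds = [cancer[f::k] + noncancer[f::k] for f in range(k)]
    for f in range(k):
        val_idx = folds[f]
        pool = [i for g in range(k) if g != f for i in folds[g]]
        train_cancer = [i for i in pool if labels[i] == 1]
        train_noncancer = [i for i in pool if labels[i] == 0]
        # Assumption: simple random undersampling to a 1:1 ratio.
        sampled = rng.sample(train_noncancer, len(train_cancer))
        yield train_cancer + sampled, val_idx


def time_wise_split(
    years: Sequence[int], first_test_year: int
) -> Tuple[List[int], List[int]]:
    """Earlier records form the development set, later ones the test set."""
    dev = [i for i, y in enumerate(years) if y < first_test_year]
    test = [i for i, y in enumerate(years) if y >= first_test_year]
    return dev, test


# Toy dataset at roughly the 1:100 ratio mentioned in the caption.
labels = [1] * 10 + [0] * 1000
splits = list(kfold_with_undersampled_training(labels))
dev_idx, test_idx = time_wise_split([2018, 2019, 2020, 2021], 2020)
```

Each training split here holds 8 cancer and 8 noncancer cases, while each validation fold keeps 2 cancer and 200 noncancer cases, so metrics such as positive predictive value are estimated at the screening prevalence. A location-wise split would follow the same pattern as `time_wise_split`, keyed on site instead of year.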

**Figure 4**
Challenges in implementing MCED AIs. Beyond early cancer detection, an MCED AI product should also provide additional clinically relevant information so that it can be integrated successfully into current clinical workflows for diagnosing and treating cancers.
