Review
Cancers (Basel). 2024 Feb 21;16(5):862.
doi: 10.3390/cancers16050862.

Integrating Artificial Intelligence for Advancing Multiple-Cancer Early Detection via Serum Biomarkers: A Narrative Review

Hsin-Yao Wang et al. Cancers (Basel).

Abstract

The concept and policies of multi-cancer early detection (MCED) have gained significant attention from governments worldwide in recent years. In the era of burgeoning artificial intelligence (AI) technology, the integration of MCED with AI has become a prevailing trend, giving rise to a plethora of MCED AI products. However, because both the detection targets and the AI technologies are heterogeneous, MCED AI products remain highly diverse. Detection targets include protein biomarkers, cell-free DNA, or combinations of these biomarkers. AI models are trained on different kinds of data, including datasets from case-control studies and real-world cancer screening datasets, and are validated with various techniques, such as cross-validation, location-wise validation, and time-wise validation. All of these factors significantly affect the predictive efficacy of MCED AIs. After AI model development is complete, deploying MCED AIs in clinical practice presents numerous challenges, including presenting the predictive reports, identifying the potential locations and types of tumors, and addressing cancer-related information, such as clinical follow-up and treatment. This study reviews several mature MCED AI products currently on the market, examining their components across serum biomarker detection, MCED AI training/validation, and clinical application. This review illuminates the challenges encountered by existing MCED AI products across these stages, offering insights into the continued development of, and obstacles within, the field of MCED AI.

Keywords: AI; multi-cancer early detection (MCED); serum biomarkers.

Conflict of interest statement

H.-Y. Wang has a financial interest in the licensed technology related to this work, as it has been licensed to 20/20 GeneSystems. Additionally, C. Zhou and M. Lebowitz are employed by 20/20 GeneSystems.

Figures

Figure 1
Illustrative scheme of MCED, serum biomarkers, and MCED AI.
Figure 2
Difference in probability distributions between cancer cases and normal (healthy) cases in (A) the case-control study dataset and (B) the real-world cancer screening dataset. In a case-control study dataset, cancer cases and noncancer cases are well defined at the time of enrollment, so their risk score distributions are clearly separated. In this case, any cutoff value between Cutoff.1 and Cutoff.2 yields perfect predictive performance. By contrast, the risk score distributions of cancer and noncancer cases in the real-world cancer screening dataset overlap more, and the range of optimal diagnostic cutoffs is much narrower than in the case-control study. The illustrative plots show why MCED AI models trained on case-control data would have suboptimal predictive performance in real-world cancer screening.
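The cutoff argument in this caption can be made concrete with a small sketch. The risk scores below are invented toy values, not data from the review, and the helper names `sensitivity_specificity` and `perfect_cutoff_gap` are hypothetical. The sketch assumes a sample is called "cancer" when its score is at or above the cutoff.

```python
# Toy illustration only: all scores are made up for this example.

def sensitivity_specificity(noncancer, cancer, cutoff):
    """Call a sample 'cancer' when its risk score is >= cutoff."""
    true_pos = sum(score >= cutoff for score in cancer)
    true_neg = sum(score < cutoff for score in noncancer)
    return true_pos / len(cancer), true_neg / len(noncancer)

def perfect_cutoff_gap(noncancer, cancer):
    """Return (low, high) if the two groups do not overlap, else None.

    Any cutoff with low < cutoff <= high classifies every sample correctly.
    """
    low, high = max(noncancer), min(cancer)
    return (low, high) if low < high else None

# (A) Case-control style: groups are well separated.
cc_noncancer = [0.05, 0.10, 0.12, 0.18, 0.22]
cc_cancer = [0.78, 0.81, 0.88, 0.90, 0.95]

# (B) Screening style: score distributions overlap.
sc_noncancer = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55]
sc_cancer = [0.40, 0.50, 0.60, 0.75]

print("case-control gap:", perfect_cutoff_gap(cc_noncancer, cc_cancer))
print("screening gap:", perfect_cutoff_gap(sc_noncancer, sc_cancer))
print("case-control at 0.5:", sensitivity_specificity(cc_noncancer, cc_cancer, 0.5))
print("screening at 0.5:", sensitivity_specificity(sc_noncancer, sc_cancer, 0.5))
```

In the separated case a whole interval of cutoffs is error-free, which plays the role of the span between Cutoff.1 and Cutoff.2. In the overlapping case no such interval exists, so every cutoff trades sensitivity against specificity.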
Figure 3
Special considerations in developing MCED AI models. (A) The paucity of cancer cases in the real-world cancer screening scenario. In a real-world cancer screening dataset, the ratio of cancer cases to noncancer cases is typically around 1:100, which is extremely unbalanced for training an AI model. Oversampling cancer cases and undersampling noncancer cases are commonly used data processing methods for creating a balanced dataset for AI model training. (B) Data processing for training, validating, and independently testing MCED AI models. In the undersampling strategy, stratified sampling can be used to create a balanced training dataset. By contrast, the ratio of cancer to noncancer cases should be kept the same as in the original dataset for both validation and independent testing in order to estimate diagnostic metrics accurately. Mc: cancer cases; mc: cancer cases in 5-fold split datasets; Mb: noncancer cases; mb: noncancer cases in 5-fold split datasets that are sampled from Mb by using stratified random sampling; Nc and Nb: cancer cases and noncancer cases in an independent dataset. (C) Different approaches for the independent testing of MCED AI models. Location-wise independent testing can be used to test the generalizability of an AI model across different locations. Time-wise independent testing can be used to recurrently test the robustness of an AI model in different periods of time.
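The data processing described in panel (B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the labels are synthetic (50 cancer and 5000 noncancer cases, matching the roughly 1:100 ratio), the function names `stratified_folds` and `undersample` are hypothetical, and noncancer cases are undersampled by simple random sampling because the caption does not specify the stratification variables.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds that each keep the original class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):  # 0 = noncancer, 1 = cancer
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin
    return folds

def undersample(indices, labels, seed=0):
    """Keep all cancer cases and an equal-sized random subset of noncancer cases."""
    rng = random.Random(seed)
    cancer = [i for i in indices if labels[i] == 1]
    noncancer = [i for i in indices if labels[i] == 0]
    return cancer + rng.sample(noncancer, len(cancer))

# Synthetic labels with a 1:100 cancer-to-noncancer ratio (assumed sizes).
labels = [1] * 50 + [0] * 5000

folds = stratified_folds(labels, k=5)
splits = []
for f, val_idx in enumerate(folds):
    train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
    balanced_train = undersample(train_idx, labels, seed=f)  # balanced, for training
    splits.append((balanced_train, val_idx))                 # validation left untouched

for f, (train, val) in enumerate(splits):
    n_train_cancer = sum(labels[i] for i in train)
    n_val_cancer = sum(labels[i] for i in val)
    print(f, len(train), n_train_cancer, len(val), n_val_cancer)
```

Only the training portion is rebalanced. Each validation fold keeps the original class ratio, so metrics computed on it reflect the prevalence expected in screening.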
Figure 4
Challenges for the implementation of MCED AIs. Besides early cancer detection, an MCED AI product should also provide a wide range of clinically relevant information so that it can be integrated into current clinical workflows for diagnosing and treating cancers.
