Review
Genome Res. 2025 Jan 22;35(1):1-19.
doi: 10.1101/gr.278413.123.

Artificial intelligence and machine learning in cell-free-DNA-based diagnostics

W H Adrian Tsui et al. Genome Res.

Abstract

The discovery of circulating fetal and tumor cell-free DNA (cfDNA) molecules in plasma has opened up tremendous opportunities in noninvasive diagnostics, such as the detection of fetal chromosomal aneuploidies and cancers, and in posttransplantation monitoring. The advent of high-throughput sequencing technologies has made it possible to scrutinize the characteristics of cfDNA molecules, giving rise to the fields of cfDNA genetics, epigenetics, transcriptomics, and fragmentomics and providing a plethora of biomarkers. Machine learning (ML) and artificial intelligence (AI) technologies, which are known for their ability to integrate high-dimensional features, have recently been applied to the field of liquid biopsy. In this review, we highlight various AI and ML approaches in cfDNA-based diagnostics. We first introduce the biology of cfDNA and basic concepts of ML and AI technologies. We then discuss selected examples of ML- or AI-based applications in noninvasive prenatal testing and cancer liquid biopsy. These applications include the deduction of fetal DNA fraction, plasma DNA tissue mapping, and cancer detection and localization. Finally, we offer perspectives on the future direction of using ML and AI technologies to leverage cfDNA fragmentation patterns in terms of methylomic and transcriptional investigations.

Figures

Figure 1.
ML algorithms exploit cfDNA features for applications in NIPT, cancer liquid biopsies, and emerging areas of cfDNA biology. (Upper left) SeqFF uses elastic net to process local coverage in 50 kb genome-wide bins for fetal fraction estimation. (Upper right) Natera uses DNN to process linkage information between SNPs for microdeletion detection. (Middle left) SVM was used to detect lung cancer using the methylation status of selected differentially methylated regions (DMRs). (Middle right) The DELFI algorithm feeds local size and coverage profiles into a gradient boosting model to achieve multicancer detection. (Lower left) In one implementation of FRAGMA, a CNN is used to analyze the cleavage around a differentially methylated CpG site to determine CpG methylation status. (Lower right) NMF deconvolution of the frequencies of 256 5′ 4-mer end motifs yields “founder” profiles of potential biological significance.
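The 256 end motifs in the lower-right panel are the 4^4 possible 4-mers at the 5′ end of a fragment. As a minimal sketch of how such a frequency profile (the input to a deconvolution such as NMF) could be tabulated, assuming fragments are already available as 5′-to-3′ sequence strings: the function name `end_motif_profile` and the toy sequences are illustrative and not taken from the review, and only one end of each fragment is counted here.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# All 4**4 = 256 possible 4-mers, in a fixed lexicographic order.
ALL_4MERS = ["".join(p) for p in product(BASES, repeat=4)]


def end_motif_profile(fragments):
    """Return the frequency of each 5' 4-mer end motif across fragments.

    `fragments` is an iterable of DNA strings written 5'->3'.
    Fragments shorter than 4 nt, or whose first four bases contain a
    non-ACGT character (e.g. N), are skipped.
    (Hypothetical helper for illustration only.)
    """
    counts = Counter()
    for seq in fragments:
        motif = seq[:4].upper()
        if len(motif) == 4 and set(motif) <= set(BASES):
            counts[motif] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no usable fragments")
    # Motifs never observed get frequency 0.0, so the profile always has 256 entries.
    return {m: counts[m] / total for m in ALL_4MERS}


# Toy input: made-up sequences, not real data.
toy_fragments = ["CCCAGGTT", "CCCATTGA", "AAAACCGT", "NNACGTAC", "ccagTTAA"]
profile = end_motif_profile(toy_fragments)
```

Stacking one such 256-entry profile per sample gives a nonnegative samples-by-motifs matrix, which is the kind of input a nonnegative matrix factorization expects.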
Figure 2.
An example of AI-based technology for direct methylation analysis and its clinical applications on the basis of the holistic kinetic (HK) model. (Left) The principle of single-molecule real-time (SMRT) sequencing. A DNA polymerase located in a zero-mode waveguide (ZMW) incorporates nucleotides labeled with different fluorophores along a circular DNA template. During DNA polymerization, the kinetics of nucleotide incorporation, including inter-pulse duration (IPD) and pulse width (PW), are affected by base modifications. (Middle) The HK model, an AI-based method that employs a convolutional neural network (CNN). This model is trained using combined kinetic signals and sequence context from a large number of measurement windows and is applied to the prediction of cytosine methylation status. The methylation probability for each CpG site, ranging from zero to one, is computed using a sigmoid function at the output layer. (Right) Selected clinical applications of the HK model: (1) deducing the placenta-derived cfDNA from the methylation pattern of long cfDNA molecules, opening up possibilities of developing generic approaches for monogenic diseases, and (2) detecting patients with cancers and determining the tumor origin of cancer (e.g., hepatocellular carcinoma [HCC]) according to methylation patterns of long cfDNA determined by the HK model.
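The sigmoid output step mentioned in the caption can be illustrated with a short sketch. This is not the HK model itself: the raw scores, the 0.5 decision threshold, and the function name `call_methylation` are assumptions made for illustration. It shows only how a sigmoid maps an unbounded network output to a value between zero and one.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def call_methylation(score, threshold=0.5):
    """Turn a raw output-layer score into (probability, methylated?).

    The 0.5 threshold is a hypothetical default, not a value from the review.
    """
    p = sigmoid(score)
    return p, p >= threshold


# Made-up raw scores standing in for the final-layer output at three CpG sites.
scores = [-4.0, 0.0, 3.0]
calls = [call_methylation(s) for s in scores]
```

A score of zero maps to a probability of exactly 0.5, negative scores map below it, and positive scores map above it.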
Figure 3.
A timeline of applications of AI/ML algorithms to cfDNA analyses. Categories, from top to bottom: (orange) tissue-of-origin analyses and cancer liquid biopsies; (blue) noninvasive prenatal testing, including fetal fraction estimations; (green) fragmentomic-based methylomic and transcriptomic analyses; and (purple) multimodal cfDNA analyses, which leverage different cfDNA features in an integrated model. The name or a brief description of the technology is listed in bold, with the AI/ML algorithms used given below.
