Front Oncol. 2023 Aug 3;13:1160383.
doi: 10.3389/fonc.2023.1160383. eCollection 2023.

Machine learning models-based on integration of next-generation sequencing testing and tumor cell sizes improve subtype classification of mature B-cell neoplasms

Yafei Mu et al. Front Oncol.

Abstract

Background: Next-generation sequencing (NGS) panels for mature B-cell neoplasms (MBNs) are widely applied clinically but are not yet routinely used in a way that supports differential diagnosis of subtypes. This study retrospectively analyzed newly diagnosed MBN cases from our laboratory to characterize the mutation landscape in Chinese patients with MBNs and to combine mutational information with machine learning (ML) for clinical applications, especially subtype classification.

Methods: Samples from the Catalogue Of Somatic Mutations In Cancer (COSMIC) database were collected for ML model construction, and cases from our laboratory were used for ML model validation. ML models were constructed with the Random Forest algorithm using five repeats of 10-fold cross-validation. In our laboratory, mutations were detected by NGS, and tumor cell size was confirmed by cell morphology and/or flow cytometry.
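
The resampling scheme named in the Methods, five repeats of 10-fold cross-validation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say whether folds were stratified by subtype or which software was used, so the stratification, the function name repeated_stratified_kfold, the seed, and the toy labels are all hypothetical. A classifier such as a random forest would be fitted on each training split and scored on the matching held-out split.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=10, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    Hypothetical helper: each repeat reshuffles the cases within every
    class and deals them round-robin into n_splits folds, so each fold
    keeps roughly the overall class proportions.
    """
    rng = random.Random(seed)

    # Group case indices by class label (here: MBN subtype).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        offset = 0  # continue dealing where the previous class stopped
        for label in sorted(by_class):
            members = by_class[label][:]
            rng.shuffle(members)
            for pos, idx in enumerate(members):
                folds[(offset + pos) % n_splits].append(idx)
            offset += len(members)

        # Each fold is held out once; the other folds form the training set.
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            yield repeat, fold, train_idx, sorted(test_idx)


# Toy labels for illustration only (not data from the study).
toy_labels = ["subtype_A"] * 40 + ["subtype_B"] * 30 + ["subtype_C"] * 30
splits = list(repeated_stratified_kfold(toy_labels))
print(len(splits))  # 5 repeats x 10 folds
```

Averaging the held-out score over all 50 splits gives a steadier estimate than a single 10-fold run, because each repeat partitions the cases differently.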

Results: A total of 849 newly diagnosed MBN cases from our laboratory were retrospectively identified and included in mutational landscape analyses. Gene mutation patterns were identified across a variety of MBN subtypes, which is important for investigating tumorigenesis in MBNs. A long list of novel mutations was revealed, which is valuable for both functional studies and clinical applications. By combining gene mutation information from NGS with ML, we established ML models that provide valuable information for MBN subtype classification. In total, 8895 cases of 8 MBN subtypes in the COSMIC database were collected and used for ML model construction, and the models were validated on the 849 MBN cases from our laboratory. Of the series of ML models constructed in this study, the most efficient, with an accuracy of 0.87, was based on integration of NGS testing and tumor cell sizes.
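
Accuracy, the metric reported for the best model, is the fraction of cases whose predicted subtype matches the diagnosis; the same ratio can be taken within each subtype. The sketch below assumes this standard definition. The function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one case is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def accuracy_by_subtype(y_true, y_pred):
    """Accuracy computed separately within each true subtype."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {subtype: hits[subtype] / totals[subtype] for subtype in totals}


# Toy labels for illustration only (not data from the study).
y_true = ["A", "A", "A", "A", "B", "B", "B", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "C", "C"]

print(overall_accuracy(y_true, y_pred))
print(accuracy_by_subtype(y_true, y_pred))
```

The incorrectly predicted rate for a subtype is one minus its per-subtype accuracy.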

Conclusions: The ML models were of great significance for differential diagnosis, both across all cases and within individual MBN subtypes. Additionally, using NGS results to assist subtype classification of MBNs by means of ML has positive clinical potential.

Keywords: machine learning (ML); mature B-cell neoplasms (MBNs); next-generation sequencing (NGS); pathological diagnosis; subtype classification.

Conflict of interest statement

Authors YFM, YHM, TC, XF, JY, JL, GL, and SY are employed by the company Guangzhou KingMed Transformative Medicine Institute Co., Ltd., Guangzhou, China. Authors YC, JP, JF, KD, and SY are employed by the company Guangzhou KingMed Center for Clinical Laboratory Co., Ltd., Guangzhou, China. Authors YC, YM, YL, and SY are employed by the company Guangzhou KingMed Diagnostics Group Co., Ltd., Guangzhou, China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Grouping procedure and mutation landscape of 849 cases of mature B-cell neoplasms (MBNs). (A) Four groups of 849 MBN cases subcategorized into initial diagnosis and comprehensive diagnosis. (B) The mutation landscape of the top 20 genes detected in 849 MBN cases. (C) Localization and frequencies of 33 KMT2D variants in 26 CLL/SLL cases. *: Stop codon.

Figure 2
Construction and internal validation of machine learning (ML) models based on the COSMIC database. (A) Model feature selection in COSMIC IA. When the model feature number was 30 in COSMIC IA, the model had the highest efficiency (COSMIC IB). (B) Model feature selection in COSMIC IIA. When the model feature number was 16 in COSMIC IIA, the model had the highest efficiency (COSMIC IIB). (C) Model accuracy of COSMIC I (COSMIC IA and IB) and COSMIC II (COSMIC IIA and IIB) in internal validation.

Figure 3
External validation of machine learning (ML) models based on local cohort. (A) Model accuracy of COSMIC II (COSMIC IIA and IIB) in each case group. (B) Model accuracy of COSMIC II (COSMIC IIA and IIB) by subtype in all cases. (C) Model accuracy of COSMIC II (COSMIC IIA and IIB) by subtype in typical cases (Group A). (D) Model accuracy of COSMIC II (COSMIC IIA and IIB) by subtype in refined cases (Group B1). (E) Model accuracy of COSMIC II (COSMIC IIA and IIB) by subtype in further-diagnosed cases (Group B2).

Figure 4
The proportion of cases incorrectly predicted by COSMIC IIB within each subtype of mature B-cell neoplasms (MBNs) based on comprehensive diagnosis. *: Number of incorrectly predicted cases (Incorrectly predicted rate, Total number of predicted cases).
