Front Surg. 2023 Mar 17:10:1095545.
doi: 10.3389/fsurg.2023.1095545. eCollection 2023.

A machine learning-based model for predicting the risk of early-stage inguinal lymph node metastases in patients with squamous cell carcinoma of the penis

Li Ding et al. Front Surg.

Abstract

Objective: Inguinal lymph node metastasis (ILNM) is significantly associated with poor prognosis in patients with squamous cell carcinoma of the penis (SCCP). Patient prognosis could be improved if the probability of ILNM incidence could be accurately predicted at an early stage. We developed a predictive model based on machine learning combined with big data to achieve this.

Methods: Data on patients diagnosed with SCCP were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program Research Data. By combining variables that represented the patients' clinical characteristics, we applied five machine learning algorithms to create predictive models: logistic regression, eXtreme Gradient Boosting, Random Forest, Support Vector Machine, and k-Nearest Neighbor. Model performance was evaluated with ten-fold cross-validation receiver operating characteristic (ROC) curves, from which the area under the curve of each of the five models was calculated as a measure of predictive accuracy. Decision curve analysis was conducted to estimate the clinical utility of the models. An external validation cohort of 74 SCCP patients was selected from the Affiliated Hospital of Xuzhou Medical University (February 2008 to March 2021).

Results: A total of 1,056 patients with SCCP from the SEER database were enrolled as the training cohort, of whom 164 (15.5%) developed early-stage ILNM. In the external validation cohort, 16.2% of patients developed early-stage ILNM. Multivariate logistic regression showed that tumor grade, inguinal lymph node dissection, radiotherapy, and chemotherapy were independent predictors of early-stage ILNM risk. The model based on the eXtreme Gradient Boosting algorithm showed stable and efficient prediction performance in both the training and external validation groups.

Conclusion: The machine learning (ML) model based on the eXtreme Gradient Boosting (XGB) algorithm has high predictive effectiveness and may be used to predict early-stage ILNM risk in SCCP patients. It may therefore show promise in clinical decision-making.

Keywords: inguinal lymph node metastases; machine learning algorithms; penis cancer; prediction model; real-world research; squamous cell carcinoma.
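
The evaluation described in the Methods (ten-fold cross-validation ROC curves summarized by the area under the curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the interleaved fold assignment, and the toy data are assumptions made for illustration, and the sketch assumes that out-of-fold risk scores from some fitted model are already available.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the pairwise ranking statistic."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Assign indices 0..n-1 to k held-out folds (interleaved; an assumption)."""
    return [list(range(i, n, k)) for i in range(k)]


def mean_fold_auc(labels: Sequence[int], scores: Sequence[float], k: int = 10) -> float:
    """Average the AUC over k held-out folds of out-of-fold scores."""
    per_fold = []
    for fold in k_fold_indices(len(labels), k):
        fold_labels = [labels[i] for i in fold]
        fold_scores = [scores[i] for i in fold]
        per_fold.append(auc(fold_labels, fold_scores))
    return sum(per_fold) / len(per_fold)


# Toy data (hypothetical): 1 = early-stage ILNM, 0 = no ILNM.
toy_labels = [0, 0, 1, 1, 0, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7, 0.6, 0.4]
toy_mean_auc = mean_fold_auc(toy_labels, toy_scores, k=2)
```

In practice the folds would usually be shuffled and stratified by outcome so that every fold contains both ILNM and non-ILNM cases; the interleaved split here only keeps the sketch short.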

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
The results of Spearman correlation analysis between all the variables. The heat map shows the correlation between SCCP patients’ clinical and pathological features.
Figure 2
Kaplan-Meier curve of cancer-specific survival in SCCP patients.
Figure 3
(A–F) Ten-fold cross-validation ROC curves of five ML models in the training cohort. LR, logistic regression; XGB, eXtreme gradient boosting; RF, random forest; SVM, support vector machine; KNN, k-nearest neighbor. (G) Decision curve analysis graph showing the net benefit against threshold probabilities based on decisions from model outputs. The curves labeled “All” represent the prediction that all patients would progress to ILNM, and the curves labeled “None” represent the prediction that no patients would progress to ILNM.
Figure 4
The ROC curve of five models in the external validation cohort.
Figure 5
The XGB model was used to calculate the importance of each feature. The bar chart depicts the relative significance of the variables.
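
The decision curve analysis in Figure 3 (G) plots net benefit against threshold probability. A minimal sketch of that quantity follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names and toy data are assumptions for illustration and are not taken from the study.

```python
from typing import Sequence


def net_benefit(labels: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_all(labels: Sequence[int], threshold: float) -> float:
    """The "All" reference curve: every patient is predicted to develop ILNM."""
    return net_benefit(labels, [1.0] * len(labels), threshold)


def net_benefit_none() -> float:
    """The "None" reference curve: no patient is predicted to develop ILNM."""
    return 0.0


# Toy data (hypothetical): 1 = early-stage ILNM, 0 = no ILNM.
toy_labels = [1, 0, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.6, 0.7, 0.1]
model_nb = net_benefit(toy_labels, toy_probs, threshold=0.5)
all_nb = net_benefit_all(toy_labels, threshold=0.5)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the "All" and the "None" curves.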
