Cancers (Basel). 2025 Jul 10;17(14):2305.
doi: 10.3390/cancers17142305.

Temporal Trends and Patient Stratification in Lung Cancer: A Comprehensive Clustering Analysis from Timis County, Romania

Versavia Maria Ancusa et al. Cancers (Basel).

Abstract

Background/Objectives: Lung cancer remains a major cause of cancer-related mortality, with regional differences in incidence and patient characteristics. This study aimed to verify and quantify a perceived dramatic increase in lung cancer cases at a Romanian center, identify distinct patient phenotypes using unsupervised machine learning, and characterize contributing factors, including demographic shifts, changes in the healthcare system, and geographic patterns.

Methods: A comprehensive retrospective analysis of 4206 lung cancer patients admitted between 2013 and 2024 was conducted, with detailed molecular characterization of 398 patients from 2023 to 2024. Temporal trends were analyzed using statistical methods, while k-means clustering on 761 clinical features identified patient phenotypes. The geographic distribution, smoking patterns, respiratory comorbidities, and demographic factors were systematically characterized across the identified clusters.

Results: We confirmed an 80.5% increase in lung cancer admissions between pre-pandemic (2013–2020) and post-pandemic (2022–2024) periods, exceeding the 51.1% increase in total hospital admissions and aligning with national Romanian trends. Five distinct patient clusters emerged: elderly never-smokers (28.9%) with the highest metastatic rates (44.3%), heavy-smoking males (27.4%), active smokers with comprehensive molecular testing (31.7%), young mixed-gender cohort (7.3%) with balanced demographics, and extreme heavy smokers (4.8%) concentrated in rural areas (52.6%) with severe comorbidity burden. Clusters demonstrated significant differences in age (p < 0.001), smoking intensity (p < 0.001), geographic distribution (p < 0.001), as well as molecular characteristics. COPD prevalence was exceptionally high (44.8–78.9%) across clusters, while COVID-19 history remained low (3.4–8.3%), suggesting a limited direct association between the pandemic and cancer.

Conclusions: This study presents the first comprehensive machine learning-based stratification of lung cancer patients in Romania, confirming genuine epidemiological increases beyond healthcare system artifacts. The identification of five clinically meaningful phenotypes, particularly rural extreme smokers and age-stratified never-smokers, demonstrates the value of unsupervised clustering for regional healthcare planning. These findings establish frameworks for targeted screening programs, personalized treatment approaches, and resource allocation strategies tailored to specific high-risk populations while highlighting the potential of artificial intelligence in identifying actionable clinical patterns for the implementation of precision medicine.
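
The abstract names k-means clustering and the figure list mentions the elbow method, but neither the software nor the preprocessing and initialization are described here. The following is therefore only a minimal sketch of the general technique on made-up two-dimensional points, not the authors' pipeline. The function names, the toy data, and the deterministic farthest-point initialization are all assumptions chosen to keep the example reproducible.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption, not the paper's method):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update.
    Returns (labels, centroids, inertia), where inertia is the
    within-cluster sum of squared distances used by the elbow method."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical toy data: two well-separated groups of four points each.
toy_points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]

if __name__ == "__main__":
    # Elbow method: inspect how inertia falls as k grows.
    for k in (1, 2, 3):
        _, _, inertia = kmeans(toy_points, k)
        print(k, round(inertia, 3))
```

In an elbow plot, the chosen k is typically the point after which further increases in k yield only small reductions in inertia. On real clinical data with hundreds of features, scaling the features before clustering usually matters as well; that step is omitted here.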

Keywords: Romania; epidemiology; lung cancer; machine learning; patient clustering; personalized medicine; temporal trends.

Conflict of interest statement

The authors declare no conflicts of interest. The funders had no role in the study’s design, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure A1
Population numbers for all the administrative units in Timis County. Cut-off rate at 10,000 people. Data corresponding to the 2021 national census.
Figure A2
Heatmap of main features.
Figure A3
Elbow method.
Figure A4
Gap statistics with standard deviation.
Figure A5
Clusters for k = 4.
Figure A6
Clusters for k = 5.
Figure A7
Clusters for k = 5.
Figure A8
Demographic trends analysis—Timis County (2013–2024) (based on data from the National Statistics Institute).
Figure A9
Population pyramid—Timis County (2024) (based on data from the National Statistics Institute).
Figure A10
Daily tobacco smoke exposure (≥1 h): Romania vs. EU27. Data source: Eurostat Health Interview Survey (EHIS).
Figure A11
Smoking of tobacco products: Romania vs. EU27. Total Population = Never Smokers (NSM) + Current Smokers (SM_CUR). Current Smokers (SM_CUR) = Daily Smokers (SM_DAY) + Occasional Smokers (SM_OCC). Data source: Eurostat Health Interview Survey (EHIS).
Figure A12
Total hospital admissions—Victor Babes University Hospital (2013–2024).
Figure 1
Admission rates 2013–2024 for lung cancer patients. The running average of six periods is presented in red.
Figure 2
Age–sex distributions of lung cancer patients: male and female distributions for the two time intervals separated by the COVID-19 pandemic (2013–2020 and 2022–2024).
Figure 3
Pathology assessment cohort age distribution: females–males.
Figure 4
t-SNE visualization of five-cluster solution.
Figure 5
Smoking-intensity distribution by cluster.
Figure 6
Age distribution by cluster. Pairwise comparisons using Mann–Whitney U tests showed significant age differences between most clusters: Cluster 0 vs. 1 (p < 0.01), Cluster 0 vs. 2 (p < 0.001), Cluster 0 vs. 3 (p < 0.001), Cluster 1 vs. 2 (p < 0.001), Cluster 1 vs. 3 (p < 0.001), Cluster 1 vs. 4 (p < 0.01), Cluster 2 vs. 3 (p < 0.001), and Cluster 2 vs. 4 (p < 0.001). Only Cluster 0 vs. 4 (p = 0.235) and Cluster 3 vs. 4 (p = 0.057) were not statistically significant.
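
The Figure 6 caption reports pairwise Mann–Whitney U tests across the five clusters, which gives ten comparisons. The sketch below shows one way such pairwise tests could be run. It is an illustration under stated assumptions, not the authors' code: the ages are invented, the function names are hypothetical, and the p-value uses the two-sided normal approximation with tie correction and no continuity correction or multiple-comparison adjustment (the caption does not say which variant was used).

```python
import math
from itertools import combinations


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.
    Returns (U statistic for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for t in range(i, j):
            ranks[t] = midrank
        run = j - i
        tie_term += run ** 3 - run
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value


def pairwise_tests(groups):
    """Run the test for every unordered pair of group labels."""
    return {
        (a, b): mann_whitney_u(groups[a], groups[b])
        for a, b in combinations(sorted(groups), 2)
    }


# Invented ages per cluster label, for illustration only.
toy_ages = {
    0: [71, 74, 69, 77, 72, 75],
    1: [66, 68, 64, 70, 67, 65],
    2: [60, 62, 59, 63, 61, 58],
    3: [48, 52, 45, 50, 47, 51],
    4: [70, 66, 73, 68, 75, 64],
}

if __name__ == "__main__":
    for pair, (u, p) in pairwise_tests(toy_ages).items():
        print(pair, u, round(p, 4))
```

For small groups an exact test is generally preferable to the normal approximation, and with ten comparisons a correction for multiple testing is often applied.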
