Front Radiol. 2025 Oct 24:5:1672364.
doi: 10.3389/fradi.2025.1672364. eCollection 2025.

Integrating clinical indications and patient demographics for multilabel abnormality classification and automated report generation in 3D chest CT scans

Theo Di Piazza et al. Front Radiol. 2025.

Abstract

The growing number of computed tomography (CT) examinations and the time-intensive nature of manual analysis call for efficient automated methods to help radiologists manage their increasing workload. While deep learning approaches primarily classify abnormalities from three-dimensional (3D) CT images alone, radiologists also rely on clinical indications and patient demographics, such as age and sex, for diagnosis. This study aims to enhance multilabel abnormality classification and automated report generation by integrating imaging and non-imaging data. We propose a multimodal deep learning model that combines 3D chest CT scans, clinical indications, patient age, and sex to improve diagnostic accuracy. Our method extracts visual features from 3D volumes using a visual encoder, textual features from clinical indications via a pretrained language model, and demographic features through a lightweight feedforward neural network. These features are projected into a shared representation space, concatenated, and processed by a projection head to predict abnormalities. For the multilabel classification task, incorporating clinical indications and patient demographics into an existing visual encoder, CT-Net, raises the F1 score to 51.58, an improvement of Δ = +6.13% over CT-Net alone. For the automated report generation task, we extend two existing methods, CT2Rep and CT-AGRG, by integrating clinical indications and demographic data. This integration improves Clinical Efficacy metrics, with F1 score gains of Δ = +14.78% for the CT2Rep extension and Δ = +6.69% for the CT-AGRG extension. Our findings suggest that incorporating patient demographics and clinical information into deep learning frameworks can significantly improve automated CT scan analysis. This approach has the potential to enhance radiological workflows and facilitate more comprehensive and accurate abnormality detection in clinical practice.
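The fusion step described in the abstract (project each modality into a shared space, concatenate, then classify) can be sketched at the shape level as follows. This is a minimal illustration, not the authors' implementation: the random vectors stand in for the outputs of the visual encoder and the pretrained language model, and all dimensions, the two-layer MLP design, the age scaling, and the sigmoid output are assumptions. Only the count of 18 abnormality labels is taken from the record itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes; only N_LABELS = 18 comes from the source.
D_VIS, D_TXT, D_DEMO, D_SHARED, N_LABELS = 512, 768, 2, 128, 18


def init_mlp(d_in, d_hidden, d_out):
    """Randomly initialised weights for a two-layer feedforward network."""
    return (rng.normal(0.0, 0.02, (d_in, d_hidden)), np.zeros(d_hidden),
            rng.normal(0.0, 0.02, (d_hidden, d_out)), np.zeros(d_out))


def mlp(x, w1, b1, w2, b2):
    """Linear -> ReLU -> Linear."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2


# Hypothetical parameter names for the three projections and the head.
params = {
    "F_V": init_mlp(D_VIS, 256, D_SHARED),            # visual projection
    "F_T": init_mlp(D_TXT, 256, D_SHARED),            # text projection
    "F_AS": init_mlp(D_DEMO, 16, D_SHARED),           # age/sex projection
    "head": init_mlp(3 * D_SHARED, 256, N_LABELS),    # classification head
}


def fuse_and_classify(vis_feat, cls_feat, age, sex, params):
    """Return one abnormality probability per label for each sample."""
    # Age scaled to roughly [0, 1] and sex encoded as 0/1 (both assumptions).
    demo = np.stack([age / 100.0, sex], axis=-1)
    z_v = mlp(vis_feat, *params["F_V"])
    z_t = mlp(cls_feat, *params["F_T"])
    z_d = mlp(demo, *params["F_AS"])
    fused = np.concatenate([z_v, z_t, z_d], axis=-1)
    logits = mlp(fused, *params["head"])
    # Independent sigmoid per label, as is usual for multilabel outputs.
    return 1.0 / (1.0 + np.exp(-logits))


# Stand-in inputs for a batch of 4 studies.
batch = 4
vis_feat = rng.normal(size=(batch, D_VIS))   # placeholder visual embedding
cls_feat = rng.normal(size=(batch, D_TXT))   # placeholder text embedding
age = np.array([34.0, 61.0, 47.0, 78.0])
sex = np.array([0.0, 1.0, 1.0, 0.0])

probs = fuse_and_classify(vis_feat, cls_feat, age, sex, params)
print(probs.shape)
```

In a trained model the projections and head would be learned jointly; here the weights are random, so the probabilities carry no meaning beyond demonstrating how the three feature vectors are combined.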

Keywords: 3D CT scans; abnormality classification; clinical indications; multimodal; patient demographics; report generation.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Overview of the method. The input volume is processed by a visual extractor Φ_V (either CT-Net [9] or CT-ViT [19]) and F_V, which generates a visual embedding. The clinical indication is processed by RadBERT [20], yielding a token-level embedding. The [CLS] token is fed into a lightweight MLP F_T to project textual and visual features into a common latent space. Patient age and sex are processed by another lightweight MLP F_{A,S}. These vectors are concatenated, and the resulting vector is passed to a classification head Ψ, which predicts an abnormality score for each label.
Figure 2
Overview of the multimodal dataset. (a) Bar plot of label frequency. (b) Bar plot of sex frequency. (c) Distribution of age in years. (d) Distribution plot of reports’ lengths based on token count using the RadBERT tokenizer.
Figure 3
Integration of clinical indications and patient demographics for the CT-AGRG method. Features derived from the 3D CT volume, clinical indications, patient age, and sex are aggregated to form vector e. This vector is fed into 18 classification heads (one per abnormality). If a classification head predicts an abnormality, the corresponding vector representation is passed to a pretrained GPT-2 model, which generates a textual description of the detected abnormality.
Figure 4
Variation in the F1 score across anomalies, highlighting the impact of integrating patient demographics and clinical indications into the multilabel abnormality classification task.
Figure 5
Comparison of (a) F1 score and (b) AUROC across clinical indications, patient demographics, 3D CT volume, and multimodal fusion for multilabel abnormality classification. The highest performance is achieved when fusing all modalities, highlighting the benefit of multimodal integration.
Figure 6
F1 score improvements across four abnormality groups when demographic and clinical indication information is incorporated into report generation from 3D CT volumes.
Figure 7
Comparison of ground-truth labels with the report generated by the CT-AGRG model with and without the integration of clinical indications and patient demographics. For each of the two CT-RATE test set examples, we present an axial slice, clinical indications, demographic information, ground truth, and the generated report. Clinical relevance is highlighted using color-coded annotations.
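The abnormality-gated generation described in the Figure 3 caption can be illustrated with a short sketch: each of the 18 heads yields a score, and only the heads that fire pass their vector to a text generator. This is an assumption-laden illustration rather than the CT-AGRG code. The function names, the 0.5 threshold, and the template-based `stub_describe` (used here in place of the pretrained GPT-2 model) are all hypothetical.

```python
from typing import Callable, List, Sequence

N_LABELS = 18  # one classification head per abnormality, per the caption


def generate_report(
    scores: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    describe: Callable[[int, Sequence[float]], str],
    threshold: float = 0.5,  # assumed decision threshold
) -> str:
    """Describe only the abnormalities whose head score reaches the threshold."""
    sentences: List[str] = []
    for label_idx, (score, embedding) in enumerate(zip(scores, embeddings)):
        if score >= threshold:
            sentences.append(describe(label_idx, embedding))
    return " ".join(sentences)


def stub_describe(label_idx: int, embedding: Sequence[float]) -> str:
    """Placeholder for the language model; returns a fixed template."""
    return f"[label {label_idx}] abnormality described from a {len(embedding)}-d vector."


# Toy inputs: two of the 18 heads exceed the threshold.
scores = [0.1] * N_LABELS
scores[2] = 0.9
scores[7] = 0.6
embeddings = [[0.0] * 4 for _ in range(N_LABELS)]

report = generate_report(scores, embeddings, stub_describe)
print(report)
```

Gating the generator on per-abnormality predictions means the report contains one description per detected finding, and a study with no predicted abnormality yields an empty string in this sketch.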

References

    1. Singh SP, Wang L, Gupta S, Goli H, Padmanabhan P, Gulyás B. 3D deep learning on medical images: a review. Sensors (Basel). (2020) 20:5097. 10.3390/s20185097
    2. Jany B, Welte T. Pleural effusion in adults—etiology, diagnosis, and treatment. Deutsches Ärzteblatt Int. (2019) 116:377–86. 10.3238/arztebl.2019.0377
    3. Dela Cruz CS, Tanoue LT, Matthay RA. Lung cancer: epidemiology, etiology, and prevention. Clin Chest Med. (2011) 32:605–44. 10.1016/j.ccm.2011.09.001
    4. Amin H, Siddiqui WJ. Cardiomegaly. In: StatPearls. Treasure Island, FL: StatPearls Publishing (2024). Available online at: https://www.ncbi.nlm.nih.gov/books/NBK542296/ (Accessed June 25, 2024).
    5. Goergen SK, Pool FJ, Turner TJ, Grimm JE, Appleyard MN, Crock C, et al. Evidence-based guideline for the written radiology report: methods, recommendations and implementation challenges. J Med Imaging Radiat Oncol. (2013) 57:1–7. 10.1111/jmiro.2013.57.issue-1
