A multimodal neural network with gradient blending improves predictions of survival and metastasis in sarcoma

Anthony Bozzo^{1,2}, Alex Hollingsworth³, Subrata Chatterjee³, Aditya Apte⁴, Jiawen Deng⁵, Simon Sun⁶, William Tap⁷, Ahmed Aoude⁸, Sahir Bhatnagar⁹, John H Healey¹⁰

Affiliations

¹ Orthopaedic Service of the Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA. anthony.bozzo.med@ssss.gouv.qc.ca.
² Division of Orthopaedic Surgery, McGill University, Montreal, QC, Canada. anthony.bozzo.med@ssss.gouv.qc.ca.
³ AI/ML and NextGen Analytics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁴ Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁵ Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada.
⁶ Musculoskeletal Radiology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁷ Medical Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
⁸ Division of Orthopaedic Surgery, McGill University, Montreal, QC, Canada.
⁹ Department of Epidemiology and Biostatistics, McGill University, Montreal, QC, Canada.
¹⁰ Orthopaedic Service of the Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY, USA.

PMID: 39237726
PMCID: PMC11377835
DOI: 10.1038/s41698-024-00695-7

Anthony Bozzo et al. NPJ Precis Oncol. 2024 Sep 5;8(1):188. doi: 10.1038/s41698-024-00695-7.

Abstract

The objective of this study was to develop a multimodal neural network (MMNN) model that analyzes the clinical variables and MRI images of a patient with soft tissue sarcoma (STS) to predict overall survival and the risk of distant metastases. We compared the performance of this MMNN to models based on clinical variables alone, radiomics models, and a unimodal neural network. We included patients aged 18 or older with biopsy-proven STS who underwent primary resection between January 1, 2005, and December 31, 2020, and who had complete outcome data and a pre-treatment MRI with both a T1 post-contrast sequence and a T2 fat-sat sequence available. A total of 9380 MRI slices containing sarcomas from 287 patients were available. Our MMNN accepts the entire 3D sarcoma volume from the T1 and T2 MRIs together with the clinical variables. Gradient blending allows the clinical and image subnetworks to converge optimally without overfitting. Heat maps were generated to visualize the salient image features. Our MMNN outperformed all other models in predicting overall survival and the risk of distant metastases. The C-index of our MMNN was 0.77 for overall survival and 0.70 for the risk of distant metastases. The heat maps show the areas of the sarcomas deemed most salient for the predictions. Our multimodal neural network with gradient blending improves predictions of overall survival and risk of distant metastases in patients with soft tissue sarcoma. Future work enabling accurate subtype-specific predictions will likely use a similar end-to-end multimodal neural network architecture and will require prospective curation of high-quality data, the inclusion of genomic data, and the involvement of multiple centers through federated learning.
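The C-index values reported above measure how often a model ranks pairs of patients in the correct order of risk. The following is a minimal sketch of Harrell's concordance index for right-censored data, not the authors' evaluation code; the function name `concordance_index`, the simplification of skipping pairs with tied follow-up times, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored outcomes (illustrative sketch).

    times  : follow-up time per patient
    events : 1 if the event (death or metastasis) was observed, 0 if censored
    risks  : predicted risk score, higher meaning the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event.
        # Pairs with tied times are skipped here for simplicity (assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk, earlier event: correct ordering
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third is censored at 6 years.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.3, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported 0.77 and 0.70.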

Conflict of interest statement

The authors declare no competing interests.

Figures

**Fig. 1. Gradient Blending.**
This diagram shows the relative weighting of the modality losses that contribute to the overall loss over the course of training. The relative weightings are initially uniform and are adjusted every 5 epochs according to their relative overfitting-to-generalization ratios.
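The caption does not give the re-weighting formula. As a rough sketch only, the code below assumes the rule from the original gradient blending work, where each loss is weighted in proportion to G / O², with G the drop in validation loss over a window and O the growth of the train/validation gap over the same window. The function names, the `eps` floor, the head names, and the loss values are hypothetical.

```python
def blending_weights(train_before, val_before, train_after, val_after, eps=1e-3):
    """Normalised loss weights per head, assuming a G / O^2 weighting rule.

    Each argument maps a head name to its loss at the start or end of a window
    (the caption describes windows of 5 epochs).
    """
    raw = {}
    for head in train_before:
        # Generalisation gain G: how much the validation loss fell.
        gain = val_before[head] - val_after[head]
        # Overfitting growth O: how much the train/validation gap widened.
        gap_before = val_before[head] - train_before[head]
        gap_after = val_after[head] - train_after[head]
        overfit = gap_after - gap_before
        # eps is an assumed floor that avoids division by zero.
        raw[head] = max(gain, 0.0) / max(overfit, eps) ** 2
    total = sum(raw.values())
    if total == 0:
        # No head improved: fall back to uniform weights, as at the start of training.
        return {head: 1.0 / len(raw) for head in raw}
    return {head: value / total for head, value in raw.items()}


def blended_loss(losses, weights):
    """Overall loss as the weighted sum of the per-head losses."""
    return sum(weights[head] * losses[head] for head in losses)


# Hypothetical losses at the start and end of one window.
weights = blending_weights(
    train_before={"clinical": 1.0, "image": 1.0, "fused": 1.0},
    val_before={"clinical": 1.1, "image": 1.2, "fused": 1.1},
    train_after={"clinical": 0.8, "image": 0.5, "fused": 0.6},
    val_after={"clinical": 1.0, "image": 1.1, "fused": 0.9},
)
print(weights)
```

In this toy window the image head overfits fastest, so it receives the smallest weight, which is the behaviour the figure describes.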

**Fig. 2**
Smoothed ROC curves and calibration plots for our MMNN in predicting overall survival and the risk of distant metastases.

**Fig. 3. Heat maps using the GradCAM method on our test set.**
Representative T2 fat-sat axial slices from four test-set patients (patients who were never encountered during model training) are displayed. The corresponding heat map for each patient was taken from the image subnetwork of the MMNN model, and the merged image is provided. In all cases, the model deemed pixels within the tumor volume the most relevant. a Patient predicted to have a very low risk of death and metastases; survived with no development of metastases over a 10-year follow-up period (low predicted risk, model correct). b Patient predicted to have a high risk of death and metastases; developed metastases 1.2 years after surgery and died shortly afterwards (high predicted risk, model correct). c Patient predicted to have a high risk of metastases; did not develop metastases in 3.8 years of follow-up (high predicted risk, model incorrect). d Among the test-set patients who developed metastases, this patient had the lowest predicted risk. The model was correct in all other predictions indicating a lower risk of distant metastases (low-intermediate predicted risk, model incorrect since the patient developed metastases two years after surgery).
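For readers unfamiliar with how such heat maps are produced, the core Grad-CAM step can be sketched as below. This is a simplified 2D, pure-Python illustration and not the authors' pipeline, which operates on 3D volumes; the function name `grad_cam` and the toy activations and gradients are assumptions. In practice both inputs would come from a convolutional layer of the image subnetwork.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map for one slice (illustrative sketch).

    activations : list of K feature maps, each an H x W nested list
    gradients   : gradients of the predicted risk with respect to those maps
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight: spatial mean of the gradient for that channel.
    alphas = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for alpha, fmap in zip(alphas, activations):
        for r in range(height):
            for c in range(width):
                cam[r][c] += alpha * fmap[r][c]
    # ReLU keeps only regions that increase the predicted risk.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    # Scale to [0, 1] so the map can be overlaid on the MRI slice.
    return [[value / peak for value in row] for row in cam] if peak > 0 else cam


# Toy example (hypothetical): two 2 x 2 feature maps.
acts = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
grads = [[[1, 1], [1, 1]], [[-1, -1], [-1, -1]]]
print(grad_cam(acts, grads))  # [[0.5, 0.0], [0.0, 1.0]]
```

The resulting low-resolution map is normally upsampled to the image size before being merged with the slice, as in the panels described above.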

**Fig. 4. Architecture of our multimodal neural network model.**
A deep neural network (A) interprets the 11 clinical variables, and a 2-channel convolutional neural network (DenseNet-121) analyzes the MRI input (B). Image features from the T1 and T2 MRI sequences are extracted by the convolutional neural network and concatenated with the features extracted from the clinical variables. The combined feature set is used to predict the risk of distant metastases and overall survival. Gradient blending is used to moderate the weight updates between modalities. Dashed lines indicate connections that are present only during training to facilitate gradient blending.

**1A: Clinical Subnetwork Model**. A deep neural network is implemented to extract features from a vector of clinical variables corresponding to the patient. Numbers under the linear layers correspond to the number of output features for those linear layers. The clinical model extracts 12 features that will be used for the multimodal prediction.

**1B: Image Subnetwork Model**. T1 post-contrast and T2 fat-sat MRI sequences are concatenated along the channel dimension before being fed through a 2-channel DenseNet-121 model. Twelve features are extracted for use in the multimodal prediction. The numbers in each dense block correspond to the number of dense layers within that dense block. The architecture presented is representative of a 3-dimensional, 2-channel DenseNet-121 with 12 output neurons. Because the model is being used as a feature extractor rather than a classifier, the size of the output layer is a tunable parameter and is not limited to the number of predictions made by the multimodal output head.

**1C: Dense Block**. Dense blocks consist of a series of dense layers. Within each dense block, the resolution of the feature map is constant. This allows all dense layers within a dense block to contain feed-forward bypass connections to every other dense layer in that dense block. These features are concatenated at the input of each dense layer. **Transition layers** are placed between dense blocks. Transition layers use 1x1x1 convolutions to act as channel pooling layers, reducing the number of feature maps by a factor of 2. In addition, stride-2 average pooling layers are used, which reduce the resolution in all spatial dimensions by a factor of 2.
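The effect of the dense blocks and transition layers on tensor shape can be traced with simple arithmetic. The sketch below assumes the standard DenseNet-121 configuration (dense blocks of 6, 12, 24, and 16 layers, growth rate 32, 64 channels entering the first block), which the caption does not state explicitly; the function name `densenet_shapes` and the example input size are hypothetical.

```python
def densenet_shapes(spatial, block_layers=(6, 12, 24, 16), growth_rate=32, init_channels=64):
    """Trace (stage, channels, spatial size) through dense blocks and transitions.

    spatial is the (depth, height, width) of the feature map entering the first
    dense block. Defaults are the standard DenseNet-121 settings (assumed here).
    """
    channels = init_channels
    trace = []
    for index, layers in enumerate(block_layers):
        # Each dense layer concatenates growth_rate new feature maps;
        # the spatial resolution stays constant inside the block.
        channels += layers * growth_rate
        trace.append((f"dense block {index + 1}", channels, spatial))
        if index < len(block_layers) - 1:
            # Transition layer: the 1x1x1 convolution halves the channel count,
            # and stride-2 average pooling halves every spatial dimension.
            channels //= 2
            spatial = tuple(size // 2 for size in spatial)
            trace.append((f"transition {index + 1}", channels, spatial))
    return trace


# Hypothetical feature-map size entering the first dense block.
for stage, channels, size in densenet_shapes((16, 32, 32)):
    print(f"{stage:15s} channels={channels:4d} size={size}")
```

Under these assumptions the last dense block emits 1024 feature maps, which a final layer would reduce to the 12 image features that are concatenated with the 12 clinical features.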

Grants and funding

P30 CA008748/CA/NCI NIH HHS/United States
