AUCReshaping: improved sensitivity at high-specificity

Sheethal Bhat¹,², Awais Mansoor³, Bogdan Georgescu³, Adarsh B Panambur⁴,⁵, Florin C Ghesu³, Saahil Islam⁴,⁵, Kai Packhäuser⁴, Dalia Rodríguez-Salas⁴, Sasa Grbic³, Andreas Maier⁴

Affiliations

¹ Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany. sheethal.bhat@siemens-healthineers.com.
² Digital Technology and Innovation, Siemens Healthineers, Erlangen, Germany. sheethal.bhat@siemens-healthineers.com.
³ Digital Technology and Innovation, Siemens Medical Solutions, Princeton, NJ, 08540, USA.
⁴ Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg, 91058, Erlangen, Germany.
⁵ Digital Technology and Innovation, Siemens Healthineers, Erlangen, Germany.

PMID: 38036602
PMCID: PMC10689839
DOI: 10.1038/s41598-023-48482-x

Sheethal Bhat et al. Sci Rep. 2023 Nov 30;13(1):21097. doi: 10.1038/s41598-023-48482-x.

Abstract

The evaluation of deep-learning (DL) systems typically relies on the area under the receiver operating characteristic curve (AU-ROC) as a performance metric. However, AU-ROC, in its holistic form, does not sufficiently consider performance within specific ranges of sensitivity and specificity, which are critical for the intended operational context of the system. Consequently, two systems with identical AU-ROC values can exhibit significantly divergent real-world performance. This issue is particularly pronounced in anomaly detection tasks, a common application of DL systems across research domains including medical imaging, industrial automation, manufacturing, cyber security, fraud detection, and drug research. The challenge arises from the heavy class imbalance in training datasets, with the abnormality class often incurring a considerably higher misclassification cost than the normal class. Traditional DL systems address this by adjusting the weighting of the cost function or by optimizing for specific points along the ROC curve. While these approaches yield reasonable results in many cases, they do not actively seek to maximize performance at the desired operating point. In this study, we introduce a novel technique known as AUCReshaping, designed to reshape the ROC curve exclusively within the specified sensitivity and specificity range by optimizing sensitivity at a predetermined specificity level. This reshaping is achieved through an adaptive and iterative boosting mechanism that allows the network to focus on pertinent samples during the learning process. We primarily investigated the impact of AUCReshaping in abnormality detection tasks, specifically in chest X-ray (CXR) analysis, followed by breast mammogram and credit card fraud detection tasks. The results reveal a substantial improvement, ranging from 2 to 40%, in sensitivity at high-specificity levels for binary classification tasks.
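The claim that two systems with identical AU-ROC can behave very differently at a fixed operating point can be illustrated with a small toy example. The sketch below is not taken from the paper: the function names `pairwise_auc` and `sensitivity_at_specificity`, the toy scores, and the convention that a sample is called positive when its score is strictly above the threshold are all assumptions made for illustration.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """AU-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    diff = s[y][:, None] - s[~y][None, :]
    # Ties between a positive and a negative count as half a correct pair.
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def sensitivity_at_specificity(y_true, scores, target_specificity):
    """Sensitivity at the lowest threshold whose specificity meets the target.

    Assumed convention: predict positive when score > threshold.
    Returns (sensitivity, threshold).
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    neg, pos = s[~y], s[y]
    thresholds = np.unique(neg)  # sorted ascending
    spec = np.array([(neg <= t).mean() for t in thresholds])
    # First threshold reaching the target; the largest negative score
    # always gives specificity 1.0, so a match always exists.
    idx = int(np.argmax(spec >= target_specificity - 1e-12))
    t = thresholds[idx]
    return float((pos > t).mean()), float(t)

# Toy data (hypothetical): 4 abnormal and 10 normal samples.
y = np.array([1] * 4 + [0] * 10)
tail = [0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]
# Model A: two high-scoring negatives sit above half of the positives.
scores_a = np.array([0.95, 0.90, 0.70, 0.65] + [0.85, 0.80] + tail)
# Model B: a single negative outscores every positive.
scores_b = np.array([0.90, 0.85, 0.80, 0.75] + [0.95, 0.45] + tail)

for name, s in [("A", scores_a), ("B", scores_b)]:
    sens, thr = sensitivity_at_specificity(y, s, 0.9)
    print(f"model {name}: AUC={pairwise_auc(y, s):.2f}, "
          f"sensitivity@90% specificity={sens:.2f} (threshold {thr:.2f})")
```

In this toy setting both models reach an AU-ROC of 0.90, yet at 90% specificity model A detects half of the positives while model B detects all of them, which is the kind of gap a single AU-ROC number hides.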

Conflict of interest statement

The authors declare no competing interests.

Figures

**Figure 1**
The effective interval delineates the region of practical significance within the ROC curve: the area with a False Positive Rate below 20%. Enhancements beyond this region have negligible bearing on the practical performance of a commercial classification system. The region of interest denotes the specific points on the curve where AUCReshaping is applied.
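One common way to quantify performance restricted to such an interval is a partial AUC, that is, the area under the ROC curve for false positive rates up to a cap. The caption does not say that the authors report this quantity, so the following is only a sketch under that assumption; `roc_points` and `partial_auc` are hypothetical helper names, and the 0.2 default mirrors the 20% bound mentioned in the caption.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (fpr, tpr) arrays of the empirical ROC curve, starting at (0, 0)."""
    y = np.asarray(y_true).astype(int)
    s = np.asarray(scores, dtype=float)
    order = np.argsort(-s, kind="stable")  # highest score first
    y, s = y[order], s[order]
    tps = np.cumsum(y)       # true positives when thresholding after each sample
    fps = np.cumsum(1 - y)   # false positives at the same cut
    # Keep only the last index of each run of tied scores.
    last = np.r_[np.where(np.diff(s))[0], y.size - 1]
    tpr = np.r_[0.0, tps[last] / tps[-1]]
    fpr = np.r_[0.0, fps[last] / fps[-1]]
    return fpr, tpr

def partial_auc(y_true, scores, max_fpr=0.2):
    """Unnormalised area under the ROC curve for FPR in [0, max_fpr]."""
    fpr, tpr = roc_points(y_true, scores)
    if max_fpr < 1.0:
        # Index of the first ROC point strictly beyond the cap.
        stop = int(np.searchsorted(fpr, max_fpr, side="right"))
        # Linearly interpolate the TPR exactly at the cap.
        tpr_end = np.interp(max_fpr, fpr[stop - 1:stop + 1], tpr[stop - 1:stop + 1])
        fpr = np.append(fpr[:stop], max_fpr)
        tpr = np.append(tpr[:stop], tpr_end)
    # Trapezoidal rule over the (possibly truncated) curve.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy checks (hypothetical data).
perfect = partial_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])   # all positives on top
mixed = partial_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])     # one negative ranked high
inverted = partial_auc([0, 0, 1, 1], [0.9, 0.8, 0.2, 0.1])  # all negatives on top
print(perfect, mixed, inverted)
```

Dividing the result by `max_fpr` gives the mean sensitivity over the interval, which is easier to compare across different caps. scikit-learn's `roc_auc_score` also accepts a `max_fpr` argument, but it returns a standardised value rather than the raw area computed here.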

**Figure 2**
The schematic provides an overview of the experimental methodology, highlighting how the AUCReshaping function reshapes the specific portion of the ROC curve that lies in the “region of interest.” The ROC curves depicted in the diagram illustrate the adjustments made in the high-specificity region, with a potential slight decrease in the overall AUC value resulting from modifications in the remaining parts of the curve.

**Figure 3**
Illustration of the process of the AUCReshaping() function during fine-tuning. In each iteration, the function is applied to increase the weights of the misclassified samples at the high-specificity threshold. This process is repeated at multiple thresholds with different weighting values. (a) shows the optimal threshold that separates positive (blue) samples from negative (red) samples. (b) demonstrates a high-specificity threshold that aims to reduce the misclassifications of negative (red) samples. (c) represents the re-weighting of high-specificity misclassified positive samples (blue), which increases the uncertainty in the model’s predictions.
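The re-weighting step described in this caption can be sketched roughly as follows. This is not the authors' implementation: the names `high_specificity_threshold` and `reshape_weights`, the multiplicative update `weight *= 1 + boost`, the specific specificity and boost values, and the use of fixed scores (in real fine-tuning the network would produce new scores after each update) are all assumptions made for illustration.

```python
import numpy as np

def high_specificity_threshold(neg_scores, target_specificity):
    """Smallest negative score t such that at least the target fraction
    of negatives satisfies score <= t (assumed rule: positive if score > t)."""
    s = np.sort(np.asarray(neg_scores, dtype=float))
    k = int(np.ceil(target_specificity * s.size - 1e-9))
    k = min(max(k, 1), s.size)
    return float(s[k - 1])

def reshape_weights(y_true, scores, weights, target_specificity, boost):
    """Increase the weights of positives missed at the high-specificity threshold.

    Hypothetical update rule: weight *= (1 + boost) for every positive sample
    whose score does not exceed the threshold. Returns (new_weights, threshold).
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(scores, dtype=float)
    t = high_specificity_threshold(s[~y], target_specificity)
    missed = y & (s <= t)                    # positives misclassified at t
    new_w = np.array(weights, dtype=float)   # copy, leave the input untouched
    new_w[missed] *= 1.0 + boost
    return new_w, t

# Toy data (hypothetical): 4 positives followed by 10 negatives.
y = np.array([1] * 4 + [0] * 10)
scores = np.array([0.95, 0.82, 0.70, 0.65,
                   0.85, 0.80, 0.40, 0.35, 0.30,
                   0.25, 0.20, 0.15, 0.10, 0.05])
weights = np.ones_like(scores)

# Repeat at several thresholds with different weighting values (assumed schedule).
for spec, boost in [(0.90, 0.5), (0.95, 1.0)]:
    weights, thr = reshape_weights(y, scores, weights, spec, boost)
    print(f"specificity {spec:.2f}: threshold {thr:.2f}, "
          f"positive weights {weights[:4]}")
```

The resulting weights would then typically be passed to a per-sample weighted loss during the next fine-tuning pass, so that positives lying below the high-specificity threshold contribute more to the gradient; negative samples keep their original weight in this sketch.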

**Figure 4**
The original ROC curve of an SSL system (shown in orange) is transformed by AUCReshaping, resulting in the new ROC curve (depicted in red). While the final AUC score may experience a marginal reduction, it’s evident that this transformation leads to higher sensitivity at high-specificity levels.

**Figure 5**
Example Chest X-Rays, sourced from the internal dataset CXR_16k, display pleural effusion abnormalities identified by bounding boxes.

**Figure 6**
Breast mammogram images featuring bounding boxes highlighting calcification and a mass abnormality, as sourced from the VinDr-Mammo dataset.
