Front Artif Intell. 2024 Sep 24:7:1441205.
doi: 10.3389/frai.2024.1441205. eCollection 2024.

Anomaly detection via Gumbel Noise Score Matching

Ahsan Mahmood et al. Front Artif Intell. 2024.

Abstract

We propose Gumbel Noise Score Matching (GNSM), a novel unsupervised method to detect anomalies in categorical data. GNSM accomplishes this by estimating the scores, i.e., the gradients of log likelihoods w.r.t. inputs, of continuously relaxed categorical distributions. We test our method on a suite of tabular anomaly detection datasets. GNSM achieves consistently high performance across all experiments. We further demonstrate the flexibility of GNSM by applying it to image data, where the model is tasked with detecting poor segmentation predictions. Images ranked anomalous by GNSM show clear segmentation failures, with the anomaly scores strongly correlating with segmentation metrics computed against the ground truth. We outline the score matching training objective utilized by GNSM and provide an open-source implementation of our work.
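
To make the phrase "continuously relaxed categorical distributions" concrete, the following is a minimal sketch of the standard Gumbel-softmax relaxation, which maps a one-hot category onto a point of the probability simplex by adding Gumbel noise and applying a tempered softmax. It is an illustration only and is not taken from the authors' implementation: the function name `gumbel_softmax_relax`, the use of the one-hot vector as logits, and the default temperature are all assumptions made here.

```python
import math
import random


def gumbel_softmax_relax(one_hot, temperature=1.0, eps=1e-20, rng=random):
    """Relax a one-hot vector onto the probability simplex with Gumbel noise.

    Hypothetical helper for illustration; not part of the GNSM code base.
    """
    noisy = []
    for x in one_hot:
        u = rng.random()  # uniform sample in [0, 1)
        # Standard Gumbel(0, 1) sample via the inverse-CDF trick;
        # eps guards against log(0).
        g = -math.log(-math.log(u + eps) + eps)
        # Assumption: the one-hot entries are used directly as logits.
        noisy.append((x + g) / temperature)

    # Numerically stable softmax over the perturbed logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    random.seed(0)
    relaxed = gumbel_softmax_relax([0, 0, 1, 0], temperature=0.5)
    print(relaxed)  # four non-negative values that sum to 1
```

Lower temperatures push the output toward a one-hot vector, while higher temperatures spread mass across categories. Because the relaxed sample is continuous, a gradient of a log density with respect to it (the "score" in the abstract) is well defined, which is not the case for the original discrete category.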

Keywords: anomaly; categorical; detection; score matching; tabular; unsupervised.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Correlations with segmentation metrics for Top-K = 50 anomaly scores retrieved from GNSM and Deep SVDD. The arrows next to the metric denote the expected correlation direction. The magnitude of the correlations reflects how well the anomaly scores capture segmentation errors.
Figure 2
Random samples from Top-K = 50 GNSM rankings. Note how the predicted segmentations are either partial/missing or include incorrect classes. The columns (repeated twice) show input image, ground truth segmentations, and model predictions respectively. Different classes are denoted by color. The VOC data includes images obtained from Flickr: https://www.flickr.com/.
Figure 3
Random samples from Top-K = 50 DSVDD rankings. Note how only a few predictions may be considered anomalous. The VOC data includes images obtained from Flickr: https://www.flickr.com/.
