Anomaly detection via Gumbel Noise Score Matching
- PMID: 39430619
- PMCID: PMC11488619
- DOI: 10.3389/frai.2024.1441205
Abstract
We propose Gumbel Noise Score Matching (GNSM), a novel unsupervised method for detecting anomalies in categorical data. GNSM does this by estimating the scores, i.e., the gradients of the log-likelihoods with respect to the inputs, of continuously relaxed categorical distributions. We test our method on a suite of tabular anomaly detection datasets, where GNSM achieves consistently high performance across all experiments. We further demonstrate the flexibility of GNSM by applying it to image data, where the model is tasked with detecting poor segmentation predictions. Images ranked as anomalous by GNSM show clear segmentation failures, and the anomaly scores correlate strongly with segmentation metrics computed against the ground truth. We outline the score matching training objective used by GNSM and provide an open-source implementation of our work.
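To make the idea of a continuously relaxed categorical distribution concrete, the following is a minimal sketch of the widely used Gumbel-softmax relaxation, which perturbs one-hot rows with Gumbel noise and maps them onto the probability simplex. It is an illustration only: the abstract does not specify the exact relaxation used by GNSM, and the function name `gumbel_softmax_relax` and the parameters `temperature` and `eps` are hypothetical choices made for this example.

```python
import numpy as np


def gumbel_softmax_relax(one_hot, temperature=1.0, eps=1e-6, rng=None):
    """Relax one-hot rows into points on the simplex using Gumbel noise.

    Illustrative sketch only; not taken from the GNSM implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    one_hot = np.asarray(one_hot, dtype=float)
    k = one_hot.shape[-1]

    # Smooth the one-hot rows so the logarithm stays finite.
    # Each row still sums to 1 after smoothing (eps is an assumed value).
    probs = one_hot * (1.0 - eps * k) + eps
    logits = np.log(probs)

    # Add independent Gumbel(0, 1) noise to every logit.
    noise = rng.gumbel(loc=0.0, scale=1.0, size=logits.shape)

    # Temperature-scaled softmax, shifted by the row maximum for stability.
    z = (logits + noise) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=-1, keepdims=True)


# Usage: four samples of a categorical feature with three categories.
rng = np.random.default_rng(0)
x = np.eye(3)[[0, 2, 1, 1]]
relaxed = gumbel_softmax_relax(x, temperature=0.5, rng=rng)
print(relaxed.shape)  # (4, 3)
```

Each relaxed row is a continuous vector that sums to one, so a gradient with respect to the input (the score) is well defined on it. Lower temperatures push the rows closer to one-hot vectors, while higher temperatures spread mass across categories.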
Keywords: anomaly; categorical; detection; score matching; tabular; unsupervised.
Copyright © 2024 Mahmood, Oliva and Styner.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
