Diffusion Models in Vision: A Survey
- PMID: 37030794
- DOI: 10.1109/TPAMI.2023.3261988
Abstract
Denoising diffusion models are a recently emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model based on two stages: a forward diffusion stage and a reverse diffusion stage. In the forward diffusion stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to reverse the diffusion process step by step. Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burden, i.e., slow sampling due to the high number of steps involved. In this survey, we provide a comprehensive review of articles on denoising diffusion models applied in vision, comprising both theoretical and practical contributions in the field. First, we identify and present three generic diffusion modeling frameworks, which are based on denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. We further discuss the relations between diffusion models and other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models and normalizing flows. Then, we introduce a multi-perspective categorization of diffusion models applied in computer vision. Finally, we illustrate the current limitations of diffusion models and envision some interesting directions for future research.
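The forward diffusion stage described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used update x_t = sqrt(1 - beta_t) * x_(t-1) + sqrt(beta_t) * eps, with eps drawn from a standard Gaussian, and a linear noise schedule. The function names, the schedule endpoints, and the toy three-value input are illustrative assumptions, not details taken from the survey. The reverse stage is omitted because it requires a trained denoising network.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances; the endpoints are illustrative assumptions."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffusion(x0, betas, rng):
    """Perturb x0 step by step with Gaussian noise and return every intermediate state."""
    trajectory = [list(x0)]
    x = list(x0)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # shrinks the previous state
        noise_scale = math.sqrt(beta)  # scales the fresh Gaussian noise
        x = [keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in x]
        trajectory.append(x)
    return trajectory


def signal_fraction(betas):
    """Product of (1 - beta_t): the share of the original signal variance that remains."""
    frac = 1.0
    for beta in betas:
        frac *= 1.0 - beta
    return frac


# Toy "image" with three pixel values (hypothetical data).
betas = linear_beta_schedule(1000)
rng = random.Random(0)
trajectory = forward_diffusion([1.0, -0.5, 0.25], betas, rng)
print(len(trajectory), signal_fraction(betas))
```

Under this schedule almost none of the original signal variance survives after 1000 steps, so the final state is close to pure Gaussian noise. That is the starting point from which a learned reverse process would have to work backwards, one step at a time, which is also why sampling is slow.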