Nat Mach Intell. 2025;7(11):1791-1803.
doi: 10.1038/s42256-025-01122-7. Epub 2025 Nov 19.

Deep generative classification of blood cell morphology

Simon Deltadahl et al. Nat Mach Intell. 2025.

Abstract

Blood cell morphology assessment via light microscopy constitutes a cornerstone of haematological diagnostics, providing crucial insights into diverse pathological conditions. This complex task demands expert interpretation owing to subtle morphological variations, biological heterogeneity and technical imaging factors that obstruct automated approaches. Conventional machine learning methods using discriminative models struggle with domain shifts, intraclass variability and rare morphological variants, constraining their clinical utility. We introduce CytoDiffusion, a diffusion-based generative classifier that faithfully models the distribution of blood cell morphology, combining accurate classification with robust anomaly detection, resistance to distributional shifts, interpretability, data efficiency and uncertainty quantification that surpasses clinical experts. Our approach outperforms state-of-the-art discriminative models in anomaly detection (area under the curve, 0.990 versus 0.916), resistance to domain shifts (0.854 versus 0.738 accuracy) and performance in low-data regimes (0.962 versus 0.924 balanced accuracy). In particular, CytoDiffusion generates synthetic blood cell images that expert haematologists cannot distinguish from real ones (accuracy, 0.523; 95% confidence interval: [0.505, 0.542]), demonstrating good command of the underlying distribution. Furthermore, we enhance model explainability through directly interpretable counterfactual heat maps. Our comprehensive evaluation framework establishes a multidimensional benchmark for medical image analysis in haematology, ultimately enabling improved diagnostic accuracy in clinical settings.

Keywords: Biomedical engineering; Computational models.

Conflict of interest statement

Competing interests: P.N. is a co-founder of Hologen, a healthcare generative AI company with a focus on late-stage interventional agent development. M.R. is also a consultant and S.D. is an employee of Hologen. M.R. is co-founder of Octiocor, a company specializing in AI-based analysis of intracoronary imaging. The other authors declare no competing interests.

Figures

Fig. 1. Overview of the diffusion-based classification model.
Representation of the diffusion-based classification process. An input image x_0 is first encoded into a latent space using an encoder E. Gaussian noise ϵ ∼ N(0, I) is then added to create a noisy latent representation z_t. This noisy representation is fed through a diffusion model for each possible class condition c. The model predicts the noise ϵ_θ for each condition. The classification decision is made by selecting the class that minimizes the error between the predicted noise ϵ_θ and the true noise ϵ.
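The decision rule in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder is skipped (the input is treated as an already encoded latent), the noising formula and the range of noise levels are assumptions, and `toy_predict_noise` with its `PROTOTYPES` is a hypothetical stand-in for a trained class-conditional diffusion model.

```python
import numpy as np

def classify_by_noise_error(z0, predict_noise, classes, n_trials=32, seed=0):
    """Return the class whose conditional noise prediction has the lowest mean squared error."""
    rng = np.random.default_rng(seed)
    totals = {c: 0.0 for c in classes}
    for _ in range(n_trials):
        t = rng.uniform(0.1, 0.9)                       # assumed noise level in (0, 1)
        eps = rng.standard_normal(z0.shape)             # true noise, eps ~ N(0, I)
        zt = np.sqrt(1.0 - t) * z0 + np.sqrt(t) * eps   # noisy latent (assumed schedule)
        for c in classes:
            eps_hat = predict_noise(zt, t, c)           # predicted noise under condition c
            totals[c] += float(np.mean((eps_hat - eps) ** 2))
    mean_errors = {c: total / n_trials for c, total in totals.items()}
    return min(mean_errors, key=mean_errors.get), mean_errors

# Hypothetical stand-in for a trained model: each class is a single prototype latent,
# and the "model" predicts the noise that would explain zt if z0 were that prototype.
PROTOTYPES = {"neutrophil": np.zeros(4), "eosinophil": np.ones(4)}

def toy_predict_noise(zt, t, c):
    return (zt - np.sqrt(1.0 - t) * PROTOTYPES[c]) / np.sqrt(t)

z0 = np.array([0.9, 1.1, 1.0, 0.8])  # latent close to the eosinophil prototype
label, mean_errors = classify_by_noise_error(z0, toy_predict_noise, list(PROTOTYPES))
print(label, mean_errors)            # label is "eosinophil"
```

Averaging the error over several noise draws, and reusing the same draw for every class within a trial, keeps the comparison between classes from being dominated by a single random sample.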
Fig. 2. Bayesian psychometric analysis of model and expert performance.
Performance was evaluated on our custom CytoData test set (n = 1,000 images). a–d, Psychometric functions showing accuracy as a function of a discriminability index. In these panels, data points (black circles) represent the mean accuracy for images binned by confidence, and their size is proportional to the number of trials in each bin. The solid black line is the maximum-likelihood psychometric function fit to the data. The horizontal black error bar on the curve indicates the 95% credibility interval for the function’s threshold, estimated at 80% accuracy (unscaled by lapse and guess rates). The plots in the inset show the joint posterior probability density for the psychometric function’s parameters, width and threshold. a, Psychometric function for CytoDiffusion, with its own confidence score as the discriminability index. b, Psychometric function for a representative human expert (Expert 5), using CytoDiffusion’s confidence score as the discriminability index. c, Psychometric function for the same expert (Expert 5), using expert confidence as the discriminability index. d, Psychometric function for the ViT-B/16 model, with its own confidence score as the discriminability index. e,f, Comparison of psychometric function parameters (width and threshold) across the six human experts. The coloured circles represent the posterior mean of the parameter estimates, and the error bars represent the 95% credibility intervals. Parameters were estimated using either CytoDiffusion confidence (e) or mean expert confidence (f) as the index of signal strength.
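As a rough illustration of how threshold, width, guess rate and lapse rate shape such a curve, the sketch below uses a logistic sigmoid. The caption does not state the functional form, so the logistic shape, the definition of width as the span between the 5% and 95% points of the unscaled sigmoid, and all parameter names are assumptions.

```python
import math

def psychometric(x, threshold, width, guess=0.0, lapse=0.0, p_thresh=0.8, alpha=0.05):
    """Accuracy predicted at discriminability x under an assumed logistic form.

    threshold: x at which the unscaled sigmoid equals p_thresh (0.8, as in the caption)
    width:     distance between the alpha and 1 - alpha points of the unscaled sigmoid
    guess:     lower asymptote; lapse: gap between the upper asymptote and 1
    """
    slope = 2.0 * math.log(1.0 / alpha - 1.0) / width
    offset = math.log(p_thresh / (1.0 - p_thresh))   # shifts the curve so S(threshold) = p_thresh
    unscaled = 1.0 / (1.0 + math.exp(-(slope * (x - threshold) + offset)))
    return guess + (1.0 - guess - lapse) * unscaled

at_threshold = psychometric(0.5, threshold=0.5, width=0.2)
scaled = psychometric(0.5, threshold=0.5, width=0.2, guess=0.1, lapse=0.02)
print(at_threshold, scaled)
```

With zero guess and lapse rates the curve passes through 0.8 at the threshold; non-zero rates compress the curve between `guess` and `1 - lapse` without moving the threshold itself, which is what "unscaled by lapse and guess rates" refers to.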
Fig. 3. Anomaly detection and low-data performance comparison.
a, Kernel density estimate plots comparing the anomaly detection performance of ViT-B/16 (top row) with CytoDiffusion (bottom row) for erythroblasts (left and right columns) and blasts (middle column). The horizontal axis represents the normality score, normalized to [0, 1]. The sensitivity (Sens) and specificity (Spec) values show each model’s performance in detecting anomalous cells and correctly classifying normal samples. b, Model performance comparison under low-data conditions across four cytology datasets. The data points represent the mean balanced accuracy, and the shaded areas represent the standard deviation. Statistics were calculated from five independent training sessions. AUC, area under the curve.
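The sensitivity and specificity values in panel a can be read as the result of thresholding the normality score. The sketch below shows that arithmetic under stated assumptions: low scores are treated as anomalous, and the threshold of 0.5 and the example scores are invented for illustration rather than taken from the paper.

```python
import numpy as np

def sensitivity_specificity(normality, is_anomaly, threshold):
    """Flag samples with normality below threshold as anomalous, then score the flags."""
    normality = np.asarray(normality, dtype=float)
    is_anomaly = np.asarray(is_anomaly, dtype=bool)
    flagged = normality < threshold
    tp = np.sum(flagged & is_anomaly)      # anomalous cells correctly flagged
    fn = np.sum(~flagged & is_anomaly)     # anomalous cells missed
    tn = np.sum(~flagged & ~is_anomaly)    # normal cells correctly passed
    fp = np.sum(flagged & ~is_anomaly)     # normal cells wrongly flagged
    return float(tp / (tp + fn)), float(tn / (tn + fp))

# Invented scores: three anomalous cells followed by three normal cells.
scores = [0.05, 0.20, 0.60, 0.70, 0.85, 0.95]
labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
print(sens, spec)  # sensitivity 2/3, specificity 1.0
```

Moving the threshold trades one quantity against the other, which is why the area under the curve is reported alongside a single operating point.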
Fig. 4. Counterfactual visualizations for model explainability.
a, An example of generating a counterfactual explanation. Left: original image of an eosinophil. Centre right: counterfactual heat map (H_neutrophil), which highlights areas that would need to change for the model to classify the image as a neutrophil. Far right: an overlay of the thresholded heat map on the original image, localizing the most critical features. b, Matrix of counterfactual heat maps for various cell-type transitions. The diagonal displays original images of each cell type, which serve as the source image for their respective columns. Each off-diagonal element in the same column represents a counterfactual heat map (H_c) showing the transition from the diagonal element (source) to the cell type of that row (target). Areas in the heat map with colours that deviate most from the background indicate regions in which there are large errors in the latent space between the two classes.
Extended Data Fig. 1. Classification confusion matrices.
Confusion matrices showing CytoDiffusion’s classification performance across datasets. (a) CytoData comparison: the left matrix shows CytoDiffusion results and the right matrix shows average human expert performance (where each expert was evaluated against a consensus ground truth derived from all other experts). (b–d) CytoDiffusion’s performance on Bodzas (b), PBC (c) and Raabin-WBC Test-A (d).
