Cytometry A. 2008 Aug;73(8):693-701.
doi: 10.1002/cyto.a.20583.

Statistical mixture modeling for cell subtype identification in flow cytometry

Cliburn Chan et al. Cytometry A. 2008 Aug.

Abstract

Statistical mixture modeling provides an opportunity for automated identification and resolution of cell subtypes in flow cytometric data. The configuration of cells as represented by multiple markers simultaneously can be modeled arbitrarily well as a mixture of Gaussian distributions in the dimension of the number of markers. Cellular subtypes may be related to one or multiple components of such mixtures, and fitted mixture models can be evaluated in the full set of markers as an alternative, or adjunct, to traditional subjective gating methods that rely on choosing one or two dimensions. Four-color flow data from human blood cells labeled with FITC-conjugated anti-CD3, PE-conjugated anti-CD8, PE-Cy5-conjugated anti-CD4, and APC-conjugated anti-CD19 Abs were acquired on a FACSCalibur. Cells from four murine cell lines, JAWS II, RAW 264.7, CTLL-2, and A20, were also stained with FITC-conjugated anti-CD11c, PE-conjugated anti-CD11b, PE-Cy5-conjugated anti-CD8a, and PE-Cy7-conjugated anti-CD45R/B220 Abs, respectively, and single-color flow data were collected on an LSRII. The data were fitted with a mixture of multivariate Gaussians using standard Bayesian statistical approaches and Markov chain Monte Carlo computations. Statistical mixture models were able to identify and purify major cell subsets in human peripheral blood, using an automated process that can be generalized to an arbitrary number of markers. Validation against both traditional expert gating and synthetic mixtures of murine cell lines with known mixing proportions was also performed. This article describes studies of statistical mixture modeling of flow cytometric data and demonstrates its utility in examples with four-color flow data from human peripheral blood samples and synthetic mixtures of murine cell lines.
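
For orientation, the standard finite Gaussian mixture density that the abstract refers to can be written as below. The symbols K (number of components), μ_k and Σ_k (component mean and covariance) and z (latent component label) are notation assumed here for illustration; π_k is the component proportion named in the Figure 1 caption. The second line is the usual posterior probability that an event x belongs to component k, which is the quantity a threshold such as the one in Figure 4 would act on.

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

\Pr(z = k \mid x) =
\frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}
     {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)}
```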

Figures

Figure 1
Trace plots for the proportion of the 5th (top), 10th (middle) and 15th (bottom) largest mixture components (π_i) over the last 1000 MCMC iterations, suggesting convergence.
Figure 2
Plot of the BIC against number of mixture components.
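
A plot like this is typically built by computing the Bayesian information criterion for each candidate number of components. The following is a minimal sketch, assuming full (unconstrained) covariance matrices and the common convention BIC = p ln(n) − 2 ln(L), where lower is better; the source does not state which sign convention or covariance structure was used. The function names and the log-likelihood values in the usage note are hypothetical placeholders, not results from the paper.

```python
import math


def gmm_num_params(n_components: int, n_dims: int) -> int:
    """Free parameters of a Gaussian mixture with full covariance matrices."""
    weights = n_components - 1                      # proportions sum to 1
    means = n_components * n_dims                   # one mean vector per component
    covariances = n_components * n_dims * (n_dims + 1) // 2  # symmetric matrices
    return weights + means + covariances


def bic(log_likelihood: float, n_params: int, n_events: int) -> float:
    """BIC under the 'lower is better' convention (an assumption here)."""
    return n_params * math.log(n_events) - 2.0 * log_likelihood


def best_by_bic(log_likelihoods: dict, n_dims: int, n_events: int):
    """Map {number of components: maximized log-likelihood} to the best count."""
    scores = {
        k: bic(ll, gmm_num_params(k, n_dims), n_events)
        for k, ll in log_likelihoods.items()
    }
    best_k = min(scores, key=scores.get)
    return best_k, scores
```

For example, `best_by_bic({1: -5000.0, 2: -4000.0, 3: -3995.0}, n_dims=4, n_events=1000)` uses made-up log-likelihoods to show how a small gain in fit from a third component can be outweighed by the penalty on its extra parameters.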
Figure 3
Filtering and identification of mixture components in human peripheral blood. The top row shows the ungated events, while the bottom row shows the mixture components identified, with green for granulocytes, blue for mononuclear cells, red for lymphocytes and maroon for an unclassified component. Mixture components representing aggregates, dead cells and debris in grey are only shown for the FSC/SSC plots in the bottom row. Ellipses and numbered yellow labels on the FSC/SSC plot show the 67% coverage set for each component. Each column is on the same scale.
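
One way to draw a coverage ellipse of this kind for a single component in a two-dimensional projection such as FSC/SSC is sketched below. It assumes the ellipse is the region of a bivariate Gaussian containing the requested probability mass, for which the squared Mahalanobis radius is −2 ln(1 − coverage); the source does not say how its ellipses were computed, and `coverage_ellipse` is a hypothetical helper name.

```python
import math


def coverage_ellipse(cov, coverage=0.67):
    """Semi-axes and orientation of a bivariate Gaussian coverage ellipse.

    cov is a 2x2 symmetric covariance matrix given as nested sequences.
    Returns (semi_major, semi_minor, angle), with angle in radians measured
    from the first axis to the major axis.
    """
    a, b = cov[0][0], cov[0][1]
    c = cov[1][1]
    # Squared Mahalanobis radius enclosing `coverage` of a 2-D Gaussian
    # (chi-square with 2 degrees of freedom has CDF 1 - exp(-r^2 / 2)).
    r2 = -2.0 * math.log(1.0 - coverage)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (a + c)
    half_gap = math.hypot(0.5 * (a - c), b)
    lam_major = mean + half_gap
    lam_minor = max(mean - half_gap, 0.0)  # guard against rounding below zero
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return math.sqrt(r2 * lam_major), math.sqrt(r2 * lam_minor), angle
```

The ellipse is centered at the component mean; only the covariance determines its shape and orientation.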
Figure 4
Using thresholds to increase specificity. The top panel shows the lymphocyte subsets, with events enlarged where the posterior probability of belonging to any lymphocyte component falls below 0.95. The bottom panel shows the result after filtering out these uncertain events. Most of the uncertainty lies in the CD8-negative sub-population of the CD3CD8−/dim component, which overlaps the unclassified component.
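
The thresholding step in this caption can be sketched as follows. The sketch assumes that "belonging to any lymphocyte component" means the sum of an event's posterior probabilities over the lymphocyte components; the source does not spell this out, and taking the maximum instead would be another reading. The function name and input layout are hypothetical.

```python
def filter_confident_events(posteriors, target_components, threshold=0.95):
    """Return indices of events confidently assigned to the target subset.

    posteriors: one row per event, each row holding that event's posterior
        probability for every mixture component (rows sum to 1).
    target_components: indices of the components that make up the subset
        of interest (for example, the lymphocyte components).
    """
    kept = []
    for idx, row in enumerate(posteriors):
        # Total posterior mass on the target subset (assumed interpretation).
        p_target = sum(row[k] for k in target_components)
        if p_target >= threshold:
            kept.append(idx)
    return kept
```

Raising the threshold trades sensitivity for specificity: fewer events are kept, but those kept are less likely to belong to an overlapping non-target component.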
Figure 5
Lymphocyte subset components identified by statistical mixture modeling in the DAIDS samples from donor T. The arrowed component(s) are the target of interest for that sample. Labels show the events in each component as a percentage of total lymphocytes (same as for Table 2).
Figure 6
Top panel shows the superimposed flow cytometric profiles of 4 mouse cell lines, showing clear deviation from Gaussianity. Bottom panel shows the statistical mixture model fits to the electronically mixed cell line data for 4 different mixtures projected onto the CD11c/CD11b axes. Components sharing a common mode are colored identically. Note that the RAW 264.7 cell line is bimodal – this is true even when fitting a pure RAW 264.7 population alone, and we have therefore used both modes in our calculations.
