PLoS Comput Biol. 2021 Feb 22;17(2):e1008630.
doi: 10.1371/journal.pcbi.1008630. eCollection 2021 Feb.

Rapid 3D phenotypic analysis of neurons and organoids using data-driven cell segmentation-free machine learning

Philipp Mergenthaler et al. PLoS Comput Biol. 2021.

Abstract

Phenotypic profiling of large three-dimensional microscopy data sets has not been widely adopted due to the challenges posed by cell segmentation and feature selection. The computational demands of automated processing further limit analysis of hard-to-segment images such as those of neurons and organoids. Here we describe a comprehensive shallow-learning framework for automated quantitative phenotyping of three-dimensional (3D) image data using unsupervised data-driven voxel-based feature learning, which enables computationally facile classification, clustering and advanced data visualization. We demonstrate the analysis potential on complex 3D images by investigating the phenotypic alterations of neurons in response to apoptosis-inducing treatments and the morphogenesis of oncogene-expressing human mammary gland acinar organoids. Our novel implementation of image analysis algorithms, called Phindr3D, allowed rapid implementation of data-driven voxel-based feature learning into 3D high content analysis (HCA) operations and constitutes a major practical advance, as the computed assignments represent the biology while preserving the heterogeneity of the underlying data. Phindr3D is provided as Matlab code and as a stand-alone program (https://github.com/DWALab/Phindr3D).

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Overview and validation of the Phindr3D method.
In a cell segmentation-independent method, Phindr3D iteratively performs two steps on 3D multichannel images. It calculates and categorizes image histograms at different hierarchical levels to compute data-driven image features using unsupervised clustering. Pixels are binned into pixel categories using k-means clustering and supervoxels (SV) are generated by combining pixels from neighboring z-slices. Next, SVs are combined into megavoxels (MV) and MVs are categorized. Each image stack is then defined by the normalized frequency of these different MV categories.
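The hierarchy described in this caption can be illustrated with a minimal Python sketch. This is not the authors' Matlab code: the function names, block sizes, and category counts are hypothetical, and the categories are fitted on a single stack for brevity, whereas comparable profiles across images would presumably need categories learned from pooled data. Supervoxels and megavoxels are assumed here to be non-overlapping blocks described by the normalized histogram of the categories they contain.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dist.argmin(axis=1)

def block_histograms(label_vol, block, n_cat):
    """Cut a (Z, Y, X) label volume into non-overlapping blocks and return
    one normalized category histogram per block plus the block grid shape."""
    bz, by, bx = block
    Z, Y, X = (s // b for s, b in zip(label_vol.shape, block))
    v = label_vol[:Z * bz, :Y * by, :X * bx]
    v = v.reshape(Z, bz, Y, by, X, bx).transpose(0, 2, 4, 1, 3, 5)
    v = v.reshape(Z * Y * X, -1)
    hist = np.stack([(v == c).mean(axis=1) for c in range(n_cat)], axis=1)
    return hist, (Z, Y, X)

def phindr_like_profile(stack, n_pix=4, n_sv=3, n_mv=3,
                        sv_block=(2, 4, 4), mv_block=(1, 2, 2)):
    """stack has shape (Z, Y, X, channels); returns MV category frequencies."""
    Z, Y, X, C = stack.shape
    # Step 1: bin pixels into categories by their multichannel intensities.
    _, pix = kmeans(stack.reshape(-1, C), n_pix)
    # Step 2: supervoxels = blocks of pixels spanning neighboring z-slices.
    sv_hist, sv_grid = block_histograms(pix.reshape(Z, Y, X), sv_block, n_pix)
    _, sv = kmeans(sv_hist, n_sv)
    # Step 3: megavoxels = blocks of supervoxels, categorized the same way.
    mv_hist, _ = block_histograms(sv.reshape(sv_grid), mv_block, n_sv)
    _, mv = kmeans(mv_hist, n_mv)
    # Step 4: the stack is described by normalized MV category frequencies.
    return np.bincount(mv, minlength=n_mv) / mv.size
```

Calling `phindr_like_profile` on a `(Z, Y, X, channels)` array returns a vector of length `n_mv` that sums to 1, which plays the role of the per-image feature vector used for classification and clustering.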
Fig 2. 3D HCA of concentration response of cancer cell lines to autophagic flux-arresting chloroquine treatment.
The analysis was based on immunofluorescence staining of the autophagy marker p62 (Autophagy positive cells). Concentration response curves were generated using classifiers for each cell line trained on control (C) and the highest concentration for each. A) Result of the Phindr3D analysis. For the Phindr3D analysis, each original multichannel multistack image was split into 16 equal parts to increase the number of data points for training in Phindr3D. The classifier was built using Phindr3D features extracted from multichannel raw images. B) Replot of original analysis: Supervised classification using a traditional cell segmentation-based algorithm from PerkinElmer. These data, including EC50 and Z’ values, were provided by PerkinElmer and are shown here with permission for comparison of the two analysis methods. Details on the generation of this data set have been published elsewhere [13]. Together, these data demonstrate that data-driven cell segmentation-free phenotyping using the Phindr3D method can achieve similar accuracy in 3D data to traditional cell segmentation-based supervised classification using curated feature sets.
Fig 3. Distinct phenotypic responses in dense neuronal cultures.
A) Representative images of neurons from control groups. B) Two-dimensional principal component projection of 3D multichannel image feature space for each plate. Each point represents one 3D multichannel image. Only plate 1 contained the 30 nM concentration of the inhibitors of the anti-apoptotic proteins, while plates 2 and 3 contain more concentrations for the STS and ActD controls (tan-brown dots, respectively), making the distributions appear more different than they actually are. C) Automated clustering using affinity propagation. The pie charts mark the cluster centroids and show the distribution of different treatments within each cluster. The sizes of the pie charts represent the number of data points within each cluster; cluster numbers are given above the pie charts. D) Heatmap of the distribution of phenotypes in response to the different treatments across clusters for each plate, which provides interpretative data complementary to but different from the pie charts in C) that show the distributions within clusters. ActD–Actinomycin D, STS–Staurosporine; for B & C: the colors correspond to the different treatment groups as indicated in the legend for concentrations in nM.
Fig 4. Concentration response and aggregate multivariate phenotypic response of neurons treated with BH3-mimetics.
A) 2-dimensional PCA representation of the image feature space for neurons treated with BH3 mimetics (arrows, concentration increase) showing averaged treatment centroids colored by drug and concentration. B) Heatmap of the proportion of images of neurons treated with the indicated drugs assigned to individual clusters. ActD–Actinomycin D, STS–Staurosporine; for A: the colors correspond to the different treatment groups as indicated in the legend for concentrations in nM.
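The heatmap in panel B reports, for each treatment, the proportion of images assigned to each cluster. A minimal Python sketch of that tabulation follows; the function name and the example labels are hypothetical, and it assumes one treatment label and one cluster label per image.

```python
import numpy as np

def cluster_proportions(treatments, clusters):
    """Return (treatment names, cluster names, row-normalized count matrix).

    Row i, column j holds the fraction of images from treatment i
    that were assigned to cluster j, so every row sums to 1.
    """
    t_names, t_idx = np.unique(treatments, return_inverse=True)
    c_names, c_idx = np.unique(clusters, return_inverse=True)
    counts = np.zeros((len(t_names), len(c_names)))
    np.add.at(counts, (t_idx, c_idx), 1)  # accumulate repeated index pairs
    return t_names, c_names, counts / counts.sum(axis=1, keepdims=True)
```

Normalizing by row answers "where do the images of this treatment go?", which differs from the pie charts at cluster centroids, where the normalization is within each cluster.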
Fig 5. Cluster estimation and workflow for organoid region identification from large image stacks.
A) Automated estimation of the preference value resulted in AP clustering generating seven clusters for the organoid dataset. A preference value that routinely results in a biologically meaningful number of clusters for micrographs of cells was automatically determined as the point at which the number of clusters begins to increase exponentially. This method differs from the standard method of setting the preference value in affinity propagation using median preference [17]. B) Total number of organoids of each type that were analyzed from images pooled from 3 replicate plates for MCF10A cells expressing each oncogene. All of the organoids that were well separated and at an appropriate height such that sufficient planes were in focus were analyzed. At least 100 and fewer than 200 organoids were analyzed for each oncogene. This level of variability in object number is unlikely to skew the clustering results. Consistent with this interpretation, organoids expressing Bcl-2 (with the largest number of organoids) and Bcl-XL (a protein of similar function with far fewer organoids) were found primarily in three clusters each, two of which were in common (Fig 6D). C) Identification of clusters containing normal acinar structures. Both parental and empty vector expressing cells resulted in heterogeneous organoid structures, less than half of which were of normal morphology. This heterogeneity appears to be the result of spontaneous transformation of these premalignant cells. To determine which clusters were enriched in normal acinar structures, we manually identified normal acinar structures by consensus for parental and empty vector samples. Manual identification was performed by inspection of 114 organoids by experts using only the nuclear channel. For each organoid, top, middle and bottom z-planes were examined for roundness and a hollow interior. Then the distribution of normal mammary acinar organoids across clusters was determined. 
Cluster 1 contained the largest number (45) of these normal organoids, which account for 17% of the organoids in that cluster; it is therefore the most likely cluster to represent normal structures. The next most representative of a relatively normal phenotype is cluster 2, which contains 22 of these organoids, accounting for 10% of the cluster. In contrast, cluster 7 contained 51% of the organoids expressing PIK3CAH1047R, which account for 27% of the cluster and were also visually the most abnormal, suggesting this cluster represents an abnormal phenotype (Table A in S1 Text). D) Identification of organoids in a large image stack. First, multistack images of stained nuclei (DRAQ5) were used for organoid identification. Second, Focus Stacking was performed to flatten the image stack. For Focus Stacking, the pixel standard deviation in a 3 x 3 kernel was calculated for every pixel in the original image. Then, for every pixel in the standard deviation image, the gradient in an 11 x 11 pixel neighborhood was computed. The underlying assumption is that the higher the gradient magnitude for any pixel, the better focused the image is. For every X-Y pixel location we identified the corresponding Z plane pixel with the highest gradient value. Finally, for each X-Y position in the image, the original intensity of the pixel at the Z location of the best focus was projected on a 2D plane to create a best-focus image that was used for further processing. Next, to identify organoid regions in 2D, we used contour segmentation by thresholding and watershed transformation. To identify the optically acceptable top and bottom planes for each organoid, a density curve from a normalized histogram of pixel value distribution within the organoid region in the best-focus image was calculated and the number of in-focus pixels per Z-plane was used as a focus score (see graph). For n planes, the random chance of any plane being in focus is 1/n. 
Therefore, any Z-plane with a fraction of best-focus pixels more than 1/n was scored as sufficiently in focus for analysis. We set this value as the threshold and then arbitrarily added 2 more planes above and below the threshold. The numbers highlighted in red indicate the planes automatically chosen for a single organoid for 3D analysis from a gallery of Z-planes from the DRAQ5 channel for the single organoid shown to the right of the graph. The histogram shows the focus score for the 80 Z-planes recorded for the 2D location of the organoid. Red points in the histogram indicate the top and bottom planes automatically chosen for analysis (see Materials and Methods section for details).
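The Focus Stacking and plane-selection steps in panel D can be sketched in Python as follows. This is an illustrative reading of the caption, not the Phindr3D implementation. Two points are assumptions: the "gradient in an 11 x 11 pixel neighborhood" is taken to be the gradient magnitude of the standard deviation image averaged over an 11 x 11 window, and the chosen range runs from the first to the last plane above the 1/n threshold, widened by 2 planes on each side. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(img, size):
    """Mean over a size x size window around every pixel (reflect padding)."""
    padded = np.pad(img, size // 2, mode="reflect")
    return sliding_window_view(padded, (size, size)).mean(axis=(-2, -1))

def focus_measure(plane):
    """Per-pixel focus score for one Z-plane."""
    # Local standard deviation in a 3 x 3 kernel.
    var = box_mean(plane ** 2, 3) - box_mean(plane, 3) ** 2
    std = np.sqrt(np.clip(var, 0.0, None))
    # Assumption: average gradient magnitude of the std image over 11 x 11.
    gy, gx = np.gradient(std)
    return box_mean(np.hypot(gy, gx), 11)

def best_focus(stack):
    """stack has shape (Z, Y, X); returns (best-focus image, best Z index map)."""
    scores = np.stack([focus_measure(p.astype(float)) for p in stack])
    z_best = scores.argmax(axis=0)
    image = np.take_along_axis(stack, z_best[None], axis=0)[0]
    return image, z_best

def select_planes(z_best, n_planes, margin=2):
    """Pick the bottom and top plane indices to keep for 3D analysis.

    z_best holds the best-focus Z index of each pixel in the organoid region.
    A plane counts as in focus if it holds more than 1/n of those pixels.
    """
    frac = np.bincount(z_best.ravel(), minlength=n_planes) / z_best.size
    in_focus = np.flatnonzero(frac > 1.0 / n_planes)
    if in_focus.size == 0:  # perfectly uniform scores: keep every plane
        return 0, n_planes - 1
    lo = max(int(in_focus.min()) - margin, 0)
    hi = min(int(in_focus.max()) + margin, n_planes - 1)
    return lo, hi
```

In use, `z_best` would be restricted to the pixels of one segmented organoid region before calling `select_planes`, so that each organoid gets its own top and bottom planes.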
Fig 6. Oncogene-driven morphological alterations in organoids.
A) Representative organoid images from each cluster with transgene and dyes indicated. B) 2-dimensional Sammon projection of feature space for all organoids, C) clustering result and D) heat map of proportion of transgene expressing MCF10A organoids in each cluster. E) Clustering after extraction of 2D morphological features results in low information separation of the data. Morphological features assessed included 2D Area, Major Axis Length, Minor Axis Length, Eccentricity, Equivalent Diameter, Solidity, Extent, Convex Area, and Perimeter from maximum-projected 2D regions. As above, affinity propagation was used to cluster the 2D morphological features. Phindr3D methods for determining meaningful cluster numbers by affinity propagation clustering were used as in panels A-C) and resulted in six distinct clusters for the 2D data. However, clusters 2 and 3 and, to a lesser extent, cluster 1 contained the majority of organoids, while only PIK3CAH1047R-expressing organoids formed a distinct cluster (cluster 6). F) After 508 3D texture features were calculated from the nuclear and organoid areas identified using the 3D segmentation algorithm in CellProfiler 3.0.0, the data were clustered by affinity propagation using the Phindr3D method for determining meaningful cluster numbers. Clustering again resulted in six distinct clusters, with clusters 2, 3, 4 and 6 containing large proportions of the organoids. As in D), PIK3CAH1047R-expressing organoids formed a distinct cluster (cluster 5). Empty Vector-expressing organoids were present in all clusters except cluster 3 in similar proportions. G) Computation time for 3D feature extraction was measured on a high performance laptop computer. Feature extraction alone required ~2 hours with Phindr3D and ~8 hours with CellProfiler. In addition, CellProfiler required reformatting the data files to TIFF stacks, which took ~3 hours. 
H) The utility of the extracted features for clustering to identify unique morphologies within the data was quantified using mutual information (MI). Low MI (i.e. values closer to 0) indicates that clustering using 2D morphological features results in poorly predicted allocation of organoids, while high MI indicates a high probability that clustering using 3D features accurately reflects real differences in the data for images of oncogene-expressing organoids driving allocation to specific clusters. For Sammon plot (B,C): colors indicate transgenes (from D-F), each data point represents one 3D image stack, cluster centers are shown as pie charts indicating the proportion of treatment points within each cluster. The diameter of the pie charts represents the number of data points in that cluster, with lines linking data points to clusters.
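Panel H uses mutual information to ask how much a cluster assignment reveals about which transgene an organoid expresses. The Python sketch below computes plain MI in bits from two label vectors. Whether the authors normalized MI or used a different estimator is not stated in the caption, so the formula choice and the function name are assumptions.

```python
import numpy as np

def mutual_information(labels_a, labels_b):
    """Mutual information (in bits) between two label assignments.

    MI = sum over (a, b) of p(a, b) * log2( p(a, b) / (p(a) * p(b)) ),
    estimated from the joint counts of the two label vectors.
    """
    _, a = np.unique(labels_a, return_inverse=True)
    _, b = np.unique(labels_b, return_inverse=True)
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)          # contingency table of co-occurrences
    joint /= joint.sum()                 # joint probabilities p(a, b)
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                       # empty cells contribute nothing
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())
```

With two equally sized transgene groups, clusters that match the groups exactly give 1 bit, and clusters that split every group evenly give 0, which corresponds to the "values closer to 0" case described for the 2D morphological features.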

References

    1. Caicedo JC, Singh S, Carpenter AE. Applications in image-based profiling of perturbations. Curr Opin Biotechnol. 2016;39:134–42. doi: 10.1016/j.copbio.2016.04.003
    2. Oppermann S, Ylanko J, Shi Y, Hariharan S, Oakes CC, Brauer PM, et al. High-content screening identifies kinase inhibitors that overcome venetoclax resistance in activated CLL cells. Blood. 2016;128(7):934–47. doi: 10.1182/blood-2015-12-687814
    3. Eliceiri KW, Berthold MR, Goldberg IG, Ibanez L, Manjunath BS, Martone ME, et al. Biological imaging software tools. Nat Methods. 2012;9(7):697–710. doi: 10.1038/nmeth.2084
    4. Boutros M, Heigwer F, Laufer C. Microscopy-Based High-Content Screening. Cell. 2015;163(6):1314–25. doi: 10.1016/j.cell.2015.11.007
    5. Shamir L, Delaney JD, Orlov N, Eckley DM, Goldberg IG. Pattern recognition software and techniques for biological image analysis. PLoS Comput Biol. 2010;6(11):e1000974. doi: 10.1371/journal.pcbi.1000974