Microbiol Spectr. 2022 Oct 26;10(5):e0147222.
doi: 10.1128/spectrum.01472-22. Epub 2022 Aug 16.

A Deep Learning Approach to Capture the Essence of Candida albicans Morphologies

Van Bettauer et al. Microbiol Spectr.

Abstract

We present deep learning-based approaches for exploring the complex array of morphologies exhibited by the opportunistic human pathogen Candida albicans. Our system, named Candescence, automatically detects C. albicans cells in differential interference contrast (DIC) microscopy images and labels each detected cell with one of nine morphologies. These range from yeast white and opaque forms to hyphal and pseudohyphal filamentous morphologies. The software is based upon a fully convolutional one-stage (FCOS) object detector, a deep learning technique that uses an extensive set of images that we manually annotated with the location and morphology of each cell. We developed a novel cumulative curriculum-based learning strategy that stratifies our images by difficulty from simple yeast forms to complex filamentous architectures. Candescence achieves very good performance (~85% recall; 81% precision) on this difficult learning set, where some images contain hundreds of cells with substantial intermixing between the predicted classes. To capture the essence of each C. albicans morphology and how the morphologies intermix, we used a second deep learning technique, generative adversarial networks. The resultant models allow us to identify and explore technical variables, developmental trajectories, and morphological switches. Importantly, the model allows us to quantitatively capture morphological plasticity observed with genetically modified strains or strains grown in different media and environments. We envision Candescence as a community meeting point for quantitative explorations of C. albicans morphology.

IMPORTANCE The fungus Candida albicans can "shape shift" between 12 morphologies in response to environmental variables. The cytoprotective capacity provided by this polymorphism makes C. albicans a formidable pathogen to treat clinically. Microscopy images of C. albicans colonies can contain hundreds of cells in different morphological states. Manual annotation of images can be difficult, especially as a result of densely packed and filamentous colonies and of technical artifacts from the microscopy itself. Manual annotation is inherently subjective, depending on the experience and opinion of annotators. Here, we built a deep learning approach named Candescence to parse images in an automated, quantitative, and objective fashion: each cell in an image is located and labeled with its morphology. Candescence effectively replaces simple rules based on visual phenotypes (size, shape, and shading) with neural circuitry capable of capturing subtle but salient features in images that may be too complex for human annotators.
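The recall and precision figures quoted in the abstract are detection metrics. As a minimal sketch of how such numbers are commonly computed, the Python below matches predicted boxes to annotated boxes of the same class by intersection over union (IoU). The 0.5 IoU cutoff, the greedy one-to-one matching, and the function names are assumptions made for illustration; the abstract does not state the matching rule the authors used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(truth, preds, iou_min=0.5):
    """truth, preds: lists of (box, label) pairs.

    Each prediction is greedily matched to the best unmatched ground-truth
    box of the same label; iou_min is an assumed cutoff, not the paper's.
    """
    unmatched = list(range(len(truth)))
    tp = 0
    for box, label in preds:
        best, best_iou = None, iou_min
        for i in unmatched:
            t_box, t_label = truth[i]
            if t_label == label:
                v = iou(box, t_box)
                if v >= best_iou:
                    best, best_iou = i, v
        if best is not None:
            unmatched.remove(best)
            tp += 1
    fp = len(preds) - tp   # predictions with no matching annotation
    fn = len(truth) - tp   # annotations that no prediction matched
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return precision, recall


# Hypothetical toy data: one correct detection, one spurious, one missed.
truth = [((0, 0, 10, 10), "white"), ((20, 20, 30, 30), "opaque")]
preds = [((1, 1, 10, 10), "white"), ((50, 50, 60, 60), "hypha")]
print(precision_recall(truth, preds))
```

In this toy case one of two predictions is correct and one of two annotated cells is found, so both precision and recall come out at 0.5.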

Keywords: Candida albicans; deep learning; fully convolutional one-stage object detection; generative adversarial network; microscopy; morphology.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIG 1
(A) Example of a typical DIC image in Labelbox, a learning set creation tool for group annotation. (B) The colors indicate class of morphology. The “artifact” class is used to bound imperfections and technical artifacts in the images, whereas the “unknown” class is used to bound cells for which we could not judge morphology. (C) Labeling of an image that is enriched for pseudohyphae. Note that the overall intensities of our images are not required to be the same. (D) Given the size and complexity of the filamentous forms, we annotated each hypha or pseudohypha using three classes. (D and E) H-start and P-start are intended to label only the “start” of the hypha or pseudohypha. Since junctions are different between hyphae and pseudohyphae, we labeled them with the H- and P-junction classes. Finally we bounded the entire filamentous object with a rectangle; note that such bounding boxes overlap surrounding cells. (F) Our system is built upon the FCOS software with structure unaltered from Tian et al. (56). (G) The Varasana learning set is first partitioned into three components (training, validation, and testing). Then both validation and testing are split into six grades. The grades are ordered from white to hypha. When an image appears first in grade i, it is also included in all grades j > i. The test set contains examples of both wild-type SC5314 and SN148a cells but also a collection of genetic and environmental perturbations that induced abnormal C. albicans morphologies. (H) Using the Varasana learning set, we performed a grid search across several hyperparameters (n = ~30,000 combinations). Boldface indicates the combination of hyperparameters that had the best performance.
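The cumulative rule in panel G (an image that first appears in grade i is also included in every grade j > i) can be sketched in a few lines of Python. The image identifiers and the zero-based grade indices are hypothetical; only the count of six grades and the cumulative rule come from the caption.

```python
def cumulative_grades(first_grade, n_grades=6):
    """Build cumulative grade sets.

    first_grade maps an image id to the index (0-based) of the grade in
    which the image first appears. The image is then added to that grade
    and to every later, harder grade.
    """
    grades = [set() for _ in range(n_grades)]
    for image, g in first_grade.items():
        for j in range(g, n_grades):
            grades[j].add(image)
    return grades


# Hypothetical image ids: "a" enters at the easiest grade, "c" at the hardest.
grades = cumulative_grades({"a": 0, "b": 2, "c": 5})
for index, members in enumerate(grades):
    print(index, sorted(members))
```

Each grade is therefore a superset of the one before it, so training on later grades keeps revisiting the simpler images.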
FIG 2
Representative images of the ground truth in the Varasana validation set (left), each of which consists of a bounding box around the object and a class label, and the predictions made by the FCOS Candescence (right), which also have a score between 0 and 1 from the softmax layer of the FCOS representing the strength of belief in the classification. (A) We found this image difficult to label, as the objects have both pseudohyphal and hyphal properties. Nevertheless, Candescence recapitulated our classifications. Note that if an object is identified as hypha, the junctions and start are also labeled as H; this is also true for a pseudohypha. This suggests that Candescence was learning to classify not only on the image but also as a function of the predicted labels of the same object. (B) Image containing a diverse collection of classes. Note that we labeled gray-like only by size and “texture” (smaller but oblong like opaque and more gaunt than white cells). Overall, these categories witnessed the highest number of classification errors. (C) Candescence predicted well even for highly dense and diverse images such as this one.
FIG 3
(A) Representative images chosen from the complete collection in Fig. S3 of all false-positive object identifications from Candescence. This set of hallucinations is subdivided by the class label returned by Candescence. “X” indicates that we truly are not able to see an object (a true false positive/hallucination), and “M” indicates that Candescence was indeed correct to predict a bounding box at that location (missed by the human annotators) but its subsequent classification disagrees with our criteria. Subimages lacking an annotation indicate that Candescence correctly identified and classified the object, which represents a human error. (B) Three partial images with false negatives (blind spots) labeled with red boxes. Unannotated cells in panels i to iii were correctly handled by Candescence, and their bounding boxes have been removed for clarity. (i) Example of a recurrent problem where the probability from the softmax is distributed over two or more labels (e.g., white and budding white), presumably because it has difficulty guessing whether they are only touching versus still attached. The shared probability causes both to fall below our threshold τ. (ii) Although Candescence performed well, blind spots arose presumably due to the dense packing of cells. (iii) Our strategy during labeling was to leave cells that were partially outside the field of view unlabeled. Such edge effects cause some problems and are often predicted as unknown. Last, bounding boxes for large filamentous C. albicans cells are sometimes missed, especially if other cells are within the vicinity. (C) Confusion matrix for Candescence. Columns correspond to ground truth classifications, and rows correspond to Candescence predictions.
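The blind spot described in panel B(i) follows from simple arithmetic: when the softmax spreads its mass over two labels, neither score needs to reach the threshold τ. The sketch below illustrates this with made-up logits and an assumed τ of 0.5; the caption does not give the value of τ, and the three-class label list is a simplification.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kept_labels(logits, labels, tau=0.5):
    """Return the labels whose softmax score reaches tau (assumed value)."""
    return [lab for lab, p in zip(labels, softmax(logits)) if p >= tau]


labels = ["white", "budding white", "opaque"]

# Ambiguous cell: two labels share the probability mass, so neither
# reaches tau and the cell is dropped (a false negative).
ambiguous = kept_labels([2.0, 2.0, -1.0], labels)

# Confident cell: one label dominates and is kept.
confident = kept_labels([4.0, 1.0, -1.0], labels)

print(ambiguous, confident)
```

With the ambiguous logits, "white" and "budding white" each score just under 0.49, so a threshold of 0.5 discards the detection even though the network clearly sees a cell.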
FIG 4
(A) For each pair of images, the left image is a synthesized image produced by our model, and the right image is the image found from the real training data that is its nearest neighbor under the LPIPS metric. (B) Three separate trajectories. Here, the left- and rightmost images correspond to real images from the training data. The sequence of images at points s1 to s7 are produced by a linear interpolation between the real images in the generator’s latent space in the GAN.
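The interpolation in panel B can be sketched as follows, assuming each end-point image is represented by a latent vector and that s1 to s7 are evenly spaced between them. The even spacing and the toy two-dimensional vectors are assumptions; how the real images are mapped into the latent space, and the generator that decodes each point into an image, are not shown.

```python
def interpolate(z_a, z_b, n_steps=7):
    """Return n_steps latent vectors evenly spaced strictly between z_a and z_b."""
    points = []
    for k in range(1, n_steps + 1):
        t = k / (n_steps + 1)  # fraction of the way from z_a to z_b
        points.append([(1 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return points


# Hypothetical 2-D latent codes; real GAN latent spaces are much larger.
steps = interpolate([0.0, 0.0], [8.0, 16.0])
for name, z in zip(["s1", "s2", "s3", "s4", "s5", "s6", "s7"], steps):
    print(name, z)
```

Feeding each intermediate vector to the generator would yield one synthesized image per step, giving the left-to-right trajectory between the two real images.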
FIG 5
(A) Example of anomaly detection using a RHA1 GOF BCR1-null colony (UID 230 of type 13). The panels at the bottom are enlargements of a series of bounding boxes predicted by Candescence with their associated anomaly scores. (B) Histogram comparing the distribution of anomaly scores between all images of type 13 versus type 30, a collection of normal-appearing pseudohyphal and hyphal cells. There is a small but statistically significant enrichment of cells from the RHA1 GOF BCR1-null colony with elevated anomaly score differences (Kolmogorov-Smirnov test, P < 0.01).
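The comparison in panel B rests on the two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical cumulative distributions of the two score sets. A minimal pure-Python sketch is given below; the scores are invented for illustration, and the P value is not computed here (a library routine such as `scipy.stats.ks_2samp` returns both the statistic and a P value).

```python
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the empirical CDFs of
    xs and ys, evaluated at every observed value.
    """
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)  # fraction of xs <= v
        fy = bisect_right(ys, v) / len(ys)  # fraction of ys <= v
        d = max(d, abs(fx - fy))
    return d


# Hypothetical anomaly scores for a reference set and a shifted mutant set.
reference = [0.1, 0.2, 0.3, 0.4]
mutant = [0.3, 0.4, 0.5, 0.6]
print(ks_statistic(reference, mutant))
```

A value of D near 0 means the two score distributions are nearly identical, while a value near 1 means they barely overlap.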
