PLoS One. 2020 Feb 26;15(2):e0229620.
doi: 10.1371/journal.pone.0229620. eCollection 2020.

Evaluation of machine learning models for automatic detection of DNA double strand breaks after irradiation using a γH2AX foci assay

Tim Hohmann et al. PLoS One.

Abstract

Ionizing radiation induces, among other lesions, the most critical type of DNA damage: double-strand breaks (DSBs). Efficient repair of such damage is crucial for cell survival and genomic stability. DSB-associated foci assays are often analyzed either manually or with automatic systems. Manual evaluation is time-consuming and subjective, while most automatic approaches are sensitive to changes in experimental conditions or to image artefacts. Here, we examined multiple machine learning models, namely a multi-layer perceptron classifier (MLP), a linear support vector machine classifier (SVM), a complement naive Bayes classifier (cNB) and a random forest classifier (RF), for their ability to correctly classify γH2AX foci in manually labeled images containing multiple types of artefacts. All models yielded reasonable agreement with the manual rating on the training images (Matthews correlation coefficient >0.4). Afterwards, the best-performing models were applied to images obtained under different experimental conditions. On these images, the MLP model produced the best results, with an F1 score >0.9. These results demonstrate that the approach is sufficient to mimic manual counting and is robust against image artefacts and changes in experimental conditions.
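The two agreement measures quoted in the abstract, the F1 score and the Matthews correlation coefficient (MCC), can both be computed from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal sum is zero (a common convention).
    """
    num = tp * tn - fp * fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / denom if denom else 0.0


# Hypothetical counts for one image: candidate regions classified as focus / no focus.
tp, tn, fp, fn = 90, 880, 10, 20

f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"F1 = {f1:.3f}, MCC = {mcc:.3f}")
```

Unlike the F1 score, the MCC also uses the true negatives, so it stays informative when foci are rare compared to background.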

Conflict of interest statement

Faramarz Dehghani is an editor of PLOS ONE. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

Figures

Fig 1. Assessment of image qualities.
Images of differing quality are displayed, showing the nuclear (DAPI) and γH2AX (foci) staining, as well as the final manual markings for the respective image. Five types of anomalies are presented: high background levels or a low signal-to-noise ratio, labeling artefacts, halos around foci, and apoptotic cells.
Fig 2. Summary of the training and validation results of all machine learning models.
A) Visual comparison of different machine learning models with manual classification. All depicted models yielded reasonable results for nuclei with a high signal-to-noise ratio, but increased noise levels led to overfitting of the SVM and cNB models. B) and C) F1 score and Matthews correlation coefficient of all models. The SVM, the cNB and most combinations that included them yielded worse classification results than the remaining models. Box plots: the central mark corresponds to the median, the boxes show the 25th and 75th percentiles, and the whiskers indicate the most extreme points not considered outliers.
Fig 3. Summary of the classification results for images obtained under different experimental conditions.
A) Visual comparison of different machine learning models with manual classification. All depicted models yielded reasonable results for nuclei with a high signal-to-noise ratio, but increased noise levels led to an increase in the false negative rate of the RF and MLP+RF models. Red arrows show false negative foci, while white circles depict false positives. B) to E) F1 score, positive predictive value (PPV), sensitivity and false negative rate (FNR) of all models. The RF and MLP+RF models displayed a reduced F1 score and sensitivity, but an increased PPV and FNR, compared to the other models. The remaining models were indistinguishable from each other. Error bars depict 95% confidence intervals.
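The pattern described for the RF and MLP+RF models (lower sensitivity and F1 score, but higher PPV and FNR) is what a conservative detector produces: it marks fewer foci, so it misses more of them but rarely marks a false one. A minimal sketch with hypothetical counts illustrates this; the function name and all numbers are assumptions for illustration, not data from the study.

```python
def detection_rates(tp, fp, fn):
    """Return PPV, sensitivity, FNR and F1 from foci-level counts.

    PPV         = TP / (TP + FP)
    sensitivity = TP / (TP + FN)
    FNR         = FN / (TP + FN) = 1 - sensitivity
    F1          = 2*TP / (2*TP + FP + FN)
    Undefined ratios are reported as 0.0.
    """
    detected = tp + fp   # everything the model marked as a focus
    actual = tp + fn     # everything the manual rating marked as a focus
    f1_denom = 2 * tp + fp + fn
    return {
        "ppv": tp / detected if detected else 0.0,
        "sensitivity": tp / actual if actual else 0.0,
        "fnr": fn / actual if actual else 0.0,
        "f1": 2 * tp / f1_denom if f1_denom else 0.0,
    }


# Hypothetical counts against 100 manually marked foci.
balanced = detection_rates(tp=90, fp=10, fn=10)
conservative = detection_rates(tp=60, fp=2, fn=40)

for name, rates in (("balanced", balanced), ("conservative", conservative)):
    print(name, {key: round(value, 3) for key, value in rates.items()})
```

Because sensitivity and FNR always sum to one, a drop in one necessarily appears as a rise in the other.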
Fig 4. Summary of the results for the dimensionality reduction approach.
A) and B) Box plots of the F1 score and Matthews correlation coefficient (MCC). Using only small, or only small and medium, filter sizes lowers classification quality. Eliminating potentially redundant filter types does not strongly affect the classification results. Box plots: the central mark corresponds to the median, the boxes show the 25th and 75th percentiles, and the whiskers indicate the most extreme points not considered outliers.
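The box plot convention used in the captions can be made concrete with a short sketch. The captions do not state how outliers were defined; the code below assumes the common rule that a point is an outlier if it lies more than 1.5 times the interquartile range beyond the box, and it uses one of several possible quartile interpolation methods. The function name and the sample scores are hypothetical.

```python
import statistics


def box_plot_stats(values, whisker_factor=1.5):
    """Median, quartiles, whisker ends and outliers for one box.

    Assumption: outliers are points beyond whisker_factor * IQR from the box.
    Quartiles use the 'inclusive' interpolation method; other tools may
    interpolate slightly differently.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inliers),    # most extreme non-outlier points
        "whisker_high": max(inliers),
        "outliers": outliers,
    }


# Hypothetical per-image scores with one clearly deviating image.
scores = [0.82, 0.85, 0.86, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.40]
print(box_plot_stats(scores))
```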
