Automatic particle selection from electron micrographs using machine learning techniques

C O S Sorzano¹, E Recarte, M Alcorlo, J R Bilbao-Castro, C San-Martín, R Marabini, J M Carazo

PMID: 19555764
PMCID: PMC2777658
DOI: 10.1016/j.jsb.2009.06.011

J Struct Biol. 2009 Sep;167(3):252-60.

Epub 2009 Jun 23.

Affiliation

¹ Unidad de Biocomputación, Centro Nacional de Biotecnología (CSIC), Campus Universidad Autónoma s/n, 28049 Cantoblanco, Madrid, Spain. coss@cnb.csic.es

Abstract

The 3D reconstruction of biological specimens using Electron Microscopy is currently capable of achieving subnanometer resolution. Unfortunately, this goal requires gathering tens of thousands of projection images that are frequently selected manually from micrographs. In this paper we introduce a new automatic particle selection method that learns from the user which particles are of interest. The training phase is semi-supervised, so the user can correct the algorithm during picking and specifically identify incorrectly picked particles. By treating such errors specially, the algorithm attempts to minimize the number of false positives. We show that our algorithm is able to produce datasets with fewer wrongly selected particles than previously reported methods. Another advantage is that, by learning from the user which particles to pick, we avoid the need for an initial reference volume from which to generate picking projections. The algorithm has been made publicly available in the open-source package Xmipp.

Figures

**Fig. 1**
Original piece of a micrograph with KLH particles and its preprocessed counterpart. Note that the size of the preprocessed image is half the size of the original image. However, it has been rescaled for better visualization.

**Fig. 2**
Coarse polar representation of an image with *N_r* = 8 rings and *N_s* = 16 sectors. The outer ring and one of the sectors have been highlighted.
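
As an illustration of the coarse polar grid in Fig. 2, the following minimal sketch assigns each pixel of a square patch to one of *N_r* = 8 rings and *N_s* = 16 sectors and averages the pixel values in each cell. The function names `polar_bin` and `polar_means`, the equal-width rings, and the use of per-cell means are assumptions made for this example; the caption does not specify how the rings are spaced or which statistics are computed on them.

```python
import math

def polar_bin(x, y, cx, cy, radius, n_rings=8, n_sectors=16):
    """Return (ring, sector) for pixel (x, y), or None if it lies outside the disc.

    Assumption: rings have equal radial width and sectors equal angular width.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r >= radius:
        return None
    ring = int(r / radius * n_rings)
    theta = math.atan2(dy, dx) % (2 * math.pi)  # angle mapped to [0, 2*pi)
    # The final modulo guards against rounding that could yield n_sectors.
    sector = int(theta / (2 * math.pi) * n_sectors) % n_sectors
    return ring, sector

def polar_means(image, n_rings=8, n_sectors=16):
    """Average pixel values per (ring, sector) cell; empty cells are left at 0.0."""
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    radius = min(w, h) / 2
    sums = [[0.0] * n_sectors for _ in range(n_rings)]
    counts = [[0] * n_sectors for _ in range(n_rings)]
    for y in range(h):
        for x in range(w):
            cell = polar_bin(x, y, cx, cy, radius, n_rings, n_sectors)
            if cell is not None:
                ring, sector = cell
                sums[ring][sector] += image[y][x]
                counts[ring][sector] += 1
    return [
        [sums[r][s] / counts[r][s] if counts[r][s] else 0.0
         for s in range(n_sectors)]
        for r in range(n_rings)
    ]
```

On small patches the innermost rings contain only a handful of pixels, so some cells stay empty; a practical implementation would need to decide how to treat them.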

**Fig. 3**
Left: Internal structure of the ensemble classifier. Several weak classifiers (CWNBC) assign a label to a given input vector. Based on these labels, a final decision is made and a label is assigned to the input vector. Right: From an operational point of view, the ensemble classifier can be seen, like any other classifier, as a black box that is trained on input vectors with known class labels and applied to input vectors with unknown class labels. The training vectors are used to learn the classification rules. The application of these rules to the training data yields correctly classified vectors (like a vector of class 0 classified as class 0, 0 → 0) and incorrectly classified vectors (like a vector of class 0 classified as class 1, 0 → 1).
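
The two views of the ensemble in Fig. 3 can be sketched as follows. This is a hedged illustration only: the caption does not state how the final decision is derived from the weak labels, so a simple majority vote is assumed here, and the weak classifiers are stand-in threshold functions rather than the CWNBC classifiers of the paper. The names `ensemble_predict` and `training_outcomes` are hypothetical.

```python
from collections import Counter

def ensemble_predict(weak_classifiers, x):
    """Label x by majority vote over the weak classifiers (assumed decision rule)."""
    votes = Counter(clf(x) for clf in weak_classifiers)
    return votes.most_common(1)[0][0]

def training_outcomes(weak_classifiers, vectors, labels):
    """Count (true label, assigned label) pairs, e.g. (0, 0) for 0 -> 0 and (0, 1) for 0 -> 1."""
    outcomes = Counter()
    for x, y in zip(vectors, labels):
        outcomes[(y, ensemble_predict(weak_classifiers, x))] += 1
    return outcomes

# Stand-in weak classifiers on 2-component vectors (not CWNBC).
weak = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.0),
]

vectors = [(0.9, 0.2), (0.1, 0.2), (0.6, 0.3)]
labels = [1, 0, 1]
outcomes = training_outcomes(weak, vectors, labels)
```

Using an odd number of weak classifiers on a two-class problem avoids ties in the vote. The off-diagonal counts in `outcomes` correspond to the incorrectly classified training vectors described in the caption.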

**Fig. 4**
Structure of the multistage ensemble classifier. Stages are cascaded to refine the previous classification. Each stage is formed by several classifiers in parallel. In the figure, P stands for the particle class, NP for the non-particles, and E for the errors (see text for a detailed explanation). Particles, non-particles and errors of the present micrograph enlarge the training population for the next micrograph.

**Fig. 5**
True Positive Rate and False Positive Rate for the APP algorithm proposed in this article. The True Positive Rate is defined as the number of true particles automatically picked over the number of true particles manually picked by Fabrice Mouche in Zhu et al. (2004). The False Positive Rate is defined as the ratio between the wrongly picked particles (a particle is wrong if it does not belong to the Fabrice Mouche set) and the total number of particles picked.
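
The two rates defined in the caption of Fig. 5 can be computed as in the sketch below. The matching rule is an assumption of this example: a picked coordinate counts as a true particle if it lies within a hypothetical tolerance `tol` (in pixels) of a not-yet-matched reference coordinate, with greedy matching. The caption does not say how picked and manually selected particles were paired. Note that the False Positive Rate here follows the caption's definition (wrong picks over all picks), not the more common ratio over all negatives.

```python
import math

def picking_rates(picked, reference, tol=10.0):
    """Return (TPR, FPR) as defined in the Fig. 5 caption.

    TPR = matched reference particles / reference particles
    FPR = picked particles without a reference match / picked particles
    Assumption: greedy one-to-one matching within a distance `tol`.
    """
    matched = set()
    wrong = 0
    for px, py in picked:
        hit = None
        for i, (rx, ry) in enumerate(reference):
            if i not in matched and math.hypot(px - rx, py - ry) <= tol:
                hit = i
                break
        if hit is None:
            wrong += 1
        else:
            matched.add(hit)
    tpr = len(matched) / len(reference) if reference else 0.0
    fpr = wrong / len(picked) if picked else 0.0
    return tpr, fpr

# Made-up coordinates for illustration only, not data from the paper.
reference = [(10, 10), (50, 50), (90, 90), (130, 130)]
picked = [(12, 11), (52, 49), (200, 200)]
tpr, fpr = picking_rates(picked, reference)
```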

**Fig. 6**
Sample micrograph with the particles picked in the KLH dataset after learning for 82 micrographs. Note that in this case, the user is not interested in top views (circularly shaped projections).

**Fig. 7**
Sample micrograph with the particles picked after learning for 4 micrographs in the Large T antigen+RPA dataset.

**Fig. 8**
Sample micrograph with the particles picked after learning for 10 micrographs in the Adenovirus dataset. The curved structure in the top-right corner corresponds to the edge of the hole in the carbon grid on which the particles are suspended.
