Bioinformatics. 2007 Oct 15;23(20):2741-6.
doi: 10.1093/bioinformatics/btm443. Epub 2007 Sep 10.

A genotype calling algorithm for the Illumina BeadArray platform

Yik Y Teo et al. Bioinformatics. 2007.

Abstract

Motivation: Large-scale genotyping relies on unsupervised automated calling algorithms to assign genotypes to hybridization data. A number of such calling algorithms have recently been established for the Affymetrix GeneChip genotyping technology. Here, we present a fast and accurate genotype calling algorithm for the Illumina BeadArray genotyping platforms. As the technology moves towards assaying millions of genetic polymorphisms simultaneously, there is a need for integrated and easy-to-use software for calling genotypes.

Results: We have introduced a model-based genotype calling algorithm that neither relies on prior training data nor requires computationally intensive procedures. The algorithm can assign genotypes to hybridization data from thousands of individuals simultaneously and pools information across multiple individuals to improve the calling. The method can accommodate variations in hybridization intensities that dramatically shift the positions of the genotype clouds, by identifying the optimal coordinates at which to initialize the algorithm. By incorporating perturbation analysis, we obtain a quality metric that measures the stability of the assigned genotype calls. We show that this quality metric can be used to identify SNPs with low call rates and low accuracy.
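
The idea of a perturbation-based stability metric can be illustrated with a minimal Python sketch. This is not the authors' C++ implementation: the nearest-center caller, the starting centers, the noise level and the function names `call_genotypes` and `perturbation_stability` are all assumptions made for illustration. The sketch calls genotypes once, re-calls them after adding small random noise to the data, and reports the fraction of calls that stay the same.

```python
import random


def call_genotypes(contrasts, centers=(-0.9, 0.0, 0.9), iters=20):
    """Toy caller (assumption): assign each contrast value to the nearest of
    three cluster centers, refining the centers k-means style.
    It stands in for the model-based caller described in the abstract."""
    centers = list(centers)
    calls = []
    for _ in range(iters):
        # Assign every sample to its nearest center (0, 1 or 2).
        calls = [min(range(3), key=lambda k: abs(c - centers[k])) for c in contrasts]
        # Move each center to the mean of the samples assigned to it.
        for k in range(3):
            members = [c for c, g in zip(contrasts, calls) if g == k]
            if members:
                centers[k] = sum(members) / len(members)
    return calls


def perturbation_stability(contrasts, noise_sd=0.05, n_rounds=50, seed=0):
    """Fraction of calls unchanged after re-calling on noise-perturbed data.
    noise_sd and n_rounds are illustrative choices, not values from the paper."""
    rng = random.Random(seed)
    base = call_genotypes(contrasts)
    agree = 0
    for _ in range(n_rounds):
        noisy = [c + rng.gauss(0.0, noise_sd) for c in contrasts]
        perturbed = call_genotypes(noisy)
        agree += sum(b == p for b, p in zip(base, perturbed))
    return agree / (n_rounds * len(contrasts))


# Hypothetical data: one SNP with tight clusters, one with smeared signal.
tight = [-0.95, -0.90, -0.92, 0.00, 0.02, -0.03, 0.90, 0.93, 0.95]
smeared = [i / 10 - 1 for i in range(21)]
```

Under these assumptions, a SNP with well-separated clusters keeps its calls under perturbation, while a SNP whose samples sit close to a cluster boundary does not, so a low score flags SNPs whose calls deserve less trust.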

Availability: The C++ executable for the algorithm described here is available by request from the authors.


Figures

Fig. 1
A typical clusterplot of the allelic hybridization signals for a SNP: (a) after normalization; (b) after transformation of the same data to yield the contrast-scale coordinates. Each point in the figures represents the intensity data for an individual.
Fig. 2
Clusterplot for a SNP with shifted genotype clusters. Points in grey represent the observed signal data and the black ellipses represent the expected positions of the three genotype clouds.
Fig. 3
The clusterplot of a typical SNP on the Illumina array which yields highly homogeneous signals for the homozygous clusters, resulting in significantly peaked variance profiles for the homozygous clusters. Lines in black represent the kernel densities of the observed data (in grey).
Fig. 4
Clusterplots of a SNP which has been typed on both genomic and whole-genome amplified DNA. Points in black correspond to samples with genomic DNA while points in grey correspond to samples with amplified DNA. The plots are made on the (a) normalized allelic signal coordinates; (b) strength-contrast transformed coordinates.
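
The captions refer to strength-contrast coordinates without giving the formula. A common form of such a transformation, assumed here rather than taken from this page, maps the normalized allelic signals x and y to contrast = (x - y) / (x + y) and strength = log(x + y). The helper name `to_contrast_strength` is hypothetical.

```python
import math


def to_contrast_strength(x, y):
    """Map normalized allele signals (x, y) to (contrast, strength).
    Assumed form: contrast = (x - y) / (x + y), strength = log(x + y)."""
    total = x + y
    if total <= 0:
        raise ValueError("total signal x + y must be positive")
    return (x - y) / total, math.log(total)


# Contrast lies in [-1, 1] for non-negative signals: values near +1 or -1
# suggest a homozygote, values near 0 suggest a heterozygote.
contrast, strength = to_contrast_strength(0.5, 0.5)
```

In this form the three genotype clouds separate along the contrast axis, while strength captures overall signal intensity.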
