J Am Med Inform Assoc. 2012 Jun;19(e1):e5-e12.
doi: 10.1136/amiajnl-2011-000745.

The role of complementary bipartite visual analytical representations in the analysis of SNPs: a case study in ancestral informative markers

Suresh K Bhavnani et al. J Am Med Inform Assoc. 2012 Jun.

Abstract

Objective: Several studies have shown how sets of single-nucleotide polymorphisms (SNPs) can help to classify subjects on the basis of their continental origins, with applications to case-control studies and population genetics. However, most of these studies use dimensionality-reduction methods, such as principal component analysis, or clustering methods that result in unipartite (either subjects or SNPs) representations of the data. Such analyses conceal important bipartite relationships, such as how subject and SNP clusters relate to each other, and the genotypes that determine their cluster memberships.

Methods: To overcome the limitations of current methods of analyzing SNP data, the authors used three bipartite analytical representations (bipartite network, heat map with dendrograms, and Circos ideogram) that enable the simultaneous visualization and analysis of subjects, SNPs, and subject attributes.

Results: The results demonstrate (1) novel insights into SNP data that are difficult to derive from purely unipartite views of the data, (2) the strengths and limitations of each method, showing the role that each plays in revealing novel insights, and (3) implications for how the methods can be used for the analysis of SNPs in genomic studies associated with disease.

Conclusion: The results suggest that bipartite representations can reveal new patterns in SNP data compared with existing unipartite representations. However, the novel insights require multiple representations to discover, verify, and comprehend the complex relationships. The results therefore motivate the need for a complementary visual analytical framework that guides the use of multiple bipartite representations to analyze complex relationships in SNP data.

Conflict of interest statement

Competing interests: None.

Figures

Figure 1
A sample bipartite network showing 15 subjects (black and white nodes) and eight SNPs (colored nodes), and their connecting edges representing genotypes 0 (white), 1 (gray), and 2 (black). The nodes are sized on the basis of the sum of the weights of their connecting edges, and laid out using the Kamada–Kawai algorithm, which helps to reveal the relationship between the nodes and the nature of cluster memberships. This figure is produced in colour in the online journal. Please visit the website to view the colour figure.
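The node sizing described in this caption can be illustrated with a minimal sketch. It assumes that each edge weight equals the genotype value (0, 1, or 2), which the caption does not state explicitly; the data and the function names `build_bipartite_edges` and `weighted_degree` are hypothetical, and the Kamada–Kawai layout step is not shown.

```python
# Hypothetical toy data: rows are subjects, columns are SNPs,
# and each value is a genotype (0, 1, or 2).
genotypes = {
    "subject_1": {"snp_A": 2, "snp_B": 0, "snp_C": 1},
    "subject_2": {"snp_A": 1, "snp_B": 2, "snp_C": 0},
    "subject_3": {"snp_A": 0, "snp_B": 2, "snp_C": 2},
}


def build_bipartite_edges(genotypes):
    """Return one (subject, snp, weight) edge per subject-SNP pair.

    Assumption: the edge weight is the genotype value itself.
    """
    return [
        (subject, snp, genotype)
        for subject, row in genotypes.items()
        for snp, genotype in row.items()
    ]


def weighted_degree(edges):
    """Sum the weights of the edges touching each node (subjects and SNPs)."""
    sizes = {}
    for subject, snp, weight in edges:
        sizes[subject] = sizes.get(subject, 0) + weight
        sizes[snp] = sizes.get(snp, 0) + weight
    return sizes


edges = build_bipartite_edges(genotypes)
node_sizes = weighted_degree(edges)
```

Under this assumption, a subject with many genotype-2 edges and an SNP carried by many subjects both appear as large nodes, which is one way to read the relative node sizes in the figure.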
Figure 2
(A) The bipartite network showing the subjects (black and white nodes), ancestry informative marker (AIM) single-nucleotide polymorphisms (SNPs) (colored nodes), and their connecting edges representing genotypes 0 (white), 1 (gray), and 2 (black). (B) The SNP dendrogram was used to determine the boundaries of the SNP clusters, and a similar dendrogram determined the boundaries of the subject clusters. This figure is produced in colour in the online journal. Please visit the website to view the colour figure.
Figure 3
(A) The bipartite network without the non-discriminating single-nucleotide polymorphisms (SNPs); (B) the associated heat maps with dendrograms, which were used to determine the boundaries of the SNP and subject clusters. This figure is produced in colour in the online journal. Please visit the website to view the colour figure.
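How a dendrogram can yield cluster boundaries may be easier to see in a small sketch. The example below uses agglomerative clustering with average linkage and Euclidean distance, then stops at two clusters; the linkage, the distance measure, the number of clusters, and the toy data are all assumptions made for illustration, since the abstract and captions do not specify them.

```python
from itertools import combinations
from math import dist

# Hypothetical SNP profiles: each SNP's genotypes (0, 1, 2) across four subjects.
snp_profiles = {
    "snp_A": [2, 2, 0, 0],
    "snp_B": [2, 1, 0, 0],
    "snp_C": [0, 0, 2, 2],
    "snp_D": [0, 1, 2, 2],
}


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean Euclidean distance over all cross-cluster pairs of SNPs."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(dist(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def agglomerate(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(profiles)]
    while len(clusters) > n_clusters:
        # combinations() yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_linkage(
                clusters[ij[0]], clusters[ij[1]], profiles
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


snp_clusters = agglomerate(snp_profiles, n_clusters=2)
```

Applying the same procedure to subject rows instead of SNP columns would give the subject clusters; the two groupings together correspond to the row and column dendrograms of a heat map.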
Figure 4
(A) The bipartite network with nodes sized based on the betweenness centrality measure; (B) the Circos ideogram showing the relationship of the admixed Utah Americans to the SNPs of both clusters (Utah American and Yoruba African SNP clusters), and the sex of the subjects (outer ring). (The betweenness centrality measure for each node has been multiplied by 10 000 to enable Pajek to display the values within its maximum of two decimal places.) This figure is produced in colour in the online journal. Please visit the website to view the colour figure.
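Betweenness centrality, as used for the node sizes in panel (A), can be sketched with Brandes' algorithm on an unweighted, undirected graph. This is an illustration only, not the computation performed in Pajek: the toy graph and node names are hypothetical, edge weights are ignored, and normalizing by the number of node pairs is an assumption made here so that the values fall between 0 and 1 before scaling by 10 000.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized)."""
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_count[w] += path_count[v]
                    predecessors[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += path_count[v] / path_count[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair is visited from both endpoints, so halve the totals.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical toy graph: one admixed subject linked to SNPs of both clusters.
adjacency = {
    "utah_1": ["snp_U"],
    "admixed": ["snp_U", "snp_Y"],
    "yoruba_1": ["snp_Y"],
    "snp_U": ["utah_1", "admixed"],
    "snp_Y": ["admixed", "yoruba_1"],
}

raw = betweenness_centrality(adjacency)
n = len(adjacency)
pair_count = (n - 1) * (n - 2) / 2  # assumed normalization factor
scores = {node: value / pair_count for node, value in raw.items()}
scaled = {node: round(value * 10_000, 2) for node, value in scores.items()}
```

In this toy graph the admixed subject lies on every shortest path between the two sides, so it receives the highest score, which mirrors how a subject connected to both SNP clusters would stand out when nodes are sized by betweenness.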
