Self Nonself. 2010 Jul;1(3):241-249.
doi: 10.4161/self.1.3.12876. Epub 2010 Jun 30.

Visual analytics for immunologists: Data compression and fractal distributions

Elena N Naumova. Self Nonself. 2010 Jul.

Abstract

Visual analytics is the science of analytical reasoning that facilitates research through the use of interactive visual interfaces. New techniques of visual analytics are designed to aid the understanding of complex systems, in contrast to traditional blind-context rules for exploring massive volumes of interrelated data. Nowhere is visualization more important in analysis than in the emerging fields of the life sciences, where the amount of collected data grows at exponential rates.

In immunology, the complexity of the immune system makes visual analytics especially important for understanding how this system works. In this context, our effort should be focused on avoiding accurate but potentially misleading use of visual interfaces. The proposed approach of data compression and visualization, which reveals structural and functional features of immune responses, enhances systemic and comprehensive description and provides a platform for hypothesis generation. Further, this approach can evolve into a powerful visual-analytical tool for prospective and real-time monitoring and can provide an intuitive and interpretable illustration of the vital dynamics that govern immune responses in individuals and populations.

The undertaken explorations demonstrate the critical role of novel techniques of visual analytics in stimulating research in immunology and other life sciences and in leading us to an understanding of complex biological systems and processes.

Figures

Figure 1
Illustration of the T-cell receptor with the clonotype-defining region, with respect to working definitions, hypothesis and main concepts of data collection. The αβ T-cell receptor heterodimer (in blue) is generated by a rearrangement process that results in a random section of genetic information being inserted in the position that will encode the part of the β-chain (magenta) that contacts the antigen-derived peptide (green spheres). This random piece of genetic material can be identified, and all T cells with the same random piece of DNA can be counted. They are all assumed to be related, and the number reflects the expansion of that T cell. This random genetic segment (magenta) defines the clonotype, the primary unit of investigation. The image of the 3-D structure of the clonotypical T-cell receptor (clone JM22)-M1(58-66)-HLA-A2 (PDB 1OGA) was created using MacPyMol (DeLano Scientific, LLC).
Figure 2
Relative frequencies of clonotypes plotted in descending order: a three-dimensional view of clonotypic frequencies (A) and a stacked bar graph (B) for samples collected at two time points (1994 and 2004). An inset illustrates the most dominant clonotypes.
Figure 3
Visualization of experimental data after the second step of data compression: rank-frequency summaries of clonotype distributions for samples collected at two time points (1994 and 2004). To describe properties of the clonotype distribution, we assigned each clonotype a rank based on the absolute count of copies (Rank 1 consists of clonotypes observed as single copies, Rank 2 those observed twice, etc.). Plotting relative frequencies in increasing rank order reveals a power-law-like rank-frequency relationship. The inset of each plot depicts the first steps of computational assessment: the predicted values, obtained by fitting a linear regression model to log-transformed data, are shown as a green solid line.
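
The rank-frequency summary and log-log fit described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the toy counts, and the choice to define relative frequency as the share of distinct clonotypes falling at each rank are all hypothetical, since the caption does not specify the denominator.

```python
import math
from collections import Counter


def rank_frequency(clonotype_counts):
    """Summarize clonotypes by rank, where rank = number of copies observed.

    Assumption: relative frequency is the fraction of distinct clonotypes
    at each rank (the caption does not state the denominator explicitly).
    """
    clonotypes_per_rank = Counter(clonotype_counts.values())
    total = sum(clonotypes_per_rank.values())
    return {rank: n / total for rank, n in sorted(clonotypes_per_rank.items())}


def fit_log_log(rank_freq):
    """Ordinary least squares of log(frequency) on log(rank).

    Returns (slope, intercept, r_squared) for the log-transformed data.
    """
    xs = [math.log(rank) for rank in rank_freq]
    ys = [math.log(freq) for freq in rank_freq.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical toy data: 8 singletons, 4 doublets, 2 clonotypes seen
# 4 times, and 1 clonotype seen 8 times (not taken from the paper).
toy_counts = {}
for copies, n_clonotypes in [(1, 8), (2, 4), (4, 2), (8, 1)]:
    for i in range(n_clonotypes):
        toy_counts[f"clone_{copies}_{i}"] = copies

freqs = rank_frequency(toy_counts)
slope, intercept, r2 = fit_log_log(freqs)
print(freqs)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

The toy counts are built so that the number of clonotypes halves each time the rank doubles, which gives an exact power law with a log-log slope of -1. Real repertoire data would be noisier and may need the fit restricted to a subset of ranks.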
Figure 4
Computational assessment of rank-frequency summaries at two time points. The predicted values were obtained by fitting the linear regression model to log-transformed data; the parameters and the quality of the model's fit (R² values) are also shown in the graph. R² values reflect the percent variability explained and indicate a very good fit, over 90%. As in the insets in Figure 3, Rc indicates the critical inflection points, essential for a good fit. To ease direct comparison of the predicted curves, the vertical axes use units of relative frequency with identical ranges. The fitted curves demonstrate that the flu-specific Vβ19 repertoire underwent attrition in its low-frequency component.
Figure 5
Visual analytics illustrate the fractal nature of the T-cell repertoire responding to influenza in the form of a spiral (A) and as a fractal Mondrian set (B). The color-coded spiral depicts clonotypes starting with singletons as the first branch and progressing up to 10 ranks. The Mondrian set mimics the spiral representation (for ease of resolution, it depicts only the first 8 ranks). I named this fractal structure a Mondrian set after the painting Composition II in Red, Blue and Yellow (1930) by Piet Mondrian, a Dutch painter [Pieter Cornelis “Piet” Mondrian, after 1912 Mondrian (1872–1944)].
Figure 6
Predicted rank-frequency relationship obtained by fitting the linear regression model to log-transformed data for two time points, portrayed as a temporal transformation of a Mondrian set.
